Should we bother to save hash lookups for performance?

dlink has asked for the wisdom of the Perl Monks concerning the following question:

Re: Should we bother to save hash lookups for performance? by diotalevi (Canon) on Oct 18, 2002 at 16:57 UTC
It makes sense once your Benchmark tests say it's a problem, but not before. Consider that you're also making a copy of the data, which might have its own implications (unless you get Copy-On-Write (unless that's only a perl6ism)). I'd almost put this sort of thing right alongside the `my ($a);` vs `my $a;` optimization. I'd do it for notational convenience but not much else (unless you really do have thousands of entries in your hash and you really, really, really, really care about the picoseconds you're losing ...).

Update: There is a difference between `my ($foo)` and `my $foo`: one has to include an extra OPCODE or two to coerce the scalar into list context while the other is already there. I'll leave it to you and B::Terse to figure out which one is "cheaper".

`__SIG__ printf "You are here %08x\n", unpack "L!", unpack "P4", pack "L!", B::svref_2object(sub{})->OUTSIDE;`
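A minimal sketch of the kind of Benchmark test meant here, assuming a `$self` hash with a `username` key as in the example elsewhere in this thread. The sub names `with_lookups` and `with_cache` are made up for illustration, and the numbers will vary by machine and perl version:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical object data; only the 'username' key matters here.
my $self = { username => 'dlink' };

# Look the value up in the hash every time it is needed.
sub with_lookups {
    my $x = $self->{username};
    my $y = $self->{username};
    return "$x$y";
}

# Look it up once, then reuse the lexical copy.
sub with_cache {
    my $username = $self->{username};
    my $x = $username;
    my $y = $username;
    return "$x$y";
}

# Run each sub for at least one CPU second and print a comparison table.
cmpthese(-1, {
    lookups => \&with_lookups,
    cached  => \&with_cache,
});
```

If the two rows come out within a few percent of each other, the choice is purely one of notation.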
Re: Re: Should we bother to save hash lookups for performance? by nothingmuch (Priest) on Oct 18, 2002 at 18:52 UTC
I would like to expand on the reasons for caching. If you are familiar with hashing, then you'll probably know that the key is processed in order to generate an index. Should this index already be taken, most hash implementations will use a substructure (probably a linked list, whose search time is O(N)) to store the keys that hash to the same index. Perl hashes probably differ, I don't know how, but I estimate the principle is similar.

Perl hashes (not tied ones) can be evaluated in string context to see how many buckets are used and how many are allocated for that particular hash. If you seem to be using only a very small number of the allocated buckets for the number of keys you store, then keys are piling up in the same buckets and you probably have an average lookup time slightly larger than O(1). Should that be the case, I'd suggest caching.

Even so, I think that diotalevi's advice to `use Benchmark` is probably the best you'll get... Good luck!

-nuffin zz zZ Z Z #!perl
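A small sketch of how to look at the bucket usage described above. On the perls current when this thread was written, a non-empty hash in scalar context yields a "used/allocated" string. Since Perl 5.26, scalar context returns the key count instead and the ratio is reported by `Hash::Util::bucket_ratio`. The hash contents below are invented, and the ratio you see depends on your perl build:

```perl
use strict;
use warnings;
use Hash::Util ();

# Hypothetical data: 1000 keys with dummy values.
my %h = map { $_ => 1 } 1 .. 1000;

# Prefer Hash::Util::bucket_ratio when this perl provides it (5.26+),
# otherwise fall back to the older scalar(%h) behaviour.
# Called through a code ref, so the hash is passed by reference.
my $bucket_ratio = Hash::Util->can('bucket_ratio');
my $ratio = $bucket_ratio ? $bucket_ratio->(\%h) : scalar(%h);

print "buckets used/allocated: $ratio\n";
```

A ratio where the first number is far below the number of keys means many keys share buckets.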
Re: Re: Should we bother to save hash lookups for performance? by dlink (Novice) on Oct 18, 2002 at 17:28 UTC
Thanks. Whatever looks clearest notation-wise, then, is the keeper.
Re: Should we bother to save hash lookups for performance? by Aristotle (Chancellor) on Oct 18, 2002 at 20:20 UTC
If you don't need this for several variables at once, you can use `for` as a topicalizer:

```perl
sub do_somn {
    my $self = shift;
    for my $username ($self->{username}) {
        my $a = $username;
        my $b = $username;
        # ...
    }
}
```

`for` only aliases the variable to the value; it does not copy it. Thus you also avoid the synchronization problems diotalevi mentioned.

In fact I use this relatively often not for performance, but as a form of abstraction. I find `/x/ and $_ .= "y" for $self->{option};` more readable than `$self->{option} .= "y" if $self->{option} =~ /x/;`

And it also adheres to the "do it once and only once" principle. If I change my mind about the hash key's name, I only have one place to update vs 2 (or 3 or 4 or 15..).

Makeshifts last the longest.
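To make the copy-versus-alias distinction concrete, here is a short sketch. The `option` key and its value are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical object data.
my $self = { option => 'x' };

# A plain assignment copies the value: changing the copy
# leaves the hash element as it was.
my $copy = $self->{option};
$copy .= 'y';
my $after_copy = $self->{option};    # hash element is unchanged

# 'for' aliases $_ to the hash element itself, so the
# modification goes straight through to the hash.
/x/ and $_ .= 'y' for $self->{option};
my $after_alias = $self->{option};   # hash element now ends in 'y'

print "after copy:  $after_copy\n";
print "after alias: $after_alias\n";
```

With the copy, any change has to be written back to the hash by hand; with the alias there is nothing to keep in sync.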
Re: Should we bother to save hash lookups for performance? (yes, sometimes) by grinder (Bishop) on Oct 19, 2002 at 19:36 UTC
For a single level of indirection, as shown in your example, no. (And if you do, don't use the variables $a and $b; leave them for sort comparison functions.)

On the other hand, for deeply nested hashes, yes, absolutely, assuming you're going to refer to the variable two or more times. It will make your code much more compact and easier to read.

```perl
my $bytes = $self->{center}{floor}{room}{bay}{rack}{unit}{port}{transmitted};
if( $bytes == 0 ) {
    print "Nothing transmitted.\n";
}
elsif( $max < $bytes ) {
    $max = $bytes;
}
else {
    $sum += $bytes;
}
```

Keep in mind that I'm not doing this for performance reasons, I'm only concerned about readability.

print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'
Re: Re: Should we bother to save hash lookups for performance? (yes, sometimes) by BrowserUk (Patriarch) on Oct 19, 2002 at 20:15 UTC
This becomes especially useful when you need to access several things that exist at a given level.

```perl
my $port = $self->{center}{floor}{room}{bay}{rack}{unit}{port};
unless ( --$port->{timecount} ) {
    unless ( $port->{transmitted} or $port->{received} ) {
        close( $port->{handle} );
    }
    else {
        $port->{timecount} = $port->{timeout};
        $port->{transmitted} = $port->{received} = 0;
    }
}
```

This neatly emulates the Pascal-style `with ...` statement. I guess you could even use `my $with_port = ...` but that would be altogether too cute (and extra to type:^).

Cor! Like yer ring! ... HALO dammit! ... 'Ave it yer way! Hal-lo, Mister la-de-da. ... Like yer ring!
Re^3: Should we bother to save hash lookups for performance? (yes, sometimes) by Aristotle (Chancellor) on Oct 19, 2002 at 20:29 UTC
```perl
for ($self->{center}{floor}{room}{bay}{rack}{unit}{port}) {
    last if --$_->{timecount};
    close($_->{handle}), last unless $_->{transmitted} or $_->{received};
    $_->{timecount} = $_->{timeout};
    $_->{transmitted} = $_->{received} = 0;
}
```

Makeshifts last the longest.
Re: Re^3: Should we bother to save hash lookups for performance? (yes, sometimes) by BrowserUk (Patriarch) on Oct 19, 2002 at 20:59 UTC
Re^5: Should we bother to save hash lookups for performance? (yes, sometimes) by Aristotle (Chancellor) on Oct 19, 2002 at 21:05 UTC