Should we bother to save hash lookups for performance?

by dlink (Novice)
on Oct 18, 2002 at 16:44 UTC

dlink has asked for the wisdom of the Perl Monks concerning the following question:

If your Perl object is large and the blessed hash has many members, does it make sense to store the value of a hash lookup in a my variable for further use, to save the lookup time? Or is this unnecessary micro-optimization?

E.g., this:

    sub do_somn {
        my $self = shift;
        my $username = $self->{username};
        my $a = $username;   # for some reason
        my $b = $username;   # for some other reason.
        ...
    }

versus this:

    sub do_somn {
        my $self = shift;
        my $a = $self->{username};   # for some reason
        my $b = $self->{username};   # for some other reason.
        ...
    }

Replies are listed 'Best First'.
Re: Should we bother to save hash lookups for performance?
by diotalevi (Canon) on Oct 18, 2002 at 16:57 UTC

    It makes sense when your Benchmark tests say it's a problem, but not before. Consider that you're also making a copy of the data, which might have its own implications (unless you get Copy-On-Write, which may be only a perl6ism). I'd almost put this sort of thing right alongside the my ($a); vs my $a; optimization. I'd do it for notational convenience but not much else (unless you really do have thousands of entries in your hash and you really, really, really, really care about the picoseconds you're losing ...).
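A minimal sketch of the kind of Benchmark test suggested above, using the core Benchmark module. The object layout, the package name My::Object, the variable $obj and the sub names cached and repeated are assumptions made up for illustration; the numbers will vary by machine and perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical object: a blessed hash with many members, as in the question.
my $obj = bless { map { ( "key$_" => $_ ) } 1 .. 10_000 }, 'My::Object';
$obj->{username} = 'dlink';

# Variant 1: look the value up once and reuse the lexical copy.
sub cached {
    my $self     = shift;
    my $username = $self->{username};
    my $x        = $username;
    my $y        = $username;
    return "$x$y";
}

# Variant 2: repeat the hash lookup each time the value is needed.
sub repeated {
    my $self = shift;
    my $x    = $self->{username};
    my $y    = $self->{username};
    return "$x$y";
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, {
    cached   => sub { cached($obj) },
    repeated => sub { repeated($obj) },
} );
```

Only act on the result if the difference shows up in a profile of the real program; a table like this measures the two subs in isolation.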

    Update: There is a difference between my ($foo) and my $foo - one has to include an extra OPCODE or two to coerce the scalar into list context while the other is already there. I'll leave it to you and B::Terse to figure out which one is "cheaper".
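The update is about the opcodes behind a bare declaration. Where the parentheses change behavior, not just cost, is when an assignment follows. A small sketch, with @items as an assumed sample list:

```perl
use strict;
use warnings;

my @items = ( 'alpha', 'beta', 'gamma' );

my ($first) = @items;   # list assignment: $first gets the first element
my $count   = @items;   # scalar assignment: $count gets the number of elements

print "first=$first count=$count\n";
```

So the choice between the two forms is usually dictated by what you mean, not by speed.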

    __SIG__ printf "You are here %08x\n", unpack "L!", unpack "P4", pack "L!", B::svref_2object(sub{})->OUTSIDE;
      I would like to expand on the reasons for caching. If you are familiar with hashing, then you'll probably know that the key is processed in order to generate an index. Should this index already be taken, most hash implementations will use a substructure (probably a linked list, whose search time is O(N)) to store keys whose hashes collide.
      Perl hashes probably differ, I don't know how, but I expect the principle is similar.
      Perl hashes (not tied ones) can be evaluated in scalar context to see how many buckets are used and how many are allocated for that particular hash. If you seem to be using a very small number of buckets out of the allocated ones (so that many keys share a bucket), then you probably have an average lookup time slightly larger than O(1). Should that be the case, I'd suggest caching.
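A hedged sketch of checking bucket usage. The helper name bucket_usage and the sample hash are assumptions for illustration. On perls from 5.26 onward a hash in scalar context returns the key count instead, and Hash::Util::bucket_ratio reports the "used/allocated" string, so the sketch tries that function first and falls back to scalar context on older perls.

```perl
use strict;
use warnings;

# Hypothetical sample hash; any non-empty, untied hash would do.
my %h = map { ( "key$_" => 1 ) } 1 .. 1000;

# Return bucket usage as a "used/allocated" string.
sub bucket_usage {
    my ($hash) = @_;

    # Newer perls provide Hash::Util::bucket_ratio. Calling it through a
    # code reference bypasses its prototype, so we pass a hash reference.
    my $ratio = eval { require Hash::Util; Hash::Util->can('bucket_ratio') };
    return $ratio->($hash) if $ratio;

    # Older perls: a hash in scalar context yields the same kind of string.
    return scalar %$hash;
}

my $usage = bucket_usage( \%h );
my ( $used, $allocated ) = split m{/}, $usage;
printf "%d keys in %d of %d buckets\n", scalar( keys %h ), $used, $allocated;
```

If the number of keys is much larger than the number of used buckets, many keys share a bucket and lookups have to walk the collision chain.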

      Even so, I think that diotalevi's advice to use Benchmark is probably the best you'll get...

      Good luck!

      -nuffin
      zz zZ Z Z #!perl
      Thanks. Whatever looks clearest notation-wise, then, is the keeper.
Re: Should we bother to save hash lookups for performance?
by Aristotle (Chancellor) on Oct 18, 2002 at 20:20 UTC
    If you don't need this for several variables at once, you can use for as a topicalizer:
        sub do_somn {
            my $self = shift;
            for my $username ($self->{username}) {
                my $a = $username;
                my $b = $username;
                # ...
            }
        }

    for only aliases the variable to the value. Thus you also avoid the synchronization problems diotalevi mentioned.
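To make the difference between a copy and an alias concrete, here is a small sketch. The hash contents and the suffixes are made up for illustration.

```perl
use strict;
use warnings;

my $self = { username => 'dlink' };

# A plain lexical holds a copy: changing it leaves the hash untouched.
my $copy = $self->{username};
$copy .= '_copy';

# The loop variable is an alias: changing it changes the hash value itself.
for my $username ( $self->{username} ) {
    $username .= '_alias';
}

print "$copy $self->{username}\n";   # prints "dlink_copy dlink_alias"
```

The alias never goes out of sync with the hash, because there is only one value.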

    In fact I use this relatively often not for performance, but as a form of abstraction. I find

        /x/ and $_ .= "y" for $self->{option};

    more readable than

        $self->{option} .= "y" if $self->{option} =~ /x/;

    And it also adheres to the "do it once and only once" principle. If I change my mind about the hash key's name, I only have one place to update vs 2 (or 3 or 4 or 15..).

    Makeshifts last the longest.

Re: Should we bother to save hash lookups for performance? (yes, sometimes)
by grinder (Bishop) on Oct 19, 2002 at 19:36 UTC

    For a single level of indirection, as shown in your example, no. (And if you do, don't use the variables $a and $b; leave them for sort comparison functions.)

    On the other hand, for deeply nested hashes, yes, absolutely, assuming you're going to refer to the variable two or more times. It will make your code much more compact and easier to read.

        my $bytes = $self->{center}{floor}{room}{bay}{rack}{unit}{port}{transmitted};
        if( $bytes == 0 ) {
            print "Nothing transmitted.\n";
        }
        elsif( $max < $bytes ) {
            $max = $bytes;
        }
        else {
            $sum += $bytes;
        }

    Keep in mind that I'm not doing this for performance reasons, I'm only concerned about readability.


    print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'

      This becomes especially useful when you need to access several things at a given level.

          my $port = $self->{center}{floor}{room}{bay}{rack}{unit}{port};
          unless ( --$port->{timecount} ) {
              unless ( $port->{transmitted} or $port->{received} ) {
                  close( $port->{handle} );
              }
              else {
                  $port->{timecount} = $port->{timeout};
                  $port->{transmitted} = $port->{received} = 0;
              }
          }

      This neatly emulates the Pascal-style with ... statement.
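One thing worth spelling out: caching an inner hash this way stores a reference, so writes through it reach the original structure, whereas caching a leaf value stores a copy. A short sketch, with a shortened, made-up structure standing in for the deep path above:

```perl
use strict;
use warnings;

# Hypothetical, shortened structure for illustration.
my $self = { center => { port => { transmitted => 0, timeout => 30 } } };

# Caching the inner hash stores a reference to the same data.
my $port = $self->{center}{port};
$port->{transmitted} += 512;

# Caching a leaf value stores a copy; resetting it changes nothing else.
my $bytes = $self->{center}{port}{transmitted};
$bytes = 0;

print "$self->{center}{port}{transmitted}\n";   # prints "512"
```

That is why the $port idiom works for updates, while a cached scalar is only good for reading.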

      I guess you could even use my $with_port = ... but that would be altogether too cute (and extra to type :^).


      Cor! Like yer ring! ... HALO dammit! ... 'Ave it yer way! Hal-lo, Mister la-de-da. ... Like yer ring!
        for ($self->{center}{floor}{room}{bay}{rack}{unit}{port}) {
            last if --$_->{timecount};
            close($_->{handle}), last unless $_->{transmitted} or $_->{received};
            $_->{timecount} = $_->{timeout};
            $_->{transmitted} = $_->{received} = 0;
        }

        Makeshifts last the longest.

Node Type: perlquestion [id://206354]
Approved by chip