Re: "exists $hash{key}" is slower than "$hash{key}"

by choroba (Cardinal)
on Jan 06, 2020 at 01:01 UTC ( [id://11111018]=note )


in reply to "exists $hash{key}" is slower than "$hash{key}"

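Unfortunately, you can't access the lexical hash variable from string-evaled code in Benchmark, so the benchmark doesn't measure what it appears to: the exist_l and value_l cases look up a key in an empty package hash of the same name (note that el and vl stay at 0 in the output).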
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

use Benchmark qw{ cmpthese };

my %hl;
@hl{1001 .. 2000} = (1) x 1000;
our %hg = %hl;

our ($el, $vl, $eg, $vg) = (0) x 4;

die unless $hl{1001};
die unless $hg{1001};

cmpthese(-1, {
    exist_l => q{ ++$el if exists $hl{1001} },
    value_l => q{ ++$vl if $hl{1001} },
    exist_g => q{ ++$eg if exists $hg{1001} },
    value_g => q{ ++$vg if $hg{1001} },
});

say join "\n", "el: $el", "vl: $vl", "eg: $eg", "vg: $vg";

5.26.1 on Linux:
              Rate value_g exist_g value_l exist_l
value_g 17625636/s      --     -4%    -47%    -53%
exist_g 18303545/s      4%      --    -45%    -51%
value_l 33363781/s     89%     82%      --    -11%
exist_l 37417325/s    112%    104%     12%      --
el: 0
vl: 0
eg: 23229990
vg: 23229990

Blead gives smaller differences, but the order is the same.
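
The scoping problem can be checked directly without timing anything. The following is a minimal sketch, not part of the original benchmark: the hash and counter names are made up for illustration, and it uses Benchmark's timeit only to run each snippet a fixed number of times. A counter that stays at 0 means the snippet never saw the populated hash.

```perl
#!/usr/bin/perl
use warnings;
use strict;
use Benchmark qw{ timeit };

# Illustrative names, not from the original post.
my  %lexical = (1001 => 1);   # file-scoped lexical
our %global  = (1001 => 1);   # package variable %main::global

# Counters are package variables so string-evaled code can reach them.
our ($str_lexical, $str_global, $sub_lexical) = (0) x 3;

# String code is compiled inside Benchmark, where %lexical is out of
# scope; the name resolves to the (empty) package hash %main::lexical.
timeit(10, q{ ++$str_lexical if exists $lexical{1001} });

# Package variables are visible from string code.
timeit(10, q{ ++$str_global if exists $global{1001} });

# A sub is a closure over %lexical, so it sees the real hash.
timeit(10, sub { ++$sub_lexical if exists $lexical{1001} });

print "string + lexical: $str_lexical\n";
print "string + global:  $str_global\n";
print "sub + lexical:    $sub_lexical\n";
```

The first counter should stay at 0 while the other two reach 10, which mirrors the el: 0 and eg: 23229990 lines in the output above.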

Update: Changing q{ to sub { changes the order randomly and makes the difference less than 10%.

Update 2: Changing 1001 to 2001 in the benchmarked code (i.e. testing non-existent key) changes the differences to 20% and less, e.g.

              Rate value_g exist_g value_l exist_l
value_g 31977080/s      --     -2%     -8%    -19%
exist_g 32733979/s      2%      --     -6%    -17%
value_l 34737228/s      9%      6%      --    -12%
exist_l 39658833/s     24%     21%     14%      --

map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

Re^2: "exists $hash{key}" is slower than "$hash{key}"
by tobyink (Canon) on Jan 06, 2020 at 21:41 UTC

    If you want to benchmark code using lexical variables, but need to ameliorate the overhead of calling a sub, usually the easiest way is to wrap the code you're benchmarking in a loop that gets executed thousands of times...

    #!/usr/bin/perl
    use warnings;
    use strict;
    use Benchmark qw{ cmpthese };

    my %h;
    @h{1001 .. 2000} = (1) x 1000;
    my ($e, $v) = (0) x 2;

    cmpthese(-1, {
        exist => sub { for (0..999_999) { ++$e if exists $h{1001} } },
        value => sub { for (0..999_999) { ++$v if $h{1001} } },
    });

    __DATA__
            Rate value exist
    value 18.3/s    --   -5%
    exist 19.3/s    5%    --

      Thanks for this. The interesting thing is that if I stringify the code to avoid the sub call then I can replicate the ordering from my original post.

      I added a false value to get a sense of the cost of the increment operation. value_true is value from your version; value_false is obviously faster as it does less work.

      use warnings;
      use strict;
      use Benchmark qw{ cmpthese };

      my %h;
      @h{1001 .. 2000} = (1) x 1000;
      my ($e, $vt, $vf) = (0) x 3;
      $h{1002} = 0;

      cmpthese(-2, {
          exist       => sub { for (0..999_999) { ++$e if exists $h{1001} } },
          exist_str   => 'for (0..999_999) { ++$e if exists $h{1001} }',
          value_true  => sub { for (0..999_999) { ++$vt if $h{1001} } },
          value_str   => 'for (0..999_999) { ++$vt if $h{1001} }',
          value_false => sub { for (0..999_999) { ++$vf if $h{1002} } },
      });

      __DATA__
                    Rate exist value_true exist_str value_str value_false
      exist       16.1/s    --        -9%      -23%      -26%        -27%
      value_true  17.7/s   10%         --      -15%      -18%        -20%
      exist_str   20.9/s   29%        18%        --       -4%         -6%
      value_str   21.7/s   34%        22%        4%        --         -2%
      value_false 22.1/s   37%        25%        6%        2%          --

        > if I stringify the code to avoid the sub call then I can replicate the ordering from my original post.

        I'm not sure if you got the point, that accessing an outer lexical variable inside a string is not possible.

        Did you?

        The Benchmark module eventually runs the code's text through eval, but inside its own scope.

        The variables declared with my ($e, $vt, $vf) = (0) x 3; are not accessible from there.

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        Wikisyntax for the Monastery FootballPerl is like chess, only without the dice
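
To make the scoping point concrete without involving Benchmark at all, here is a small sketch. Hypothetical::Runner is an invented stand-in for a module that evals caller-supplied text in its own scope; it is not how Benchmark is actually implemented, only an illustration of the same visibility rule. It assumes Perl 5.14 or later for the package block syntax.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a module that evals text handed to it.
# It is defined before the lexicals below, so they are not in its scope.
package Hypothetical::Runner {
    sub run {
        my ($code) = @_;
        no strict;   # unknown names fall back to package variables
        my $result = eval "package main; $code";
        die $@ if $@;
        return $result;
    }
}

my  %h = (1001 => 1);   # lexical, visible only from here onwards
our %g = (1001 => 1);   # package variable %main::g

# The string is compiled inside run(), where the lexical %h does not
# exist, so $h{1001} refers to the empty package hash %main::h.
my $string_lexical = Hypothetical::Runner::run(q{ exists $h{1001} ? 1 : 0 });

# Package variables can be found by name from any scope.
my $string_package = Hypothetical::Runner::run(q{ exists $g{1001} ? 1 : 0 });

# A closure captures the lexical %h at the point where it is written.
my $closure = sub { exists $h{1001} ? 1 : 0 }->();

print "string, lexical hash: $string_lexical\n";
print "string, package hash: $string_package\n";
print "closure, lexical hash: $closure\n";
```

Only the first lookup should come back as 0. The same reasoning applies to counters: a my variable incremented from string code is a different variable from the one declared in the calling script.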

Re^2: "exists $hash{key}" is slower than "$hash{key}"
by swl (Parson) on Jan 07, 2020 at 00:10 UTC

    Thanks for this.

    I'm not sure why not being able to access the lexical matters. My objective is simply to isolate code as much as possible so the relative differences are down to the exists/value checks. That said, I used the globals because I hit an issue with incrementing package lexicals in the stringified code when it was run by the benchmark module, but probably just did something incorrect when setting those up.

    The effect of the non-existent key is interesting. I'll add it to my benchmarks.

    WRT sub calls, there seem to still be overheads that affect the timings (or my benchmarks need more work) - see my response to tobyink at 11111089.
