Collapsing repetitive equality tests: || vs {} vs // (an observation)

by blogical (Pilgrim)
on Feb 25, 2007 at 10:02 UTC

if ($_ eq 'A' || $_ eq 'B' || $_ eq 'C') { }
is the same as
if ({A =>1, B =>1, C =>1}->{$_}) { }
is the same as
if (/^(A|B|C)$/) { }
or rather, with the end anchor corrected,
if (/^(A|B|C)\z/) { }
or even better,
if (/^(?:A|B|C)\z/) { }
is the same as
my $chr = $_; grep { $chr eq $_ } qw( A B C );

In my benchmarks, the anonymous hash method seems slightly faster than ||, but slower than /^()$/, which is also terser and probably uses less memory. Regex wins again!

Editor's note: $ should be \z in regex as pointed out by ysth and ikegami. Also see ikegami's response for regex improvement and benchmark contestment.

Edit: Added diotalevi's grep suggestion for extra TMTOWTDI fun.

Replies are listed 'Best First'.
Re: Collapsing repetitive equality tests: || vs {} vs // (an observation)
by ikegami (Patriarch) on Feb 25, 2007 at 11:10 UTC

    if (/^(A|B|C)$/) { }
    should be
    if (/^(A|B|C)\z/) { }
    to be equivalent.

    And
    if (/^(A|B|C)\z/) { }
    would be faster if the needless capture was removed.
    if (/^(?:A|B|C)\z/) { }

    By the way, I don't trust "Benchmarking says ..." without seeing the test. They're too easy to do wrong. In fact, my benchmarks disagree with your findings.

    Findings:

    • Inlined hash is always the slowest, by far. Creating a hash, creating a reference to it and dereferencing it is a lot of overhead. I don't know how your tests could have shown that it was faster than if.
    • Regular expression also has a lot of overhead. Static hash is consistently >20% faster.
    • Static hash is reliably fast.
    • Static hash runs at roughly 6 times the speed of its inlined counterpart (500% faster).
    • if is fastest when the first or second test is true. It loses to static hash when 3 or more tests are needed.
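A minimal sketch of the two hash forms compared in the findings above. The names %is_wanted, in_static_hash and in_inline_hash are hypothetical and not taken from the benchmark script; the point is only where the hash gets built.

```perl
use strict;
use warnings;

# Static hash: built once, outside the code that runs repeatedly.
my %is_wanted = map { $_ => 1 } qw( A B C );

sub in_static_hash {
    my ($var) = @_;
    return $is_wanted{$var} ? 1 : 0;
}

# Inlined hash: a fresh anonymous hash is built, referenced and
# dereferenced on every single call.
sub in_inline_hash {
    my ($var) = @_;
    my $hit = { A => 1, B => 1, C => 1 }->{$var};
    return $hit ? 1 : 0;
}

print "static: ", in_static_hash('B'), " inline: ", in_inline_hash('B'), "\n";
```

Both subs give the same answer; only the per-call setup cost differs.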

    Summary results:

    if:    ($var eq "A" || $var eq 'B' || $var eq 'C')
    ihash: { A=>1, B=>1, C=>1 }->{$var}
    shash: $hash{$var}
    re:    $var =~ /^(?:A|B|C)\z/

    Benchmarking no matches
               Rate ihash    re    if shash
    ihash  328339/s    --  -79%  -81%  -86%
    re    1557274/s  374%    --  -10%  -33%
    if    1738308/s  429%   12%    --  -25%
    shash 2307394/s  603%   48%   33%    --

    Benchmarking match A
               Rate ihash    re shash    if
    ihash  333900/s    --  -78%  -83%  -88%
    re    1532032/s  359%    --  -21%  -43%
    shash 1946689/s  483%   27%    --  -28%
    if    2707780/s  711%   77%   39%    --

    Benchmarking match B
               Rate ihash    re shash    if
    ihash  324969/s    --  -78%  -84%  -84%
    re    1473476/s  353%    --  -26%  -28%
    shash 2004671/s  517%   36%    --   -2%
    if    2053900/s  532%   39%    2%    --

    Benchmarking match C
               Rate ihash    re    if shash
    ihash  335514/s    --  -76%  -81%  -83%
    re    1392243/s  315%    --  -20%  -28%
    if    1744756/s  420%   25%    --   -9%
    shash 1920959/s  473%   38%   10%    --
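The benchmark script itself is not reproduced in this thread, so the following is only a sketch of how a comparison of this shape could be set up with the core Benchmark module. The helper make_tests, the iteration count and the use of closures are assumptions, not the original code, and the numbers it prints will differ from those above.

```perl
use strict;
use warnings;
use Benchmark qw( cmpthese );

# Static hash, built once up front.
my %hash = map { $_ => 1 } qw( A B C );

# Hypothetical helper: returns the four candidate tests, each closed
# over the same input value.
sub make_tests {
    my ($var) = @_;
    return {
        'if'    => sub { my $ok = ( $var eq 'A' || $var eq 'B' || $var eq 'C' ) },
        'ihash' => sub { my $ok = { A => 1, B => 1, C => 1 }->{$var} },
        'shash' => sub { my $ok = $hash{$var} },
        're'    => sub { my $ok = $var =~ /^(?:A|B|C)\z/ },
    };
}

# One comparison table per input: a non-match, then each matching value.
for my $input ( 'x', 'A', 'B', 'C' ) {
    print "Benchmarking input '$input'\n";
    cmpthese( 300_000, make_tests($input) );
}
```

A fixed count keeps the run short. Passing a negative count such as -3 makes Benchmark run each test for at least that many CPU seconds, which tends to give steadier rates.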

    Other possible tests:

    • Longer values.

    Benchmark tests:

    Complete results:

      You missed /^[ABC]\z/ which should be faster than an alternation (but is only viable with one letter options). I also wonder about length($str)==1 && $str=~/[ABC]/

      ---
      $world=~s/war/peace/g
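The two single-character variants suggested above can be sketched as follows. The sub names are hypothetical, and nothing here has been timed; it only shows that both forms reject longer strings and trailing newlines.

```perl
use strict;
use warnings;

# Character class instead of an alternation; only viable when every
# option is a single character.
sub is_abc_class {
    my ($str) = @_;
    return $str =~ /^[ABC]\z/ ? 1 : 0;
}

# Length check first, so the regex needs no anchors.
sub is_abc_length {
    my ($str) = @_;
    return ( length($str) == 1 && $str =~ /[ABC]/ ) ? 1 : 0;
}

for my $candidate ( 'A', 'D', 'AB' ) {
    printf "%-2s class=%d length=%d\n",
        $candidate, is_abc_class($candidate), is_abc_length($candidate);
}
```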

        I took A, B and C to be placeholders for longer strings, so I purposefully didn't include the character class. I even mentioned I should have tested with longer strings as well.

        Feel free to add those tests and publish the results.

      Points taken, thank you. As for benchmarking, I did it on the command line, looping over a file of gobbledegook with a `perl -MBenchmark -e`. It was fairly barbaric, compared to what you've posted. I'd still post the exact code if I had it, but cygwin seems to have eaten it.
Re: Collapsing repetitive equality tests: || vs {} vs // (an observation)
by ysth (Canon) on Feb 25, 2007 at 10:14 UTC
    Almost the same. The regex will also return true for "A\n", "B\n", or "C\n". Try using \z instead of $.
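A small sketch of the difference: $ also matches just before a trailing newline, while \z matches only at the very end of the string. The sub names are hypothetical.

```perl
use strict;
use warnings;

# Anchored with $: also accepts one trailing newline.
sub matches_dollar {
    my ($str) = @_;
    return $str =~ /^(?:A|B|C)$/ ? 1 : 0;
}

# Anchored with \z: the string must end right after the letter.
sub matches_z {
    my ($str) = @_;
    return $str =~ /^(?:A|B|C)\z/ ? 1 : 0;
}

for my $input ( "A", "A\n" ) {
    ( my $shown = $input ) =~ s/\n/\\n/g;    # make the newline visible
    printf "%-4s  \$ => %d  \\z => %d\n",
        $shown, matches_dollar($input), matches_z($input);
}
```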
Re: Collapsing repetitive equality tests: || vs {} vs // (an observation)
by diotalevi (Canon) on Feb 25, 2007 at 19:39 UTC

    You neglected to also look at the any predicate or its overworking cousin grep. Or perhaps Perl 6's any() disjunction or its (forward in history) "backport" any() from Quantum::Superpositions.

    my $chr = $_; # Naughty, naughty! You used $_ before but I need it now.
    any { $chr eq $_ } qw( a b c );
    grep { $chr eq $_ } qw( a b c );

    ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

      Feel free to drop a link to a reference to any, as I have no idea what you're referring to (sadly, I'm not hip to that Perl 6 jazz). grep, however, I can dig. It does make the test a bit large though.

      On a tangent, it would be interesting to be able to ask for the value of $_ "x back" so that we could say something like grep { $_{0} eq $_{1} } qw( a b c ) or grep { $_ eq $_{1} } qw( a b c )

        any can be found in List::MoreUtils.

        I was bored a while ago, and plugged both any and grep into ikegami's benchmark script. They're slow.
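For reference, a minimal sketch of the any and grep forms discussed in this subthread. The thread points to List::MoreUtils; this sketch swaps in the any exported by List::Util (version 1.33 or later, which ships with recent Perls) so that it runs without a CPAN install. The call has the same shape either way. The sub names are hypothetical.

```perl
use strict;
use warnings;
use List::Util qw( any );    # assumption: List::Util 1.33 or later

# any stops at the first element for which the block is true.
sub in_list_any {
    my ($chr) = @_;
    my $found = any { $chr eq $_ } qw( a b c );
    return $found ? 1 : 0;
}

# grep always walks the whole list; in scalar context it returns a count.
sub in_list_grep {
    my ($chr) = @_;
    my $count = grep { $chr eq $_ } qw( a b c );
    return $count ? 1 : 0;
}

print "any: ", in_list_any('b'), " grep: ", in_list_grep('b'), "\n";
```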
