http://qs321.pair.com?node_id=1205956

Athanasius has asked for the wisdom of the Perl Monks concerning the following question:

Hello all, and Merry Christmas!

I recently came across some old code in which I used a longish regular expression twice within a loop. So I thought, “Aha! here’s an opportunity for optimisation using qr//.” After all, the documentation (qr/STRING/msixpodualn under “Regexp Quote-Like Operators” in perlop) says:

Since Perl may compile the pattern at the moment of execution of the qr() operator, using qr() may have speed advantages in some situations ...

But the result was more than disappointing:

    use strict;
    use warnings;
    use Benchmark qw( cmpthese timethese );
    use constant TARGET => 1_389_019_170;

    my $r = timethese
            (
                5,
                {
                    use_re  => sub { my $ans1 = use_re();
                                     $ans1 == TARGET or die $ans1; },
                    use_qr  => sub { my $ans2 = use_qr();
                                     $ans2 == TARGET or die $ans2; },
                    use_str => sub { my $ans3 = use_str();
                                     $ans3 == TARGET or die $ans3; }
                }
            );

    cmpthese $r;

    # Literal regex written out twice inside the loop
    sub use_re
    {
        for (my $n = 1_010_101_030; $n <= 1_389_026_623; )
        {
            my $s = $n * $n;
            return $n if $s =~ /^1\d2\d3\d4\d5\d6\d7\d8\d900$/;
            $n += 40;
            $s = $n * $n;
            return $n if $s =~ /^1\d2\d3\d4\d5\d6\d7\d8\d900$/;
            $n += 60;
        }
        die;
    }

    # Regex precompiled once with qr// and reused
    sub use_qr
    {
        my $re = qr/^1\d2\d3\d4\d5\d6\d7\d8\d900$/;
        for (my $n = 1_010_101_030; $n <= 1_389_026_623; )
        {
            my $s = $n * $n;
            return $n if $s =~ $re;
            $n += 40;
            $s = $n * $n;
            return $n if $s =~ $re;
            $n += 60;
        }
        die;
    }

    # Pattern held in a plain string and interpolated into m//
    sub use_str
    {
        my $str = '^1\d2\d3\d4\d5\d6\d7\d8\d900$';
        for (my $n = 1_010_101_030; $n <= 1_389_026_623; )
        {
            my $s = $n * $n;
            return $n if $s =~ /$str/;
            $n += 40;
            $s = $n * $n;
            return $n if $s =~ /$str/;
            $n += 60;
        }
        die;
    }

Typical output:

    12:50 >perl 1846_SoPW.pl
    Benchmark: timing 5 iterations of use_qr, use_re, use_str...
        use_qr: 57 wallclock secs (53.19 usr +  0.06 sys = 53.25 CPU) @  0.09/s (n=5)
        use_re: 22 wallclock secs (22.03 usr +  0.00 sys = 22.03 CPU) @  0.23/s (n=5)
       use_str: 26 wallclock secs (25.81 usr +  0.00 sys = 25.81 CPU) @  0.19/s (n=5)
            s/iter  use_qr use_str  use_re
    use_qr    10.7      --    -52%    -59%
    use_str   5.16    106%      --    -15%
    use_re    4.41    142%     17%      --

    12:54 >

(I obtained similar results across my various 64-bit Strawberry Perl versions: 5.18.2, 5.20.2, 5.22.2, 5.24.0, and 5.26.0.)
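
For anyone who wants to see what qr// actually hands back, here is a minimal sketch, assuming Perl 5.14 or later (where the stringified form uses the (?^:...) wrapper). The shortened pattern and the sample inputs are made up for illustration and are not taken from the benchmark above. It only shows that the qr// result is a Regexp object rather than a plain string; it does not explain the timing difference.

```perl
use strict;
use warnings;

# Shortened, hypothetical pattern in the same style as the benchmark's regex.
my $re  = qr/^1\d2\d3$/;     # compiled regex object
my $str = '^1\d2\d3$';       # plain string holding the same pattern text

# qr// returns a reference blessed into the Regexp class.
my $class = ref $re;
print "class:       $class\n";

# Stringifying it yields the pattern wrapped in a group that carries its
# flags; with no flags on Perl 5.14+ this is (?^:^1\d2\d3$).
my $as_string = "$re";
print "stringified: $as_string\n";

# Both forms accept and reject the same inputs.
my %result;
for my $s (qw( 10203 19293 10204 )) {
    my $via_qr  = ($s =~ $re)     ? 1 : 0;
    my $via_str = ($s =~ /$str/)  ? 1 : 0;
    $result{$s} = [ $via_qr, $via_str ];
    printf "%s: qr=%d str=%d\n", $s, $via_qr, $via_str;
}
```

So the object and the string are interchangeable as far as matching goes; whatever the extra cost in use_qr is, it is not a difference in what gets matched.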

I note in the documentation that the string returned by qr// “magically differs from a string containing the same characters”, so I’m guessing the additional overhead is due to the “magic” in some way, but I still find the result surprising. So, my questions:

Thanks,

Athanasius <°(((>< contra mundum
Iustus alius egestas vitae, eros Piratica,