a little harder to get right first time
hehe, I guess so. Your while condition doesn't fail until *after* we've printed out the substr using a value of -1 for $p. Therefore you get a phantom match of '324' given the sample input.
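For what it's worth, here is a minimal sketch of one way to avoid the phantom match: test the result of index() before it ever reaches substr. The sub name prefixes_before_colon is made up for illustration, and the guard for colons with fewer than three characters in front of them is my own assumption about what you'd want there.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): collect the three
# characters that sit in front of each ':' in the string.
sub prefixes_before_colon {
    my ($str) = @_;
    my @found;
    my $p = -1;

    # Check index()'s return value *before* calling substr, so the -1
    # that means "no more matches" is never used as an offset.
    while ( ( $p = index( $str, ':', $p + 1 ) ) > -1 ) {
        next if $p < 3;    # assumption: skip colons with < 3 chars before them
        push @found, substr( $str, $p - 3, 3 );
    }
    return @found;
}

my @got = prefixes_before_colon("abc:12345 def:54321 ghi:13245");
print "@got\n";    # prints "abc def ghi"
```

With the test in the loop condition there is no trailing '324' to remove, so no cleanup step is needed afterwards.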
You might also be surprised at how this benchmarks against a well-crafted regex. The regex engine has some clever optimizations under the hood.
This benchmark surprised me as well... I tossed in a sexeger solution that I thought would perform well, since we are looking for stuff in front of a known character.
Anyway, it didn't perform as well as either of the other solutions, but the regex did win the race:
Benchmark: running regexpShort, sexegeShort, substrShort, each for at least 3 CPU seconds...
regexpShort: 4 wallclock secs ( 3.28 usr + 0.00 sys = 3.28 CPU) @ 46935.67/s (n=153949)
sexegeShort: 5 wallclock secs ( 3.04 usr + 0.00 sys = 3.04 CPU) @ 27424.67/s (n=83371)
substrShort: 4 wallclock secs ( 3.05 usr + 0.00 sys = 3.05 CPU) @ 31047.21/s (n=94694)
               Rate sexegeShort substrShort regexpShort
sexegeShort 27425/s          --        -12%        -42%
substrShort 31047/s         13%          --        -34%
regexpShort 46936/s         71%         51%          --
Benchmark: running regexpLong, sexegeLong, substrLong, each for at least 3 CPU seconds...
regexpLong: 3 wallclock secs ( 3.20 usr + 0.00 sys = 3.20 CPU) @ 590.31/s (n=1889)
sexegeLong: 4 wallclock secs ( 3.38 usr + 0.00 sys = 3.38 CPU) @ 310.36/s (n=1049)
substrLong: 5 wallclock secs ( 3.09 usr + 0.00 sys = 3.09 CPU) @ 462.14/s (n=1428)
            Rate sexegeLong substrLong regexpLong
sexegeLong 310/s         --       -33%       -47%
substrLong 462/s        49%         --       -22%
regexpLong 590/s        90%        28%         --
And here is the benchmark code:
#!/usr/bin/perl -w
use strict;
use Benchmark qw(cmpthese);

my $varshort = "abc:12345 def:54321 ghi:13245";
my $varlong  = "$varshort " x 120;

# subs
sub regex {
    my $str = shift;
    my @arr = ($str =~ /(.{3}):/g);
}

sub substring {
    my $str = shift;
    my @arr;
    my $p = 0;
    push(@arr, substr($str, ($p = index($str, ':', $p + 1)) - 3, 3)) while $p > -1;
    pop(@arr);    # drop the phantom match pushed when index() returned -1
    return @arr;
}

sub sexeger {
    my $str = reverse shift;
    my @arr = reverse
              map { $_ = reverse $_ }
              ($str =~ /:(.{3})/g);
}

sub regexpShort { regex($varshort) }
sub regexpLong  { regex($varlong) }
sub sexegeShort { sexeger($varshort) }
sub sexegeLong  { sexeger($varlong) }
sub substrShort { substring($varshort) }
sub substrLong  { substring($varlong) }

# unit tests: all three approaches must agree before we time them
my $rs = "@{[regexpShort()]}";
my $rl = "@{[regexpLong()]}";
my $ss = "@{[sexegeShort()]}";
my $sl = "@{[sexegeLong()]}";
my $bs = "@{[substrShort()]}";
my $bl = "@{[substrLong()]}";
die unless $rs eq $ss;
die unless $rs eq $bs;
die unless $rl eq $sl;
die unless $rl eq $bl;

# benchmark
cmpthese(-3,
    {
        regexpShort => \&regexpShort,
        substrShort => \&substrShort,
        sexegeShort => \&sexegeShort,
    }
);
cmpthese(-3,
    {
        regexpLong => \&regexpLong,
        substrLong => \&substrLong,
        sexegeLong => \&sexegeLong,
    }
);
-Blake