Ever since I read in Mastering Regular Expressions that perl makes a copy of the base string when doing a case-insensitive match, I've tried to use character classes instead of /i.
... Before submitting this post, though, I decided to actually benchmark some variations to see whether character classes were faster. To my surprise, it turns out that /i is about 50% faster in the test I used:
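use strict;
use Benchmark qw(cmpthese);

# 13,000-character lowercase test string
my $foo = "abcdefghijklmnopqrstuvwxyz" x 500;
my $re  = "[Aa][Bb][Cc]";

cmpthese(1000000, {
    'i'       => sub { $foo =~ /abc/ig },           # case-insensitive flag
    'chars'   => sub { $foo =~ /[Aa][Bb][Cc]/og },  # literal character classes
    'charvar' => sub { $foo =~ /$re/og },           # same classes, interpolated from $re
});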
yielding these results on my machine:
Benchmark: timing 1000000 iterations of chars, charvar, i...
  chars:  2 wallclock secs ( 1.97 usr +  0.00 sys =  1.97 CPU) @ 507614.21/s (n=1000000)
charvar:  3 wallclock secs ( 2.04 usr + -0.01 sys =  2.03 CPU) @ 492610.84/s (n=1000000)
      i:  1 wallclock secs ( 1.31 usr +  0.00 sys =  1.31 CPU) @ 763358.78/s (n=1000000)
            Rate charvar   chars       i
charvar 492611/s      --     -3%    -35%
chars   507614/s      3%      --    -34%
i       763359/s     55%     50%      --
Results are similar for strings of various lengths.
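For anyone who wants to repeat the comparison at several lengths, something along these lines should work. It is only a sketch: make_matchers is a made-up helper name, the repeat counts are arbitrary, and unlike the test above it counts every match in list context rather than doing a single scalar /g match per call, so the absolute rates will not be comparable to the numbers I posted.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical helper: build both matchers over a string made of
# $repeat copies of the lowercase alphabet.
sub make_matchers {
    my ($repeat) = @_;
    my $text = "abcdefghijklmnopqrstuvwxyz" x $repeat;
    return (
        # Each sub returns the number of matches found in $text.
        i     => sub { my $n = () = $text =~ /abc/ig;          $n },
        chars => sub { my $n = () = $text =~ /[Aa][Bb][Cc]/g;  $n },
    );
}

# Arbitrary repeat counts, chosen only for illustration.
for my $repeat (10, 500) {
    my %matchers = make_matchers($repeat);
    print "repeat = $repeat\n";
    # A negative count means "run each sub for at least that many CPU seconds".
    cmpthese(-1, \%matchers);
}
```

Both subs should report the same match count for a given length, which is a quick sanity check that the two patterns are doing equivalent work before trusting the timings.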
So was Mastering Regular Expressions incorrect, or has the problem just been fixed since it was written?