http://qs321.pair.com?node_id=1206016


in reply to Performance penalty of using qr//

G'day Athanasius,

I decided to look at this from a slightly different angle. Instead of involving complex routines and matches, I picked a regex match that was so simple that an 'eq' comparison would be, in normal code, a better choice:

'X' =~ /^X$/
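As a side note on why 'eq' is usually the better choice here: the two tests are not strictly interchangeable, because without the /m modifier '$' also matches just before a trailing newline. A minimal sketch of the difference follows; the helper names matches_re and matches_eq are made up for illustration and are not part of the benchmark.

```perl
use strict;
use warnings;

# Hypothetical helpers (not from the benchmark) contrasting the two tests.
sub matches_re { my ($s) = @_; return ($s =~ /^X$/) ? 1 : 0 }
sub matches_eq { my ($s) = @_; return ($s eq 'X')   ? 1 : 0 }

for my $s ('X', "X\n", 'XX') {
    (my $shown = $s) =~ s/\n/\\n/g;    # make the newline visible when printing
    printf "%-4s regex=%d eq=%d\n", $shown, matches_re($s), matches_eq($s);
}
```

The only input on which the two disagree is "X\n": the regex accepts it and 'eq' does not. The benchmark only ever matches against 'X', so this does not affect the timings.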

I ran this benchmark, using that match as a base, but incorporating a whole series of variations: with/without 'qr//', the 'o' modifier, and many types of variables.

#!/usr/bin/env perl
use 5.010;
use strict;
use warnings;
use Benchmark 'cmpthese';

use constant STRING => 'X';
use constant CONST_RE => qr{^X$};

my $my_re = qr{^X$};
state $state_re = qr{^X$};
our $our_re = qr{^X$};
local $main::local_re = qr{^X$};

cmpthese 0 => {
    re_str => sub { STRING =~ '^X$' },
    re_re => sub { STRING =~ /^X$/ },
    re_re_o => sub { STRING =~ /^X$/o },
    re_qr => sub { STRING =~ qr{^X$} },
    re_qr_o => sub { STRING =~ qr{^X$}o },
    qr_my => sub { STRING =~ $my_re },
    qr_my_re => sub { STRING =~ /$my_re/ },
    qr_my_re_o => sub { STRING =~ /$my_re/o },
    qr_state => sub { STRING =~ $state_re },
    qr_state_re => sub { STRING =~ /$state_re/ },
    qr_state_re_o => sub { STRING =~ /$state_re/o },
    qr_our => sub { STRING =~ $our_re },
    qr_our_re => sub { STRING =~ /$our_re/ },
    qr_our_re_o => sub { STRING =~ /$our_re/o },
    qr_const => sub { STRING =~ CONST_RE },
    qr_const_re => sub { STRING =~ /${\CONST_RE()}/ },
    qr_const_re_o => sub { STRING =~ /${\CONST_RE()}/o },
    qr_local => sub { STRING =~ $main::local_re },
    qr_local_re => sub { STRING =~ /$main::local_re/ },
    qr_local_re_o => sub { STRING =~ /$main::local_re/o },
};

[I was aware that "qr_const_re" and "qr_const_re_o" might produce bogus results due to the additional reference and dereference operations; however, I left them in purely out of curiosity.]
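For anyone unfamiliar with the ${\CONST_RE()} idiom: a constant is really a subroutine, so its bare name inside /.../ is treated as literal text rather than interpolated. Taking a reference to the constant's value and immediately dereferencing it gives the regex something it can interpolate. Here's a small sketch of the three forms; the sub names are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

use constant CONST_RE => qr{^X$};

# Hypothetical wrappers, one per way of writing the match.
# Direct use: the constant's value is the right-hand side of =~.
sub match_direct       { return ($_[0] =~ CONST_RE)         ? 1 : 0 }
# \CONST_RE() references the value; ${ ... } dereferences it for interpolation.
sub match_interpolated { return ($_[0] =~ /${\CONST_RE()}/) ? 1 : 0 }
# Bare name inside the slashes: matches the literal text "CONST_RE".
sub match_bare_name    { return ($_[0] =~ /CONST_RE/)       ? 1 : 0 }

for my $s ('X', 'CONST_RE') {
    printf "%-8s direct=%d interpolated=%d bare_name=%d\n",
        $s, match_direct($s), match_interpolated($s), match_bare_name($s);
}
```

The first two forms agree on every input; the third only matches strings that contain the text "CONST_RE", which is why the benchmark needed the extra reference and dereference.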

Here are the results, showing just the rates. (The complete results are in a spoiler at the end of my post.)

                   Rate
re_qr          550709/s
re_qr_o        560597/s
qr_local      1061718/s
qr_const_re   1065053/s
qr_state      1065891/s
qr_local_re   1077507/s
qr_our        1089135/s
qr_state_re   1089138/s
qr_my_re      1092539/s
qr_my         1096982/s
qr_our_re     1101420/s
qr_state_re_o 4073421/s
qr_local_re_o 4085064/s
qr_my_re_o    4279146/s
qr_const_re_o 4293130/s
qr_our_re_o   4302931/s
re_re_o       4647519/s
qr_const      4745450/s
re_re         4748039/s
re_str        4814042/s

Some of those numbers are too close to call as to which was faster; however, this general trend appears to emerge (fastest to slowest):

  1. '^X$', /^X$/ and, when created at compile time, qr{^X$} (re_str, re_re and qr_const).
  2. All that used the 'o' modifier (except for qr{^X$}o).
  3. Those using variables assigned with qr{^X$}; either as $var or /$var/.
  4. Clearly the slowest of all: qr{^X$} and qr{^X$}o created at runtime (re_qr and re_qr_o).

I ran that a few times. The specific order changed a bit but the general trend I've indicated seemed to hold.

In light of ++dave_the_m's input, I'd like to see how captures might affect those results. I don't have time to do this myself now: perhaps you, or someone else, would care to tinker.
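As a starting point for that tinkering, here's a rough sketch of how capture variants might be added alongside a few of the original cases. It's untested against the full set above; the *_cap test names and the $capture_re variable are my own inventions for this example, and I make no claim about what the numbers will show.

```perl
use strict;
use warnings;
use Benchmark 'cmpthese';

use constant STRING => 'X';

my $plain_re   = qr{^X$};      # no capture group, as in the benchmark above
my $capture_re = qr{^(X)$};    # hypothetical variant with one capture group

# Hypothetical test names: each *_cap case pairs with a capture-free one.
my %tests = (
    re_re        => sub { STRING =~ /^X$/ },
    re_re_cap    => sub { STRING =~ /^(X)$/ },
    qr_my        => sub { STRING =~ $plain_re },
    qr_my_cap    => sub { STRING =~ $capture_re },
    qr_my_re_cap => sub { STRING =~ /$capture_re/ },
);

# A negative count asks Benchmark to run each test for at least
# that many CPU seconds, rather than a fixed number of iterations.
cmpthese(-1, \%tests);
```

Keeping each capture case next to its capture-free counterpart should make it easier to see whether any difference comes from the capture itself or from how the regex was built.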

The full results are in the spoiler, below. I needed to stretch the console window to 220 characters to avoid wrapping.

— Ken