PerlMonks  

Re: regex gotcha moving from 5.8.8 to 5.30.0?

by swl (Priest)
on Feb 10, 2021 at 07:20 UTC (#11128173)


in reply to regex gotcha moving from 5.8.8 to 5.30.0?

There is some repetition across your regexes that can be factored out. This may be related to the underlying cause.

Each regex starts with the same pattern: \s* ^ \s*. Handling that prefix once, before the if conditions run, makes things about 250-260% faster (roughly 3.6 times the rate) under Strawberry Perl 5.32, testing with a file of 500 begfoo sets generated using the code in 11128154. See sub parse_foo2 in the code; parse_foo1 is from the OP.
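To show the factoring idea in isolation, here is a minimal sketch on a toy grammar. The tokenize sub and its NUM/WORD token types are hypothetical and not part of the OP's format; only the \G plus /gc pattern is the point. The shared whitespace is consumed once per iteration, so each alternative starts directly at the token.

```perl
use strict;
use warnings;

# Hypothetical toy tokenizer: the whitespace prefix is matched once per
# iteration instead of being repeated at the start of every alternative.
sub tokenize {
    my ($text) = @_;
    my @tokens;

    # End-of-input test comes first, as in parse_foo2.
    while (not $text =~ /\G \s* \z/gcx) {
        $text =~ /\G \s* /gcx;    # march through any white space

        if    ($text =~ /\G (\d+) /gcx)    { push @tokens, "NUM:$1" }
        elsif ($text =~ /\G ([a-z]+) /gcx) { push @tokens, "WORD:$1" }
        else  { die "unknown syntax at offset " . pos($text) . "\n" }
    }
    return @tokens;
}

print join(' ', tokenize("  foo 42\n bar ")), "\n";
```

The /c flag keeps pos() where it was when an alternative fails, which is what lets the next elsif try again from the same offset. The end-of-input test is deliberately placed before the whitespace skip: as I understand perlre, a zero-length m//g match prevents the next match at the same position from also being zero-length, so testing for the end right after an empty whitespace match could fail.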

I also converted the bare block with redo into a while loop, mostly for style. The addition of the /aa flag (on the begfoo regex only) makes a slight difference, which could just be noise.
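For anyone unsure what /aa changes, a small sketch follows. The hit helper and the two test characters are my own choices for illustration. /a restricts \s, \d, \w and the POSIX classes to ASCII, and the second a additionally forbids ASCII/non-ASCII matches under /i.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the pattern matches the string, else 0.
sub hit { my ($str, $re) = @_; return $str =~ $re ? 1 : 0 }

my $em_space = "\x{2003}";    # EM SPACE: whitespace under Unicode rules
my $kelvin   = "\x{212A}";    # KELVIN SIGN: case-folds to "k"

printf "\\s   default: %d  /a: %d  /aa: %d\n",
    hit($em_space, qr/\A\s\z/),
    hit($em_space, qr/\A\s\z/a),
    hit($em_space, qr/\A\s\z/aa);

printf "k/i  default: %d  /a: %d  /aa: %d\n",
    hit($kelvin, qr/\Ak\z/i),
    hit($kelvin, qr/\Ak\z/ia),
    hit($kelvin, qr/\Ak\z/iaa);
```

Since the begfoo regex uses neither /i nor any non-ASCII literal, the only effect there should be on \s and \S, as far as I can tell, which fits with the timing difference being small.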

Note that I have not checked if all begfoo sets are parsed correctly...

I also don't have a version 5.8 to work with.

    use 5.022;
    use warnings;
    use Benchmark qw {:all};

    # Slurp the whole test file into one string.
    open my $fh, '<', 'x.txt' or die "Cannot open x.txt: $!";
    my $data = do {local $/ = undef; <$fh>};

    cmpthese (
        10,
        {
            one => sub {parse_foo1($data)},
            two => sub {parse_foo2($data)},
        }
    );

    # Parser from the OP: every regex repeats the \s* ^ \s* prefix.
    sub parse_foo1 {
        my ($text) = @_;
        my $name;

        {
            last if $text =~ /\G \s* \Z/gcmsx;

            if ($text =~ /\G \s* ^ \s* begfoo \s+ (\S+?) \s* \( \s* (.*?) \s* \) \s* ;/gcmsx) {
                $name = $1
            }
            elsif ($text =~ /\G \s* ^ \s* endfoo /gcmsx) {
            }
            elsif ($text =~ /\G \s* ^ \s* \S+ \s+ .*? \s* ;/gcmsx) {
            }
            else {
                die "ERROR: unknown syntax\n"
            }

            redo;
        }
        print "LAST FOO1: $name\n";
    }

    # Same parser with the leading whitespace consumed once per iteration.
    sub parse_foo2 {
        my ($text) = @_;
        my $name;

        while (not $text =~ /\G \s* \Z/gcmsx) {
            $text =~ /\G \s* /gcsmx;  # march through any white space

            if ($text =~ /\G begfoo \s+ (\S+?) \s* \( \s* (.*?) \s* \) \s* ;/gcmsxaa) {
                $name = $1
            }
            elsif ($text =~ /\G endfoo /gcmsx) {
            }
            elsif ($text =~ /\G \S+ \s+ .*? \s* ;/gcmsx) {
            }
            else {
                die "ERROR: unknown syntax\n"
            }
        }
        print "LAST FOO2: $name\n";
    }

Example results:

    v5.32.0
    LAST FOO1: FOO_500
    LAST FOO1: FOO_500
    LAST FOO1: FOO_500
    LAST FOO1: FOO_500
    LAST FOO1: FOO_500
    LAST FOO1: FOO_500
    LAST FOO1: FOO_500
    LAST FOO1: FOO_500
    LAST FOO1: FOO_500
    LAST FOO1: FOO_500
    LAST FOO2: FOO_500
    LAST FOO2: FOO_500
    LAST FOO2: FOO_500
    LAST FOO2: FOO_500
    LAST FOO2: FOO_500
    LAST FOO2: FOO_500
    LAST FOO2: FOO_500
    LAST FOO2: FOO_500
    LAST FOO2: FOO_500
    LAST FOO2: FOO_500
          Rate  one  two
    one 2.08/s   -- -72%
    two 7.53/s 261%   --
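On the caveat that the begfoo sets were not checked for correct parsing: one way to check would be to collect every name from both parsers and compare the lists. A rough sketch follows, assuming a small hand-written sample rather than the generated file. The names_foo1 and names_foo2 subs and the sample text are hypothetical; the regexes follow parse_foo1 and parse_foo2.

```perl
use strict;
use warnings;

# Hypothetical variant of parse_foo1 that returns every begfoo name.
sub names_foo1 {
    my ($text) = @_;
    my @names;
    {
        last if $text =~ /\G \s* \Z/gcmsx;
        if ($text =~ /\G \s* ^ \s* begfoo \s+ (\S+?) \s* \( \s* (.*?) \s* \) \s* ;/gcmsx) {
            push @names, $1;
        }
        elsif ($text =~ /\G \s* ^ \s* endfoo /gcmsx) { }
        elsif ($text =~ /\G \s* ^ \s* \S+ \s+ .*? \s* ;/gcmsx) { }
        else { die "ERROR: unknown syntax\n" }
        redo;
    }
    return @names;
}

# Hypothetical variant of parse_foo2 that returns every begfoo name.
sub names_foo2 {
    my ($text) = @_;
    my @names;
    while (not $text =~ /\G \s* \Z/gcmsx) {
        $text =~ /\G \s* /gcsmx;    # march through any white space
        if ($text =~ /\G begfoo \s+ (\S+?) \s* \( \s* (.*?) \s* \) \s* ;/gcmsxaa) {
            push @names, $1;
        }
        elsif ($text =~ /\G endfoo /gcmsx) { }
        elsif ($text =~ /\G \S+ \s+ .*? \s* ;/gcmsx) { }
        else { die "ERROR: unknown syntax\n" }
    }
    return @names;
}

# Assumed sample input; the real file comes from the generator in 11128154.
my $sample = join "\n",
    'begfoo FOO_1 (a, b);', '  bar x y;', 'endfoo', '',
    'begfoo FOO_2 (c);',    '  baz q;',   'endfoo', '';

my @one = names_foo1($sample);
my @two = names_foo2($sample);
my $verdict = "@one" eq "@two" ? 'same names' : 'MISMATCH';
print "foo1: @one\nfoo2: @two\n$verdict\n";
```

One difference worth knowing about: parse_foo2 no longer requires each statement to start at the beginning of a line, because the ^ went away along with the shared prefix. Input with several statements on one line would make parse_foo1 die while parse_foo2 accepts it, so the two are only equivalent when every statement starts on its own line.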
