Beefy Boxes and Bandwidth Generously Provided by pair Networks
Just another Perl shrine
 
PerlMonks  

Re^3: Performance of possessive quantifiers

by hipowls (Curate)
on Jan 27, 2008 at 22:55 UTC ( [id://664577]=note: print w/replies, xml ) Need Help??


in reply to Re^2: Performance of possessive quantifiers
in thread Performance of possessive quantifiers

Good question, I suspect that it is an optimization. Trying

my $str = "bea" x 100; my $re = qr/(?:be|ea|a)/; sub atomic { use re 'debug'; say 'Matched $re+\d' if $str =~ m/$re+\d/; } sub possessive { use re 'debug'; say 'Matched $re++\d' if $str =~ m/$re++\d/; } atomic(); possessive();
The regexes compile to
# atomic Compiling REx "(?-xism:(?:be|ea|a))+\d" Final program: 1: CURLYX[0] {1,32767} (14) 3: TRIE-EXACT[abe] (13) <be> <ea> <a> 13: WHILEM[1/1] (0) 14: NOTHING (15) 15: DIGIT (16) 16: END (0) minlen 2 #possessive Compiling REx "(?-xism:(?:be|ea|a))++\d" Final program: 1: SUSPEND (19) 3: CURLYX[0] {1,32767} (16) 5: TRIE-EXACT[abe] (15) <be> <ea> <a> 15: WHILEM[1/1] (0) 16: NOTHING (17) 17: SUCCEED (0) 18: TAIL (19) 19: DIGIT (20) 20: END (0) minlen 2 # regex.ato and regex.pos contain the output including # the Compiling REx message michael@lnx-main:working> wc regex.ato regex.pos 10650 70471 1837158 regex.ato 229368 1989603 47988305 regex.pos
As you can see it is doing a lot more. I don't know why but it will be interesting to find out;)

Update 1: It may be something to do with caching. Looking in the re debug output I find

michael@lnx-main:working> grep Detected regex.ato whilem: Detected a super-linear match, switching on caching... michael@lnx-main:working> grep Detected regex.pos whilem: Detected a super-linear match, switching on caching... # Now look for this line # whilem: (cache) already tried at this position... michael@lnx-main:working> grep '(cache)' regex.ato | wc 298 2086 26426 michael@lnx-main:working> grep '(cache)' regex.pos | wc 0 0 0
At this point I am prepared to confess that I will struggling a bit to dig in deeper;-)

Update 2: The SUSPEND/TAIL pair look like where any interested monk should start. In perldebguts I found this

# Do nothing NOTHING no Match empty string. # A variant of above which delimits a group, thus stops optimizations TAIL no Match empty string. Can jump here from outside.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://664577]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others browsing the Monastery: (2)
As of 2024-04-26 04:13 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found