
Re^2: (Ab)using the Regex Engine

by jo37 (Pilgrim)
on May 26, 2020 at 10:41 UTC (#11117272)

in reply to Re: (Ab)using the Regex Engine
in thread (Ab)using the Regex Engine

A very good point, and indeed the ultimate answer to my question, since it points to Special Backtracking Control Verbs.

It states that (*FAIL) can be used to force the engine into backtracking and that this is equivalent to (?!). So version 2 and yours are basically the same and both are guaranteed to work. The trickery from version 3 is not needed.
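To make the equivalence concrete, here is a minimal sketch (not the code from versions 2 or 3, which are not reproduced in this note). It borrows the 'aba' string and the [ab]+ pattern from vr's reply in this thread; the array names are made up for illustration. Both statements record each candidate split in a code block and then force the engine to backtrack, once with (*FAIL) and once with (?!):

```perl
use strict;
use warnings;

my $str = 'aba';
my (@with_verb, @with_lookahead);   # hypothetical names, for illustration only

# The code block records the current split, then (*FAIL) rejects it,
# so the engine backtracks and tries the next way to split the string.
$str =~ /^ ([ab]+) ([ab]+) $ (?{ push @with_verb, "$1-$2" }) (*FAIL) /x;

# Same pattern with (?!), which perlre documents as equivalent to (*FAIL).
$str =~ /^ ([ab]+) ([ab]+) $ (?{ push @with_lookahead, "$1-$2" }) (?!) /x;

print "(*FAIL): @with_verb\n";
print "(?!):    @with_lookahead\n";
```

Neither statement ever matches; the only useful effect is the side effect in the code block. Both arrays should end up holding the same splits.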

So in the end it is "use" and not "abuse".



Replies are listed 'Best First'.
Re^3: (Ab)using the Regex Engine
by vr (Curate) on May 26, 2020 at 15:51 UTC

    I'd call it "abuse". My bet is that this pattern of application is well-known and tolerated for the sake of the critical mass of existing "cool examples of (ab)using the re-engine", and is therefore safe to use in the future :). A stand-alone (*F) is guaranteed to fail, so there's no need to "force to backtrack" while staying in the same branch; and as there are no other branches in your example, the whole match should have been optimized away. On the other hand, something like (?(?{CODE})(*F)), with the result of CODE depending on the sub-matches so far, is legitimate use and another matter entirely, but that is not the case here.
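For contrast, a minimal sketch of the conditional form (?(?{CODE})(*F)) mentioned above. The condition used here (reject a split whose first part is longer than the second) and the variable names are assumptions chosen purely for illustration:

```perl
use strict;
use warnings;

my $str    = 'aba';
my $result = 'no match';   # hypothetical name, for illustration only

# CODE inspects the captures made so far. When it returns true, the "yes"
# branch (*F) rejects this split and the engine backtracks to the next one;
# when it returns false, the conditional matches the empty string.
if ($str =~ /^ ([ab]+) ([ab]+) $ (?(?{ length($1) > length($2) })(*F)) /x) {
    $result = "$1-$2";
}

print "$result\n";
```

Here the failure depends on what has been matched, so the pattern can still succeed on a later split.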

    My impression is that this tolerance goes so far that injecting (*F) sometimes (but not always) makes the engine fail to fail early, which is funny.

    my $match = qr[([ab]+)([ab]+)];
    my $str = 'aba';

    $str =~ /^ $match $ (?{ print "1: $1-$2\n" }) a /x;
    $str =~ /^ $match $ (?{ print "2: $1-$2\n" }) b /x;
    $str =~ /^ $match $ (?{ print "3: $1-$2\n" }) (*F) b /x;
    $str =~ /^ $match $ (?{ print "4: $1-$2\n" }) (*F) .. /x;

    __END__
    1: ab-a
    1: a-ba
    3: ab-a
    3: a-ba

      Probably my statement in Re^2: (Ab)using the Regex Engine about "use" vs. "abuse" was unclear and I should have quoted the relevant section from perlre:

      (*FAIL) (*F) (*FAIL:arg)
      This pattern matches nothing and always fails. It can be used to force the engine to backtrack. It is equivalent to (?!), but easier to read. In fact, (?!) gets optimised into (*FAIL) internally. You can provide an argument so that if the match fails because of this FAIL directive the argument can be obtained from $REGERROR. It is probably useful only when combined with (?{}) or (??{}).
      My point was that I realized that
      (?{CODE})(?!)
      (?{CODE})(*F)
      are documented as equivalently forcing the engine to backtrack and are just what I was looking for. I don't call this "abuse", but YMMV.

      The example with a character class was just historical and the one with (??{CODE}) was a result of my own ignorance.


