Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

What does it mean that a "pattern cannot be reversed?"

by tphyahoo (Vicar)
on Jan 06, 2005 at 17:42 UTC ( [id://420007]=perlquestion: print w/replies, xml ) Need Help??

tphyahoo has asked for the wisdom of the Perl Monks concerning the following question:

Monks, I was looking at Synopsis 5, which is the intro to perl 6 regexes, or as they are now called, "rules." We have here

# The special named assertions include: / <before pattern> / # was /(?=pattern)/ / <after pattern> / # was /(?<pattern)/ / <ws> / # match whitespace by :w rules / <sp> / # match a space char The after assertion implements lookbehind by reversing the syntax tree + and looking for things in the opposite order going to the left. It i +s illegal to do lookbehind on a pattern that cannot be reversed.

What does it mean that a pattern cannot be reversed?

Does this have anything to do with the unfortunate situation that variable length lookbehinds are not allowed in Perl 5? Will variable length lookbehinds be allowed in Perl 6? (Though I'm not sure this is even relevant to the concept of pattern reversability, just a guess...)

Googling on this concept was unhelpful.

Thanks for your wisdom!

thomas.

Replies are listed 'Best First'.
Re: What does it mean that a "pattern cannot be reversed?"
by TimToady (Parson) on Jan 07, 2005 at 00:41 UTC
    To answer your questions in reverse order: yes, yes, and a non-reversible pattern would be one with strange side effects, or one that calls out to a subrule that can't be reversed. Most ordinary patterns can be reversed, and of course all constant patterns can be reversed. It's not so obvious that you can look for .*? and such in reverse, but you can.

    The only alternative is to reparse the string from the beginning (or from somewhere earlier in the string) and see if you happen to come out to the same place. This strikes me like one of those super-inefficient sorting algorithms where you shuffle the elements into a random order and then check to see if you happened to get them in order this time. If not, try a different guess. The intent of the verbiage in the synopsis is to rule out such a guessing implementation, which people tend to fall into because the notion of reversing the syntax tree is rather exotic until you think about it a while. But tree reversal is the right way to do it in a backtracking rule engine, so that's how Perl 6 will do it.

      Thanks...

      What would be interesting, if anybody is up for it, is an actual concrete example of a nonreversible patttern, in either perl5 or perl6 syntax (or both).

      (?<=(a|aa))
      is not permitted in perl5, because it is a variable length lookbehind. Glad to hear we get these in perl6.

      Could someone throw out an example of what we can't do in perl6?

      Maybe this is academic, but I like knowing the limits of pattern matching, as well as its power.

      UPDATE: this seems to have just been answered, or at least a good attempt was made, by hv below, who must have been posting same time I was.

Re: What does it mean that a "pattern cannot be reversed?"
by hv (Prior) on Jan 07, 2005 at 13:28 UTC

    Simple patterns can be reversed: /foo/ can be reversed to /oof/.

    Many more complex patterns can be reversed, though it isn't always obvious whether they'd match in exactly the same way: consider for example:

    "mississippi" ~~ /p<after (i.*?p)>/; # reverse to /(p.*?i)/ ?? # or maybe /(p.*i)/ ??
    in which you might expect the parens to catch "ississip" (ie start the match at the earliest available position) rather than "ip".

    However other examples cannot be reversed at all: in general, this might include anything that involves code assertions. Precisely what constitutes a reversible pattern has not yet been fully characterised, and to some extent it will depend on what we decide are the invariants that must be preserved as in the example above.

    Hugo

      I think it should keep .*? as .*? on the theory that the actual underlying abstraction is "longest/shortest", not "earliest/latest".

      I think code assertions can still work unless they make assumptions like "$1 is always bound before $2". (And I don't think we should reverse $1 and $2 for them. Reversing a lexical scope would be evil.)

      But basically, I don't know what the limits are yet. All I know is that Perl 5 sets them too tight, and we can push them out a little. Rule matching can be more powerful than a locomotive, but there will always be a bit of kryptonite in the world. Language designers tend to concentrate on the locomotive rather than the kryptonite.

        FWIW, relevant reading to this issue at sexeger.

        SNIP:

        "If we're able to look at the specific nodes of a regular expression, then we should be able to create a regex that works from the BACK of a string to the FRONT. What do I mean?

        ....."

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://420007]
Approved by Jenda
Front-paged by grinder
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others chanting in the Monastery: (5)
As of 2024-03-29 09:25 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found