PerlMonks  

Re: Perl Regular Expression inconsistency

by GrandFather (Saint)
on Mar 14, 2006 at 11:09 UTC


in reply to Perl Regular Expression inconsistency

I see the same behaviour with ActiveState Perl v5.8.7. Even more interesting, there is an odd inconsistency in the way different metacharacters are handled:

  • /{/ no error
  • /}/ no error
  • /]/ no error
  • /[/ error
  • /)/ error
  • /(/ error
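
For anyone who wants to reproduce the list above, here is a minimal sketch. It compiles each lone metacharacter with qr// inside an eval and reports whether compilation died. The names @chars and %result are mine, purely for illustration, and the outcome may differ on other perl versions (later releases tightened the rules for an unescaped '{' in some positions).

```perl
use strict;
use warnings;

# Lone metacharacters to try, in the same order as the list above.
my @chars = ('{', '}', ']', '[', ')', '(');

# Maps each character to 'no error' or 'error' (hypothetical helper hash).
my %result;

for my $char (@chars) {
    # qr// dies at run time if the interpolated pattern fails to compile;
    # the eval block traps that and yields undef instead.
    my $re = eval { qr/$char/ };
    $result{$char} = defined $re ? 'no error' : 'error';
    printf "/%s/ %s\n", $char, $result{$char};
}
```

The compile error message itself is left in $@ after a failed eval, so printing it inside the loop shows exactly what perl objected to.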

DWIM is Perl's answer to Gödel

Re^2: Perl Regular Expression inconsistency
by wazoox (Prior) on Mar 14, 2006 at 11:33 UTC
    Well, it doesn't seem that odd to me. [ starts a character class, so it must be escaped; ( starts a group, so it must be escaped; { doesn't mean anything special, so it doesn't have to be escaped.
      Doesn't '{' start a {} quantifier?
        Yes, you're right; however, it must then follow some token (sorry, I'm not very proficient with regexps :)
        I don't think it is an inconsistency. { doesn't represent anything once you're inside a regex (i.e. /../), and hence it need not be escaped.

      By that logic an unmatched ')' doesn't mean anything so shouldn't need to be escaped, yet it generates an error.

      japhy's explanation is satisfying for an unmatched '{' and '}' and extends to an unmatched ']' too. Why not extend the argument to an unmatched ')'? Is that so much more likely to be a "You Really Mean That?" error that it's worth special-casing?


      DWIM is Perl's answer to Gödel

        I think this is a kind of Huffman coding issue - in regular expressions, parens are far more often used for capturing than for literal matching; on balance a lone paren is more likely to be intended to be part of a capturing pair, so it makes sense to assume that and raise an error.

        Braces on the other hand are rarely used in their meta sense, so it helps more people to assume that braces are intended for a literal match except when they strictly match the pattern required to express a repetition count.

        That I think is the intention, at least. You could argue that it should be inverted - that because the meta braces are used more rarely, the average user needs more help to use them correctly - but for this kind of trade-off perl tends to favour the expert user rather than the learner.

        Hugo

        I think the answer lies in the source code :) Perhaps ')' doesn't work because of some obscure compatibility problem or... OK, I'll check the perl source, promise!
        Update: I've quickly browsed through the perl source (toke.c especially) and I didn't find a clear difference... Well, I'll dig deeper later.
