PerlMonks  

Looks like switching from
my $regex = '(2[0-4]|1?[0-9])?[0-9]|25[0-5]';
to
my $regex = qr/(2[0-4]|1?[0-9])?[0-9]|25[0-5]/;
seems to fix it. I don't immediately see why though.

It's a regex metacharacter/operator precedence issue.

The regex | (alternation) operator has the lowest precedence among regex operators. When a raw string like
    my $regex = '(2[0-4]|1?[0-9])?[0-9]|25[0-5]';
is interpolated into
    /^$regex$/
the final regex becomes
    /^(2[0-4]|1?[0-9])?[0-9]|25[0-5]$/

The ^ start-of-string assertion is grouped and evaluated with the (2[0-4]|1?[0-9])?[0-9] expression and separated by the alternation from the 25[0-5]$ expression. In other words, the regex will match any string that has at least one [0-9] at the start (everything else in that branch is optional) or a 25[0-5] at the end, and nothing else in the string matters!

c:\@Work\Perl\monks>perl -wMstrict -le
"my $regex = '(2[0-4]|1?[0-9])?[0-9]|25[0-5]';
 while (<>) {
     chomp;
     if ($_ =~ /^$regex$/) { print qq{'$_' matched}; }
     else                  { print qq{'$_' did not match}; }
 }
"
100
'100' matched
z100
'z100' did not match
z255
'z255' matched
z250
'z250' matched
100z
'100z' matched
99
'99' matched
9999999
'9999999' matched
99Yikes!99
'99Yikes!99' matched
1
'1' matched
11
'11' matched
111
'111' matched
22
'22' matched
222
'222' matched
33
'33' matched
333
'333' matched
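If the pattern has to stay a raw string for some reason, one way to restore the intended grouping is to wrap the interpolated variable in a non-capturing group at the point of use. This is only a minimal sketch, not something from the thread; matches_anchored is a hypothetical helper name.

```perl
use strict;
use warnings;

# Raw string, exactly as in the original post.
my $regex = '(2[0-4]|1?[0-9])?[0-9]|25[0-5]';

# Hypothetical helper: the explicit (?:...) keeps ^ and $ attached to the
# whole alternation instead of to one branch each.
sub matches_anchored {
    my ($string) = @_;
    return $string =~ /^(?:$regex)$/ ? 1 : 0;
}

for my $string (qw(100 255 z255 100z 99Yikes!99 256)) {
    printf "'%s' %s\n", $string,
        matches_anchored($string) ? 'matched' : 'did not match';
}
```

With the group in place, only 100 and 255 from this list should match.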

In contrast, choroba used the qr// operator to define $regex, which makes it a Regexp object. (Update: See qr// in Regexp Quote-Like Operators in perlop.) This is not the same as a raw string! Among other things, the qr// operator adds a non-capturing (?:pat) group around the whole expression (recent perls stringify it as (?^:pat)) that, in this application, preserves the desired association between the start- and end-of-string assertions after interpolation:
    my $regex = qr/(2[0-4]|1?[0-9])?[0-9]|25[0-5]/;
becomes
    (?:(2[0-4]|1?[0-9])?[0-9]|25[0-5])
and is interpolated into
    /^$regex$/
as
    /^(?:(2[0-4]|1?[0-9])?[0-9]|25[0-5])$/
which can be read as "start-of-string, then one of a set of alternatives covering the range 0-255, then end-of-string" and which gives the desired number range discrimination.

c:\@Work\Perl\monks>perl -wMstrict -le
"my $regex = qr/(2[0-4]|1?[0-9])?[0-9]|25[0-5]/;
 while (<>) {
     chomp;
     if ($_ =~ /^$regex$/) { print qq{'$_' matched}; }
     else                  { print qq{'$_' did not match}; }
 }
"
0
'0' matched
1
'1' matched
100
'100' matched
1000
'1000' did not match
25
'25' matched
255
'255' matched
256
'256' did not match
a1
'a1' did not match
1a
'1a' did not match
11
'11' matched
111
'111' matched
222
'222' matched
333
'333' did not match
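The wrapper group is easy to see by printing the compiled object. A small sketch follows. The exact text inside the group's opening depends on the Perl version (recent perls print (?^:...), older ones something like (?-xism:...)), so no fixed output is shown.

```perl
use strict;
use warnings;

my $regex = qr/(2[0-4]|1?[0-9])?[0-9]|25[0-5]/;

# A qr// value is a reference blessed into the Regexp class.
print ref($regex), "\n";

# Stringifying it shows the pattern together with the non-capturing
# wrapper group that qr// supplies.
print "$regex\n";
```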

Bottom line: Wherever possible, prefer qr// to raw strings for regexes.

Please see perlre, perlretut, and perlrequick.

Update: Incidentally, the regex qr/(2[0-4]|1?[0-9])?[0-9]|25[0-5]/ does not match the zero-padded three-digit strings 000 001 012 etc. (Update: The regex does match 00 01 02 etc.) If this is an issue, I suggest
    qr{ [01]? \d? \d | 2 [0-4] \d | 25 [0-5] }xms
instead, but whatever you use, verify it with something like Test::More as choroba did!
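A verification along those lines might look like the sketch below. It is not choroba's actual test, just one possible shape. It uses the padded-number pattern suggested above, and the list of strings to reject is an arbitrary selection.

```perl
use strict;
use warnings;
use Test::More;

my $regex = qr{ [01]? \d? \d | 2 [0-4] \d | 25 [0-5] }xms;

# Every integer from 0 to 255 should match, both bare and zero-padded
# to three digits.
for my $n (0 .. 255) {
    my $padded = sprintf '%03d', $n;
    like( $n,      qr/^$regex$/, "$n matches" );
    like( $padded, qr/^$regex$/, "$padded matches" );
}

# Out-of-range numbers and strings with stray characters should not.
unlike( $_, qr/^$regex$/, "'$_' is rejected" )
    for qw(256 299 1000 a1 1a z255 99Yikes!99), '';

done_testing();
```

Testing the whole 0..255 range is cheap, and it catches the kind of edge case (such as the zero-padded forms) that a handful of hand-picked inputs can miss.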


Give a man a fish:  <%-{-{-{-<


In reply to Re^3: Is this a bug in perl regex engine or in my brain? by AnomalousMonk
in thread Is this a bug in perl regex engine or in my brain? by nikmit
