PerlMonks  

Re^2: Strange regex to test for newlines: /.*\z/

by bloonix (Monk)
on May 22, 2007 at 09:39 UTC ( [id://616709] )


in reply to Re: Strange regex to test for newlines: /.*\z/
in thread Strange regex to test for newlines: /.*\z/

I don't quite agree with the statement about what '.*' should match.

As I understand it, '.' never matches a newline unless the /s modifier is used. That means '.+' and '.*' are just repeated matches of '.' and should still not match newlines.
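A minimal sketch of that reading of '.', assuming a made-up two-line string; the variable names are only for illustration:

```perl
use strict;
use warnings;

my $text = "foo\nbar";

# Without /s, '.' stops at the newline, so '.+' only covers the first line.
my ($no_s) = $text =~ /(.+)/;

# With /s, '.' also matches "\n", so '.+' runs to the end of the string.
my ($with_s) = $text =~ /(.+)/s;

printf "without /s: %d chars\n", length $no_s;      # 3
printf "with /s:    %d chars\n", length $with_s;    # 7
```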

Now, I understand $ and \z as follows: $ matches at the end of the string or just before a newline at the end (to quote perldoc), while \z matches only at the very end and never before that newline.
print "foo matched\n"         if "foo\n"     =~  /^foo$/;
print "bar matched\n"         if "bar\n"     =~  /^bar$ \n/x;   # $ matches before the final newline
print "baz didn't match\n"    if "baz\n"     !~  /^baz\z/;
print "foobar matched\n"      if "foobar\n"  =~  /^foobar\n\z/; # \z matches after the newline

print "match foo\n"         if "foo\n" =~ /.*$/;     # .* skips the newline and $  is before the newline
print "doesn't match bar\n" if "bar\n" !~ /.*\z/;    # .* skips the newline and \z is after  the newline
print "match baz\n"         if "baz\n" =~ /.?\z/;    # but what the hell happens here?

for ( qr/(.?)\n\z/, qr/(.?)\z/ ) {
   "hello world\n" =~ $_;
   print "-$1-\n";
}

This prints:

-d-
--
It seems that '.?' skips the newline as expected and the search then continues after the newline, where '.?\z' matches, because it keeps searching _until_ '\z'. It also seems that '.*' matches only up to the newline and not between '\n' and '\z'. One difference is that '.*' can take any number of characters while '.?' takes at most one (both are greedy). Maybe I misunderstand it.
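To see where '.?\z' actually matches, the offsets in @- and @+ can be printed after a successful match. This is only a sketch: where_matched is a hypothetical helper written for this post, not something from perldoc. It shows that the engine also tries the position right after the final newline, where '.?' matches zero characters and '\z' succeeds.

```perl
use strict;
use warnings;

# Hypothetical helper: returns [start offset, matched text],
# or undef if the pattern does not match.
sub where_matched {
    my ($string, $re) = @_;
    return unless $string =~ $re;
    return [ $-[0], substr($string, $-[0], $+[0] - $-[0]) ];
}

# "baz\n" has 4 characters; the match is empty and sits at the very end.
my $hit = where_matched("baz\n", qr/.?\z/);
printf "offset %d, length %d\n", $hit->[0], length $hit->[1];     # offset 4, length 0

# With an explicit \n the match starts at the 'd' and covers "d\n".
my $hit2 = where_matched("hello world\n", qr/(.?)\n\z/);
printf "offset %d, length %d\n", $hit2->[0], length $hit2->[1];   # offset 10, length 2
```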

Re^3: Strange regex to test for newlines: /.*\z/
by xicheng (Sexton) on May 22, 2007 at 18:17 UTC
    $ and \Z work pretty much the same in normal mode: both match at the end of the string or just before a string-ending newline. The difference between them shows up in multiline mode, when you add the /m modifier.

    \z means the real end of the string, that is, after any string-ending newline.

    If you use the /s modifier, the results differ more, but that's mainly because '.' changes its behavior, not the three end-of-string anchors.

    Check the following snippets (the ones marked '# ok #' print "match"):
    perl -e 'print "match\n" if "foo\n" =~ /.+$/' # ok #
    perl -e 'print "match\n" if "foo\n" =~ /.+\z/'
    perl -e 'print "match\n" if "foo\n" =~ /.+\Z/' # ok #
    perl -e 'print "match\n" if "foo\n\n\n" =~ /.+\Z/'
    perl -e 'print "match\n" if "foo\n\n\n" =~ /.+\z/'
    perl -e 'print "match\n" if "foo\n\n\n" =~ /.+$/'
    perl -e 'print "match\n" if "foo\n\n\n" =~ /.+$/m' # ok #
    perl -e 'print "match\n" if "foo\n\n\n" =~ /.+\z/m'
    perl -e 'print "match\n" if "foo\n\n\n" =~ /.+\Z/m'
    perl -e 'print "match\n" if "foo\n\n\n" =~ /.+\Z/s' # ok #
    perl -e 'print "match\n" if "foo\n\n\n" =~ /.+\z/s' # ok #
    perl -e 'print "match\n" if "foo\n\n\n" =~ /.+$/s' # ok #
    BTW, when comparing \z, \Z and $, it's probably better to avoid the .* or .? quantifiers the way you used them in your examples.
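As a rough sketch of that advice, the three anchors can be compared with a literal pattern so that '.' plays no role at all. The test strings and the @results array are made up for illustration:

```perl
use strict;
use warnings;

# Each case: [ subject string, pattern, label ]
my @cases = (
    [ "foo\n",      qr/foo$/,   'foo$   on "foo\n"'      ],
    [ "foo\n",      qr/foo\Z/,  'foo\Z  on "foo\n"'      ],
    [ "foo\n",      qr/foo\z/,  'foo\z  on "foo\n"'      ],
    [ "foo\nbar\n", qr/foo$/,   'foo$   on "foo\nbar\n"' ],
    [ "foo\nbar\n", qr/foo$/m,  'foo$/m on "foo\nbar\n"' ],
    [ "foo\nbar\n", qr/foo\Z/m, 'foo\Z/m on "foo\nbar\n"' ],
    [ "foo\nbar\n", qr/foo\z/m, 'foo\z/m on "foo\nbar\n"' ],
);

my @results;
for my $case (@cases) {
    my ($string, $re, $label) = @$case;
    my $matched = $string =~ $re ? 1 : 0;
    push @results, $matched;
    printf "%-26s %s\n", $label, $matched ? "match" : "no match";
}
```

Only the first two cases and the /m variant of $ match: /m changes where $ may match but leaves \Z and \z alone.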

    BTW, my previous statement about \Z contained an error, and I have updated that post.

    Regards,
    Xicheng
