Re^2: help with lazy matching

Re^3: help with lazy matching by nlwhittle (Beadle) on Jan 05, 2015 at 22:27 UTC
The non-greedy modifier simply means "match as little as possible while still getting a successful match". Perl regular expressions always find the leftmost match first; in your case, that match starts at the first slash. Where the non-greedy operator would have worked, for example, is if you wanted to match only 'foo'. Then you could write: `if ( /\/(.+?)\// )` This matches the first slash, then non-greedily matches any other characters until another slash is reached. If you didn't use the non-greedy modifier here, you would match everything between the first and last slash (i.e. 'foo/bar/baz'). --Nick
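A minimal sketch of the difference described above. The input string '/foo/bar/baz/' is an assumption (the thread only implies its shape), and `m{...}` delimiters are used instead of `/.../` purely to avoid escaping the slashes.

```perl
use strict;
use warnings;

# Assumed sample input; the thread only implies its shape.
my $path = '/foo/bar/baz/';

# Non-greedy: after the first slash, take as little as possible
# before the next slash.
my ($lazy) = $path =~ m{/(.+?)/};

# Greedy: after the first slash, take as much as possible while
# still leaving a slash to finish the match.
my ($greedy) = $path =~ m{/(.+)/};

print "lazy:   $lazy\n";
print "greedy: $greedy\n";
```

Both matches begin at the first slash; only the point where the capture ends differs.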
Re^4: help with lazy matching by Special_K (Monk) on Jan 05, 2015 at 23:07 UTC
I think the source of my confusion was not knowing that regular expressions in Perl always start matching from the left side. If the regular expression could start matching from anywhere, then using the non-greedy modifier could give the behavior I was expecting in my original post, i.e. matching "bar".
Re^5: help with lazy matching by davido (Cardinal) on Jan 05, 2015 at 23:51 UTC
This is not a Perl-specific issue. The "leftmost" rule is one of the features of an NFA-based regular expression engine, which includes Perl, PHP, Python, and most other commonly used regular expression implementations. So now that you're aware of it with respect to Perl, you've learned something that can be applied to most other languages that implement regexes as well! :) Dave
Re^3: help with lazy matching by Anonymous Monk on Jan 05, 2015 at 22:58 UTC
I like the description in the Camel: ... regular expressions will try to match as early as possible. This even takes precedence over being greedy. Since scanning happens left to right, the pattern will match as far left as possible, even if there is some other place where it could match longer. (Regular expressions may be greedy, but they aren’t into delayed gratification.) ... (copied from the free sample material on the O'Reilly website, `http://cdn.oreillystatic.com/oreilly/booksamplers/9780596004927_sampler.pdf`, book page 44) Another key thing to realize is that the `$` does not change the behavior to scanning from right-to-left.
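A small sketch of both points, using made-up strings that do not come from the thread: the earliest possible match wins even when a longer one exists further right, and a trailing `$` only constrains where the match must end, not where the engine starts looking.

```perl
use strict;
use warnings;

# Leftmost beats longest: the single X at offset 1 wins over the
# longer run of X's that appears later in the string.
my $text = 'aXbXXXc';          # hypothetical input
my ($run) = $text =~ /(X+)/;
print "run:  $run\n";

# $ anchors the end of the match, but scanning still begins at the
# left, so the lazy dot is stretched from the first slash to the end.
my $path = '/foo/bar/baz';     # hypothetical input
my ($tail) = $path =~ m{/(.+?)$};
print "tail: $tail\n";
```

Here `$run` ends up as a single 'X', and `$tail` is 'foo/bar/baz' rather than 'baz'.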
Re^3: help with lazy matching ( .+? versus [^/]+? rxrx -Mre=debug ) by Anonymous Monk on Jan 05, 2015 at 22:37 UTC
Why does it not work that way? The regex metacharacter dot (`.`) matches any character (excluding newline by default, or including newline under `/s`). It starts to match after the first / is matched, and it can match every subsequent / as well. This is a FAQ, but one that is hard to search for :) Add `use re 'debug';` and watch it work, or use rxrx and watch it work.
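To illustrate the `.+?` versus `[^/]+?` contrast from the subject line, here is a hedged sketch. The input string and the trailing `/baz` in the patterns are assumptions made for illustration; they are not taken from the original question.

```perl
use strict;
use warnings;

my $path = '/foo/bar/baz/';    # assumed sample input

# The dot also matches '/', so even a lazy dot grows across slashes
# when that is the only way to succeed from the leftmost start.
my ($dot) = $path =~ m{/(.+?)/baz};

# A negated class cannot cross a slash, so the attempt at the first
# slash fails and the engine retries from a later starting position.
my ($class) = $path =~ m{/([^/]+?)/baz};

print "dot:   $dot\n";
print "class: $class\n";
```

With these assumptions `$dot` is 'foo/bar' and `$class` is 'bar'. Adding `use re 'debug';` at the top of the script prints the engine's steps for each match.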

