Re^2: Non-greedy regex behaves greedily

by linuxer (Curate)
on Jul 27, 2008 at 17:43 UTC ( [id://700396]=note )


in reply to Re: Non-greedy regex behaves greedily
in thread Non-greedy regex behaves greedily

Your regex doesn't force non-greedy behaviour.

I'll try to explain with a simplified text example:

my $text = <<TEXT;
000ABCDEFABCGHI
TEXT
if ( $text =~ m{(ABC.*?)$} ) {
    print $1, $/;
}

The engine scans $text from left to right and makes its first attempt at the first "ABC". From there, the non-greedy .*? starts out matching nothing and grows one character at a time until the $ anchor can match, which only happens at the end of the line. That is exactly what the regex asked for, so this result is returned. No condition forces the engine to look for a shorter result, and there is no second pass that checks whether the current result contains a shorter one.

The first valid match, counting from the left, is returned; this isn't always the shortest or the best match.
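
To make that concrete, here is a small sketch built on the same sample text. It assumes a plain string without the trailing newline of the heredoc, and the variable names are only illustrative. It suggests that the "?" changes where a match ends, not where it starts:

```perl
use strict;
use warnings;

my $text = '000ABCDEFABCGHI';

# Greedy: .* grabs as much as it can, then backtracks to the last "C".
my ($greedy) = $text =~ m{(A.*C)};

# Non-greedy: .*? grabs as little as it can and stops at the first "C".
my ($lazy) = $text =~ m{(A.*?C)};

# Anchored with $: the match still starts at the first "ABC", so .*?
# has to stretch all the way to the end of the string.
my ($anchored) = $text =~ m{(ABC.*?)$};

print "greedy:   $greedy\n";      # ABCDEFABC
print "lazy:     $lazy\n";        # ABC
print "anchored: $anchored\n";    # ABCDEFABCGHI
```

In all three cases the match begins at the leftmost position where any match is possible; only the end point moves.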

Re^3: Non-greedy regex behaves greedily
by kovacsbv (Novice) on Jul 27, 2008 at 23:36 UTC
    Ok, is there a nice detailed description of the engine that would fill in what causes a second run and what the "?" does exactly?

    This behavior isn't very intuitive.

    Also, is there another way to get the desired result other than the ugly hack I posted below?
      You can force the regex engine to start looking for </a> at the end of the string and work backwards: a leading greedy .* consumes the whole string first and then backtracks character by character until </a> matches:
      my $string = q!Back to STATES Menu</font></a></h3> <p align="center"><a href="index.htm"><img src="home2.gif" alt="Home" border="0" width="106" height="30"></a></p> </body> </html>!;
      if ( $string =~ m!^.*(</a>.*?)$! ) {
          print "got $1\n";
      }
      but there's usually a better way to get what you want done.
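
As one possible reading of "a better way", here are two hedged alternatives for grabbing everything from the last </a> to the end of a single-line string. The sample string is a shortened, made-up stand-in for the HTML in this thread, and the variable names are illustrative. The first uses a negative lookahead so that the captured text cannot contain a later </a>; the second skips the regex engine and uses rindex and substr:

```perl
use strict;
use warnings;

# Hypothetical, shortened sample; not the original poster's data.
my $string = '<a href="x">one</a> <a href="index.htm">two</a></p> </body> </html>';

# Regex variant: after "</a>", each further character is accepted only if
# another "</a>" does not start there, so only the last "</a>" can reach $.
# (m{...} is used because the lookahead contains "!".)
my ($from_regex) = $string =~ m{(</a>(?:(?!</a>).)*)$};

# Plain string variant: find the last occurrence and take the rest.
my $pos         = rindex $string, '</a>';
my $from_rindex = $pos >= 0 ? substr( $string, $pos ) : undef;

print "regex:  $from_regex\n";     # </a></p> </body> </html>
print "rindex: $from_rindex\n";    # </a></p> </body> </html>
```

For real HTML, a proper parser is usually the safer choice; the sketch only covers the literal "last </a> to end of string" case.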
