http://qs321.pair.com?node_id=263178


in reply to hepl with a regex problem

The first thing you should try is a "non-greedy" regular expression match. "Non-greedy" means "match as little as possible", and all you need to do is add a question mark after the quantifier:

/.*Volume<br>(.*?)<\/font><\/td>.*/
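To make the difference concrete, here is a minimal sketch. The input line is made up (I am assuming the real HTML has more than one closing </font></td> on the same line, which is when greediness bites); only the regex itself comes from above.

```perl
use strict;
use warnings;

# Hypothetical input: two table cells on one line, so there are
# two places where </font></td> could match.
my $html = '<td><font>Volume<br>1,234,567</font></td>'
         . '<td><font>Open<br>42.10</font></td>';

# Greedy: (.*) grabs as much as it can, then backs off only as far
# as the LAST </font></td> on the line.
my ($greedy) = $html =~ /.*Volume<br>(.*)<\/font><\/td>.*/;

# Non-greedy: (.*?) grows one character at a time and stops at the
# FIRST </font></td> after "Volume<br>".
my ($lazy) = $html =~ /.*Volume<br>(.*?)<\/font><\/td>.*/;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```

With this input the greedy capture runs on through the second cell and ends with "Open<br>42.10", while the non-greedy one holds just "1,234,567".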

Hope this helps

Update: reading Juerd's answer I am starting to think that possibly I have misinterpreted the question ;) So yes, if you are having problems extracting the match, the captured text is in $1 after a successful match. If the problem is just that something matches too much, a greedy regexp may be to blame.

As far as I can remember it took me days to understand greedy/non-greedy regexps when I started learning Perl ;)

Re: Re: hepl with a regex problem
by Nkuvu (Priest) on Jun 04, 2003 at 23:41 UTC

    I'm thinking that AnonyMonk was referring to a greedy match. At least, that's how I read it at first.

    And if you know that what you're looking for is a number alone, you can use something like m!Volume<br>([\d,]+)</font></td>! for the regex. This will capture numbers that may or may not have a comma in them. It won't match negative numbers, or numbers with decimal points, or fractions, etc. Note that the m at the beginning allows you to use another delimiter instead of /, so you don't have to escape the / characters in the closing HTML tags. Just another way to do it...
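A rough sketch of that approach, in case it helps. The sample lines are invented for illustration, and stripping the commas afterwards is an extra step of mine, not something the regex needs.

```perl
use strict;
use warnings;

# Hypothetical sample lines; only the first two hold plain
# (optionally comma-grouped) non-negative integers.
my @samples = (
    'Volume<br>1,234,567</font></td>',
    'Volume<br>890</font></td>',
    'Volume<br>-15</font></td>',    # negative: no match
    'Volume<br>3.5</font></td>',    # decimal point: no match
);

my @volumes;
for my $line (@samples) {
    # m!...! uses ! as the delimiter, so the / in </font> and </td>
    # does not need escaping.
    if ($line =~ m!Volume<br>([\d,]+)</font></td>!) {
        (my $n = $1) =~ tr/,//d;    # optional: drop the commas
        push @volumes, $n;
    }
}

print "@volumes\n";
```

Only the first two sample lines match, so @volumes ends up holding 1234567 and 890. Checking the match inside the if matters: $1 is only meaningful after a successful match.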