PerlMonks  

Re: Regexp riddles

by broquaint (Abbot)
on Jul 17, 2003 at 12:42 UTC ( [id://275200]=note )


in reply to Regexp to extract HTML link data

Under the blind assumption that your data won't change much or become 'faulty' (otherwise you'd be using a parser, right?), something like this ought to do:
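# Optional <img ... src="..."> immediately followed by <a ... href="...">.
# Capture 1 is the image source (undef when there is no <img>), capture 2 the link target.
my $re = qr{
  (?: <img \s+ .*? src=" ([^"]+) " .*? > )?
  <a \s+ .*? href=" ([^"]+) " .*? >
}x;

my $in = '<td><img src="foo.jpg"><a href="index3.html">New index</a></td>';
# reverse puts the href first; grep drops the undef left by a missing <img>
my($href, $img) = grep defined, reverse $in =~ $re;
print "href - $href\nimg - $img\n";

$in = '<td><a href="index3.html">New index</a></td>';
($href, $img) = grep defined, reverse $in =~ $re;
print "href - $href\nimg - $img\n";

__output__
href - index3.html
img - foo.jpg
href - index3.html
img -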
See perlre for more info.
HTH

_________
broquaint

Replies are listed 'Best First'.
Re: Re: Regexp riddles
by hatter (Pilgrim) on Jul 17, 2003 at 14:06 UTC
    Thanks, that looks like the ticket. And your assumptions are correct - HTML parsers, um, no thank you. The input happens to be HTML, but it's very simple, fairly fixed format, and the problem could just as easily be expressed without HTML tags. And I'm hoping to wrap it all up in a map() (lots of data to iterate over) so it's much neater.

    Now, off to spend more time staring hard at the solution until its lessons burn themselves deep into my brain.

    thanks

    the hatter
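
The map() idea mentioned in the reply above might look something like the following. This is only a rough sketch, not what either poster actually ran: the @rows data is made up for illustration, the hash layout (href/img keys) is an assumption, and the defined-or operator (//) needs Perl 5.10 or later. The regexp itself is the one from the parent node.

```perl
use strict;
use warnings;

# Pattern from the parent node: optional <img ... src="..."> followed by <a ... href="...">.
my $re = qr{
    (?: <img \s+ .*? src=" ([^"]+) " .*? > )?
    <a \s+ .*? href=" ([^"]+) " .*? >
}x;

# Hypothetical sample rows, invented for this example.
my @rows = (
    '<td><img src="foo.jpg"><a href="index3.html">New index</a></td>',
    '<td><a href="index4.html">Other index</a></td>',
    '<td>no link here</td>',
);

# One hash per matching row; rows without a link contribute nothing to the list.
my @links = map {
    my ($img, $href) = /$re/;    # matches against $_; $img is undef without an <img>
    defined $href ? +{ href => $href, img => $img } : ();
} @rows;

for my $link (@links) {
    printf "href - %s\nimg - %s\n", $link->{href}, $link->{img} // '(none)';
}
```

Returning an empty list from the map block for non-matching rows keeps the result free of placeholders, so the loop only ever sees rows that had a link. The usual caveat from the parent node still applies: this holds up only as long as the input stays in the simple, fixed format described.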
