Re: Isn't /m for multiline regex?

by afoken (Chancellor)
on Apr 19, 2010 at 11:39 UTC


in reply to Isn't /m for multiline regex?

what am I doing wrong here?

You are trying to parse HTML using regular expressions. That simply can't work, due to the way HTML is defined. Use an HTML parser; a CPAN search will list several.

Alexander

--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

Re^2: Isn't /m for multiline regex?
by ww (Archbishop) on Apr 19, 2010 at 12:45 UTC
    The parent node overreaches.

    While it's true that it's not generally a good idea to try to parse HTML with regexen, "(t)hat simply can't work" is not.

    It can be done... and often is for simple cases... but is fraught with so many difficulties that it's inadvisable. What's more, trying to parse html of any complexity with tools other than the well-tested modules referenced above flies in the face of the mantra 'don't re-invent the wheel.'

      from perlfaq6

      Here's code that finds everything between START and END in a paragraph:
      undef $/;    # read in whole file, not just one line or paragraph
      while ( <> ) {
          while ( /START(.*?)END/sgm ) {
              print "$1\n";
          }
      }
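As for the question in the thread title, here is a minimal sketch of what the two modifiers do in the perlfaq6 pattern. The sample string and variable names are made up for illustration. /s is what lets "." match a newline, while /m only changes where ^ and $ can match.

```perl
use strict;
use warnings;

# Hypothetical sample input: the first START/END pair spans two lines.
my $text = "START one\ntwo END\nSTART three END\n";

# Without /s, "." never matches a newline, so only the pair that
# sits on a single line is found.
my @without_s = $text =~ /START(.*?)END/g;

# With /s, "." also matches newlines, so both pairs are found.
my @with_s = $text =~ /START(.*?)END/sg;

# /m affects only the anchors: ^ now matches at the start of every line.
my @line_starts = $text =~ /^(\w+)/mg;

print scalar(@without_s), "\n";
print scalar(@with_s), "\n";
print "@line_starts\n";
```

So for a match that has to cross line boundaries, /s is the modifier doing the work. The /m in the FAQ snippet makes no difference there, because that pattern contains no ^ or $.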
