Re: Isn't /m for multiline regex?


PerlMonks

by afoken (Chancellor)

on Apr 19, 2010 at 11:39 UTC ( [id://835462] )

in reply to Isn't /m for multiline regex?

what am I doing wrong here?

You are trying to parse HTML using regular expressions. That simply can't work, due to the way HTML is defined. Use an HTML parser; a CPAN search will list several.

Alexander

--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

Re^2: Isn't /m for multiline regex? by ww (Archbishop) on Apr 19, 2010 at 12:45 UTC
The parent node overreaches. While it's true that it's generally not a good idea to try to parse HTML with regexen, "(t)hat simply can't work" is not. It can be done, and often is for simple cases, but it is fraught with so many difficulties that it's inadvisable. What's more, trying to parse HTML of any complexity with tools other than the well-tested modules referenced above flies in the face of the mantra "don't re-invent the wheel."
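
For illustration, here is a minimal sketch of the parser route. The thread does not name a module, so HTML::TokeParser (part of the HTML-Parser distribution on CPAN) is only one possible choice, and the sample markup and variable names are made up for the example. It pulls links out of HTML where the tag is split across lines and the attribute case and quoting vary, which is exactly where a hand-written regex tends to break.

```perl
use strict;
use warnings;
use HTML::TokeParser;   # assumption: HTML-Parser distribution is installed from CPAN

# Hypothetical sample input: multi-line tag, mixed case, mixed quoting.
my $html = <<'HTML';
<p>See <a
   href="http://example.com/one">first
   link</a> and <A HREF='http://example.com/two'>second</A>.</p>
HTML

my $parser = HTML::TokeParser->new(\$html)
    or die "cannot create parser";

my @links;
# get_tag('a') skips ahead to the next <a ...> start tag; undef at end of input.
while ( my $tag = $parser->get_tag('a') ) {
    my $href = $tag->[1]{href};                   # attribute names are lowercased
    next unless defined $href;
    my $text = $parser->get_trimmed_text('/a');   # text up to </a>, whitespace collapsed
    push @links, [ $href, $text ];
}

print "$_->[0] => $_->[1]\n" for @links;
```

The parser takes care of tag case, quoting style and line breaks, so the loop only has to say what it wants: each `a` tag, its `href`, and the text up to the closing tag.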
Re^3: Isn't /m for multiline regex? by GertMT (Hermit) on Apr 20, 2010 at 07:18 UTC
From perlfaq6: Here's code that finds everything between START and END in a paragraph:

    undef $/;  # read in whole file, not just one line or paragraph
    while ( <> ) {
        while ( /START(.*?)END/sgm ) {
            print "$1\n";
        }
    }
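
On the question in the thread title: in the perlfaq6 pattern it is /s, not /m, that lets the match span lines. /m only changes where ^ and $ may match, while /s lets . match a newline. A small sketch with a made-up test string (the variable names are illustrative only) shows the difference:

```perl
use strict;
use warnings;

# Hypothetical input: one START..END block spans two lines, one fits on a line.
my $text = "START one\ntwo END\nSTART three END\n";

# Without /s, "." stops at a newline, so only the single-line block matches.
my @no_s = $text =~ /START(.*?)END/g;

# With /s, "." also matches "\n", so the two-line block is found as well.
my @with_s = $text =~ /START(.*?)END/sg;

# /m only affects the anchors: ^ now matches at the start of every line.
my @line_starts = $text =~ /^(\w+)/mg;

print scalar(@no_s),   " match(es) without /s\n";
print scalar(@with_s), " match(es) with /s\n";
print "line starts: @line_starts\n";
```

Since the perlfaq6 pattern contains neither ^ nor $, the /m there has no effect on what it matches; the work is done by /s together with slurping the whole file via `undef $/`.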
