http://qs321.pair.com?node_id=42367


in reply to HTML Matching

A regexp will probably not get this right. Your regexp will fail on this example, where a quoted attribute value itself contains a ">":

<input type="text" value=">">
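To make the failure concrete, here is a minimal sketch. The substitution below is only a hypothetical stand-in for a typical tag-stripping regexp (the regexp from the original question is not reproduced here), but any pattern that treats the first ">" as the end of the tag behaves the same way:

```perl
use strict;
use warnings;

# Hypothetical naive tag stripper, assumed for illustration only.
my $html = '<input type="text" value=">">';
(my $stripped = $html) =~ s/<[^>]*>//g;

# The match ends at the first ">", which sits inside the quoted value,
# so the tail of the tag is left behind as if it were text.
print "left over: [$stripped]\n";
```

The two characters "> survive the substitution, even though the input contained nothing but a single tag.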

Why not just use HTML::Parser? That would be the correct way of doing it. And it is fast too, both to write and to run. Just subclass HTML::Parser and override the text method, like this:

    package MyParser;
    use base 'HTML::Parser';

    sub text {
        my($self, $origtext, $is_cdata) = @_;
        print $origtext;
    }
The above code was just copied and pasted from the HTML::Parser pod file.
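
For completeness, a rough sketch of how such a subclass might be driven, assuming HTML::Parser is installed. The sample markup and the `collected` hash key are made up for this illustration. The text is collected instead of printed so it can be inspected afterwards:

```perl
use strict;
use warnings;

package MyParser;
use base 'HTML::Parser';

# Called by the parser for each chunk of plain text between tags.
sub text {
    my ($self, $origtext, $is_cdata) = @_;
    push @{ $self->{collected} }, $origtext;   # 'collected' is a hypothetical key
}

package main;

my $p = MyParser->new;
$p->parse('<p>Name: <input type="text" value=">"> done</p>');
$p->eof;    # signal end of input so any buffered text is flushed

# Text may arrive in several chunks, so join them before use.
my $text = join '', @{ $p->{collected} || [] };
print "$text\n";
```

The quoted ">" stays part of the input tag, so it should not show up in the collected text.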

Autark.