
A regexp will probably not do it right. Your regexp will fail on this example:

Why not just use HTML::Parser? That would be the correct way to do it, and it is fast too, both to write and to run. Just subclass HTML::Parser and override the text method, like this:

package MyParser;
use base 'HTML::Parser';

# Called by the parser for each chunk of plain text between tags.
sub text {
    my ($self, $origtext, $is_cdata) = @_;
    print $origtext;
}

The above code was just copied and pasted from the HTML::Parser pod file.
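
To actually run the subclass you still have to feed it some HTML. Here is a minimal sketch of how that might look, assuming HTML::Parser is installed from CPAN and that the parser is constructed with no arguments, so it falls back to calling the text method as in the pod example. The sample HTML string and the _my_text hash key are made up for illustration; this variant collects the text instead of printing it straight away.

```perl
package MyParser;
use strict;
use warnings;
use base 'HTML::Parser';

# Collect each chunk of plain text instead of printing it immediately.
# '_my_text' is a hypothetical key chosen for this sketch.
sub text {
    my ($self, $origtext, $is_cdata) = @_;
    push @{ $self->{_my_text} }, $origtext;
}

package main;

# No constructor arguments, so the text() method above is used as the callback.
my $p = MyParser->new;

# Made-up sample input; parse() can be called repeatedly with more chunks.
$p->parse('<p>Hello, <b>world</b>!</p>');
$p->eof;    # signal end of input so any buffered text is delivered

# Join the collected chunks; fall back to an empty list if no text was seen.
print join('', @{ $p->{_my_text} || [] }), "\n";
```

Note that the text arrives in several chunks, one per run of text between tags, so anything that needs the whole string has to join them as above. For a real file, parse_file can be used in place of parse and eof.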

Autark.

In reply to Re: HTML Matching by autark
in thread HTML Matching by spaz



	PerlMonks