Hi,
You do not want to use anything with "parser" in the name. Use a DOM module such as Mojo::DOM instead:
#!/usr/bin/perl --
use strict;
use warnings;
use Mojo::DOM;
my $html = q{
<div class="post reply body-not-empty" id="reply_8735435">
(cut out for visibility)
<p class="body-line ltr ">The first 3 lines were 15% bait power, but then it fell to mere 5% and the last lines are literally 0%, try again in a few days.</p>
</div>
(cut out for visibility)
<div class="post reply body-not-empty" id="reply_8735439">
(cut out for visibility)
<div class="body" >
<p class="body-line ltr ">
<a onclick="highlightReply('8735417', event);" href="/b/res/8735417.html#8735417">>>8735417</a>
</p>
<p class="body-line ltr quote">>Reddit is a great place for discourse and there are many active subreddits where field professionals regularly answer questions on issues of health, science, engineering, etc</p>
<p class="body-line ltr ">Yeah, as far as content goes, Reddit kicks 8chan's ass. They have some great boards for serious academic discussion.</p>
<p class="body-line empty">
};
my $dom = Mojo::DOM->new( $html );
for my $e ( $dom->find( 'div.reply' )->each ) {
    print $e->{id}, "\n", $e->text, "\n\n";
}
__END__
reply_8735435
(cut out for visibility)
reply_8735439
(cut out for visibility)