Observe the following code:
#!/usr/bin/perl
use strict;
use HTML::LinkExtor;
my $html = <<EOF
<p>
<a href="http://example1.com/">
Example 1
</a>
</p>
<p>
<a href="http://example2.com/">
Example 2
</a>
</p>
<p>
<a href="http://example3.com/">
Example 3
</a>
</p>
EOF
;
# Print them out
print "Here's all links:\n";
my @return = find_my_links($html);
print join("\n", @return), "\n";
print "Here's all links #2:\n";
my @return2 = find_my_links($html);
print join("\n", @return2), "\n";
sub find_my_links {
    my $str = shift;
    my @links = ();
    sub callback {
        my($tag, %attr) = @_;
        return if $tag ne 'a'; # we only look closer at <a ...>
        my $link = $attr{href};
        push(@links, $link);
    }
    my $p = HTML::LinkExtor->new(\&callback);
    $p->parse($str);
    undef $p;
    undef $str;
    return @links;
}
In this barebones example, I want to grab the links found in $str, and I encapsulate all my stuff in the find_my_links() subroutine.
Easy enough, and it works! Once. If I call find_my_links() twice (and so run parse() a second time), the second call returns an empty list. Why?
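
A likely explanation, offered as a sketch rather than a definitive diagnosis: a named sub declared inside another sub is compiled only once, so callback() stays bound to the @links that existed during the first call to find_my_links(). On later calls, my @links creates a new array, but callback() keeps pushing onto the old one. With use warnings enabled, Perl reports this as Variable "@links" will not stay shared. Assuming that is the cause, one common fix is to use an anonymous sub, which creates a fresh closure on every call. The shortened $html and the names $callback, @first and @second below are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::LinkExtor;

sub find_my_links {
    my $str = shift;
    my @links;

    # Anonymous sub: a new closure is built on each call to
    # find_my_links(), so it captures this call's @links.
    my $callback = sub {
        my ($tag, %attr) = @_;
        return if $tag ne 'a';    # we only look closer at <a ...>
        push @links, $attr{href} if defined $attr{href};
    };

    my $p = HTML::LinkExtor->new($callback);
    $p->parse($str);
    $p->eof;                      # signal end of document to the parser
    return @links;
}

# Shortened sample input (illustrative, not the original heredoc)
my $html = '<p><a href="http://example1.com/">Example 1</a></p>'
         . '<p><a href="http://example2.com/">Example 2</a></p>';

my @first  = find_my_links($html);
my @second = find_my_links($html);
print "first:  @first\n";
print "second: @second\n";
```

With this change, both calls should return the same list of links, because nothing is shared between invocations.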