You might find reading the module, as well as its documentation, saves typing ;)
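use strict;
use warnings;
use HTML::LinkExtor;

my @links;
my $html = join '', <DATA>; # much more elegant than => do { local $/; <DATA> };

# Callback: receives the lowercased tag name, then link attribute name/value pairs.
sub extract_links {
    my ($tag, undef, $url) = @_;
    # With a base URL, $url is a URI object; schemes like mailto: have no host().
    if ($tag eq 'a' and $url->can('host')) {
        push @links, $url->host;
    }
}

# The second argument is the base URL used to absolutize relative links.
my $p = HTML::LinkExtor->new(\&extract_links, 'http://foobar.com');
$p->parse($html);
$p->eof; # signal end of document so the parser flushes its buffer
print join("\n", @links), "\n";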
__DATA__
<a href="http://www.foo.com">description</a>
<a href='http://www.foo.com'>image here</a>
<A href='http://foo-bar-publishers.co.uk'>image here</a>
Also, this "foo.com" request is rather silly, considering all the odd naming conventions out there (city.county.state.us, ...): there is no fixed number of labels you can chop a host name down to.
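To see why, here is a rough sketch of the obvious approach: keep only the last two labels of the host. The helper name naive_domain is made up for illustration and is not part of HTML::LinkExtor or URI; the hosts are the ones already mentioned in this post.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any module): keep the last two
# dot-separated labels of a host name and throw the rest away.
sub naive_domain {
    my ($host) = @_;
    my @labels = split /\./, lc $host;
    return join '.', @labels > 2 ? @labels[-2, -1] : @labels;
}

for my $host (qw(www.foo.com foo-bar-publishers.co.uk city.county.state.us)) {
    printf "%-26s => %s\n", $host, naive_domain($host);
}
```

That gives foo.com for the first host, but co.uk and state.us for the other two, neither of which identifies the site. Getting it right would need a list of registry suffixes rather than a label count.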
Update: no need for a patch, it's already in there (at least in $VERSION = sprintf("%d.%02d", q$Revision: 1.31 $ =~ /(\d+)\.(\d+)/);).