Others have mentioned how to find URLs, but you may also want to double-check that the URLs you find actually respond, using a module like LWP::Simple or LWP::UserAgent. For instance, given the URLs in @urls, you might do something like:
# assuming you've already populated @urls
# and done:
use LWP::UserAgent;
use strict;
use warnings;
# try this:
my @old_urls = @urls;
@urls = ();
my $user_agent = LWP::UserAgent->new;
while (@old_urls) {
    my $url = shift(@old_urls);
    my $response = $user_agent->get($url);
    if ($response->is_success) {
        push @urls, $url;
        # or, if you want to get more detailed:
        # push @urls, {
        #     url  => $url,
        #     type => $response->content_type,
        # };
    }
}
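If you only care whether each URL responds, a HEAD request avoids downloading the whole body. Here is a minimal sketch of that variation, not a definitive implementation: live_urls is a hypothetical helper name of my own, and the 10-second timeout is an arbitrary choice. Some servers refuse HEAD requests, so you may want to fall back to get() for URLs that fail here.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper (not part of LWP): returns only the URLs whose
# HEAD request comes back with a successful (2xx) status.
sub live_urls {
    my ($ua, @candidates) = @_;
    return grep { $ua->head($_)->is_success } @candidates;
}

# The timeout value is an assumption; tune it for your situation.
my $user_agent = LWP::UserAgent->new(timeout => 10);

# Check any URLs passed on the command line and print the ones that respond.
print "$_\n" for live_urls($user_agent, @ARGV);
```

You could then replace the loop above with something like @urls = live_urls($user_agent, @urls);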