What you'll probably want to do is walk through the bulleted list and, for each bullet:
- Pull off the first line of text (the name)
- Then take the URL from the link(s) that come after that bullet but before the next one.
This may be difficult to do with TokeParser, since the generated page doesn't close its list-element (<li>) tags and I don't know what TokeParser can or can't handle. If it doesn't work, then, as unwise as it usually is to advocate this, you are working with a known format, so it would be possible to parse this page with regular expressions:
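# Assumes $document holds the page source and OUTPUT is an already-open filehandle.
my @document = split /\n/, $document;
my $entry = '';
foreach my $line ( @document ) {
    # A bullet line starts a new entry; remember its name.
    if ( $line =~ m|^<li>(.*?)</strong>| ) {
        $entry = $1;
        next;
    }
    # Any link found before the next bullet belongs to the current entry.
    if ( $line =~ m|<a href=(.*?)>(.*?)</a>| ) {
        # Copy both captures first: a successful s/// below resets $1 and $2.
        my ( $url, $text ) = ( $1, $2 );
        $url =~ s/CMD=TABLES/CMD=RET/;
        if ( $text eq "STF1A" || $text eq "STF3A" ) {
            print OUTPUT "<a href=$url/FMT=HTML/T=P1>$entry $text</a><br />\n";
        }
    }
}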
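
For comparison, here is a rough sketch of the same walk done with HTML::TokeParser instead of regular expressions. As far as I know, TokeParser hands back a flat stream of tokens rather than building a tree, so the unclosed <li> tags may not be a problem after all. The sample markup, the example.com URLs, the state names and the extract_links helper are all made up for illustration; I'm also assuming, as the regex above does, that each name ends at a </strong> tag. Adjust to whatever the real page looks like.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample standing in for the real page; the actual markup may differ.
my $document = <<'HTML';
<ul>
<li><strong>Alabama</strong>
<a href="http://example.com/CMD=TABLES/X=1">STF1A</a>
<a href="http://example.com/CMD=TABLES/X=2">STF1B</a>
<li><strong>Alaska</strong>
<a href="http://example.com/CMD=TABLES/X=3">STF3A</a>
</ul>
HTML

# Hypothetical helper: returns one output line per wanted link.
sub extract_links {
    my ($html) = @_;
    my $p = HTML::TokeParser->new( \$html ) or die "Cannot parse document";
    my $entry = '';
    my @out;

    # With several tag names, get_tag skips ahead to whichever comes next.
    while ( my $tag = $p->get_tag( 'li', 'a' ) ) {
        if ( $tag->[0] eq 'li' ) {
            # Assumption: the bullet's name is the text up to </strong>.
            $entry = $p->get_trimmed_text('/strong');
            next;
        }

        # For a start tag, element 1 is a hash of its attributes.
        my $url = $tag->[1]{href};
        next unless defined $url;

        my $text = $p->get_trimmed_text('/a');
        next unless $text eq 'STF1A' || $text eq 'STF3A';

        $url =~ s/CMD=TABLES/CMD=RET/;
        push @out, "<a href=$url/FMT=HTML/T=P1>$entry $text</a><br />\n";
    }
    return @out;
}

print extract_links($document);
```

With the sample above this should print two links, one for "Alabama STF1A" and one for "Alaska STF3A", and skip the STF1B link. Since it doesn't depend on where the line breaks fall, it should also cope better than the line-by-line regex version if the page layout shifts a little.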