PerlMonks  

Re: Retrieve, modify, & display web page

by AidanLee (Chaplain)
on Jan 03, 2002 at 19:49 UTC


in reply to Retrieve, modify, & display web page

What you'll probably want to do is walk through the bulleted list and, for each bullet:

  1. Pull off the first line of text (the name).
  2. Then get the URL from the link(s?) that come after that bullet but before the next one.

It may be difficult to do with TokeParser, since the generated page doesn't close its list-element (<li>) tags and I don't know what TokeParser can or can't handle. If that doesn't work, then, as unwise as it usually is to advocate it, since you're working with a "known format" it would be possible to parse this page with regular expressions:

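    my @document = split /\n/, $document;
    my $entry = '';
    foreach my $line ( @document ) {
        # A bullet line: remember the entry name for the links that follow.
        $line =~ m|^<li>(.*?)</strong>| and do { $entry = $1; next };
        # A link line: copy both captures before the s/// below resets them.
        $line =~ m|<a href=(.*?)>(.*?)</a>| and do {
            my ( $url, $text ) = ( $1, $2 );
            $url =~ s/CMD=TABLES/CMD=RET/;
            if ( $text eq "STF1A" || $text eq "STF3A" ) {
                # OUTPUT is assumed to be a filehandle you have already opened.
                print OUTPUT "<a href=$url/FMT=HTML/T=P1>$entry $text</a><br />\n";
            }
            next;
        };
    }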
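To try the idea without the real page, here is a minimal self-contained sketch of the same loop wrapped in a sub. The sub name extract_links, the sample HTML, and its URLs are all made up for illustration; I'm only guessing at the shape of the real page (one tag per line, a <strong> name right after each <li>, unquoted href values). The optional (?:<strong>)? in the first pattern is my addition, so the opening tag doesn't end up in the captured name.

```perl
use strict;
use warnings;

# Hypothetical helper: returns one rewritten anchor per wanted link text.
sub extract_links {
    my ($document, @wanted) = @_;
    my %wanted = map { $_ => 1 } @wanted;
    my ($entry, @out) = ('');
    foreach my $line (split /\n/, $document) {
        # A bullet starts a new entry; remember its name.
        if ($line =~ m|^<li>(?:<strong>)?(.*?)</strong>|) {
            $entry = $1;
            next;
        }
        # A link belongs to the most recently seen bullet.
        if ($line =~ m|<a href=(.*?)>(.*?)</a>|) {
            my ($url, $text) = ($1, $2);   # copy captures before the s///
            next unless $wanted{$text};
            $url =~ s/CMD=TABLES/CMD=RET/;
            push @out, "<a href=$url/FMT=HTML/T=P1>$entry $text</a><br />";
        }
    }
    return @out;
}

# Made-up sample input, only shaped like the page described above.
my $sample = join "\n",
    '<li><strong>Alabama</strong>',
    '<a href=/cgi-bin/x/CMD=TABLES/F=1>STF1A</a>',
    '<a href=/cgi-bin/x/CMD=TABLES/F=2>STF1B</a>',
    '<li><strong>Alaska</strong>',
    '<a href=/cgi-bin/x/CMD=TABLES/F=3>STF3A</a>';

print "$_\n" for extract_links($sample, qw(STF1A STF3A));
```

Returning a list instead of printing straight to OUTPUT makes it easy to check the result against a few hand-written lines before pointing it at the real document. If the real page puts several tags on one line, the line-by-line matching will need rethinking.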
