
Re: about retrieving and parsing html without writing on disk

by learnedbyerror (Monk)
on Apr 15, 2018 at 19:03 UTC


in reply to about retrieving and parsing html without writing on disk

The short answer is yes, you can. I don't use the exact parsing utilities that you are using, but I routinely use WWW::Mechanize to fetch a page and then parse the content directly from memory.

Something like the following should work for you. NOTE: I did not test this exact code.

use HTML::TableExtract;
use WWW::Mechanize;

my $user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 6.1; nl; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13';
my $url = 'http://example.com/';    # placeholder (not in the original post): the page to fetch

my $mech = WWW::Mechanize->new( autocheck => 0, agent => $user_agent );
$mech->get($url);    # fetch the page into memory; nothing is written to disk

if ( $mech->success ) {
    my $html_string = $mech->content;
    my $headers = ['col1', 'col2', 'col3', 'col4', 'col5'];
    my $te = HTML::TableExtract->new( headers => $headers );
    my @tables = $te->parse($html_string)->tables;
}
...
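To walk the rows once the tables are parsed, something along these lines may help. It is only a sketch: the helper name extract_rows, the two-column sample table, and the header names are made up for illustration, and the heredoc stands in for whatever $mech->content would return.

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical helper (not from the original post): returns the data rows
# of every table whose header row matches the given column names.
sub extract_rows {
    my ($html, @headers) = @_;
    my $te = HTML::TableExtract->new( headers => \@headers );
    $te->parse($html);    # parse straight from the in-memory string
    $te->eof;             # tell the parser the document is complete
    return map { $_->rows } $te->tables;
}

# Stand-in for the string returned by $mech->content; no file is involved.
my $html = <<'HTML';
<table>
  <tr><th>col1</th><th>col2</th></tr>
  <tr><td>a</td><td>b</td></tr>
  <tr><td>c</td><td>d</td></tr>
</table>
HTML

# Each row is an array reference holding one entry per matched column.
for my $row ( extract_rows($html, 'col1', 'col2') ) {
    print join(', ', map { defined $_ ? $_ : '' } @$row), "\n";
}
```

By default the matched header row itself is not returned, so only the data rows are printed.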

lbe
