PerlMonks  

Re: about retrieving and parsing html without writing on disk

by learnedbyerror (Monk)
on Apr 15, 2018 at 19:03 UTC (#1212944)


in reply to about retrieving and parsing html without writing on disk

The short answer is yes, you can. I don't use the exact parsing utilities that you are using, but I routinely use WWW::Mechanize to retrieve pages and parse the content without writing it to disk.

Something like the following should work for you. NOTE: I did not test this exact code.

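    use strict;
    use warnings;
    use HTML::TableExtract;
    use WWW::Mechanize;

    my $user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 6.1; nl; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13';
    my $mech = WWW::Mechanize->new( autocheck => 0, agent => $user_agent );

    # Placeholder URL: substitute the page you actually want to retrieve.
    my $url = 'http://www.example.com/';
    $mech->get($url);

    if ( $mech->success ) {
        # The page content is held in memory; nothing is written to disk.
        my $html_string = $mech->content;
        my $headers = ['col1', 'col2', 'col3', 'col4', 'col5'];
        my $te = HTML::TableExtract->new( headers => $headers );
        $te->parse($html_string);
        my @tables = $te->tables;
    }
    ...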
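Once the tables have been extracted, you will probably want to walk their rows. Here is a minimal sketch of that step, assuming the HTML is already in a string. The sample markup and the two column names are made up for illustration; in practice the string would come from $mech->content and the headers would match your page.

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical sample page held entirely in memory (stands in for $mech->content).
my $html_string = <<'HTML';
<html><body>
<table>
  <tr><th>col1</th><th>col2</th></tr>
  <tr><td>a</td><td>1</td></tr>
  <tr><td>b</td><td>2</td></tr>
</table>
</body></html>
HTML

# Match only tables whose header row contains these column names.
my $te = HTML::TableExtract->new( headers => [ 'col1', 'col2' ] );
$te->parse($html_string);
$te->eof;    # tell the parser there is no more input

# Collect and print every data row from every matching table.
my @rows;
for my $table ( $te->tables ) {
    for my $row ( $table->rows ) {
        push @rows, [ @$row ];
        print join( ', ', map { defined $_ ? $_ : '' } @$row ), "\n";
    }
}
```

Each row is an array reference holding the cell text for the requested columns, so the rest of your processing can work on @rows without touching the disk.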

lbe
