Re: HTML::TableExtract - ugly - is there better way?

by NetWallah (Canon)
on Apr 09, 2017 at 07:27 UTC ( [id://1187511]=note )


in reply to HTML::TableExtract - ugly - is there better way?

Try this:
use strict;
use warnings;
use HTML::TableExtract;

#Get HTML file and set up headers for HTML::TableExtract
my $doc = 'nasdaq-stocks.txt';
my $html = do{ local $/=undef; open my $f,"<", $doc or die $!;<$f>};
my $headers = ['Symbol', 'Last Sale*', 'Change Net / %', 'Share Volume'];

#table 4 is advances. Need to do again for 5 decliners
my $table_extract = HTML::TableExtract->new(count => 4, headers => $headers);
$table_extract->parse($html);
print join (" \t",@$headers),"\n";
for my $r ($table_extract->rows()){
    my @cols = map {/([\w\.]+)\W+([\w\.\%]*)/} @$r;
    print join ("\t",@cols), "\n";
}
It would take a little work to put the "$" back in front of the "Last Sale*" amount, but this should get you started.

        ...it is unhealthy to remain near things that are in the process of blowing up.     man page for WARP, by Larry Wall

Re^2: HTML::TableExtract - ugly - is there better way?
by rtwolfe (Initiate) on Apr 10, 2017 at 03:31 UTC
    Thanks, NetWallah. I had not heard of map before, and it is still a little confusing. I'm not sure what @$r 'is'; I assume it is a parsed version of @cols. I need to decode your regex bit by bit. But thanks so much for the fast response.
      This site loves to explain code .. so keep those questions coming.

      @$r is the same as @{ $r } .
      $r is an array reference; wrapping it in @{ } dereferences it, giving the underlying array, which can then be iterated.
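
      A minimal sketch of that dereferencing, in case it helps. The row contents here are made up for illustration; they only stand in for whatever rows() returns for one table row.

```perl
use strict;
use warnings;

# Hypothetical row: an array reference holding one table row's cells.
my $r = [ 'AAPL', '$ 143.34', '0.32 0.22%', '16,621,311' ];

my @cells = @$r;          # same as @{ $r }: copies the referenced array
my $count = scalar @$r;   # number of cells in the row
my $first = $r->[0];      # a single element, reached via the arrow

# The dereferenced array can be iterated like any other array.
for my $cell (@$r) {
    print "$cell\n";
}
```

      Note that @cells is a copy, so changing it leaves the array behind $r alone.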

      'map' applies a "transformation" to each element of the array and returns a new list of the results.

      The "transformation" here is a regular-expression match: it captures runs of "word" characters (\w) and decimal points (\.), plus percent signs (\%) in the second group. See perlre.

      The result of this map is stored in the array @cols, which is later printed.
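
      To see what the map does on its own, here is a small sketch using the same regex on a few made-up cell values (the sample strings are assumptions, not taken from the real NASDAQ page). In list context a successful match returns its captured groups, and a failed match returns an empty list, so the result need not have the same number of elements as the input.

```perl
use strict;
use warnings;

# Hypothetical cell values, chosen to show three different outcomes.
my @row = ( "0.32 0.22%", "AAPL\n", "AAPL" );

# Each match runs in list context, so map collects the captures ($1, $2).
my @cols = map { /([\w\.]+)\W+([\w\.\%]*)/ } @row;

# "0.32 0.22%" gives two captures: "0.32" and "0.22%".
# "AAPL\n"     gives "AAPL" and an empty second capture.
# "AAPL"       has no non-word character, so the match fails
#              and contributes nothing to @cols.
print scalar(@cols), " elements\n";
print join("|", @cols), "\n";
```

      So a cell with no trailing whitespace or punctuation is dropped entirely, which is worth checking against the real data before relying on the column positions.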

      Hope this helps. If still unclear .. experiment, and come back with more specific questions.

