
Re^3: Parse html file

by tangent (Vicar)
on Sep 24, 2018 at 18:01 UTC


in reply to Re^2: Parse html file
in thread Parse html file

I see you have HTML::TreeBuilder installed. This is one way you can use that:

Update: changed slightly so that divs missing the inner label or input div are skipped instead of raising an error.

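use HTML::TreeBuilder;

# $file holds the path to the HTML file to parse
my $tree = HTML::TreeBuilder->new;
$tree->parse_file($file);
$tree->eof;

# Collect every div whose class attribute is 'formline'
my @divs = $tree->find_by_attribute('class','formline');
for my $div (@divs) {
    # Skip any formline div that lacks the inner label or input div
    my $label_div = $div->look_down('class','formlabel') or next;
    my $label = $label_div->as_text;
    my $input_div = $div->look_down('class','forminput') or next;
    my $input = $input_div->as_text;
    print "$label $input\n";
}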

Re^4: Parse html file
by TonyNY (Beadle) on Sep 24, 2018 at 19:08 UTC
    Thanks tangent, your solution pulled the data.

    Any idea what this error is referring to?

    Can't call method "as_text" on an undefined value

      There are probably some divs with class 'formline' but without the interior divs - I have updated the code above to deal with that.
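      To make the cause concrete, here is a minimal sketch of the failure and the guard. The sample markup and the helper name extract_pairs are made up for illustration and are not taken from the original file. In scalar context look_down returns undef when nothing matches, so calling as_text on that result is what triggers the error; the "or next" skips such divs.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample markup: the second 'formline' div has no inner divs.
my $html = <<'HTML';
<div class="formline">
  <div class="formlabel">Size of results queue:</div>
  <div class="forminput">0.0%</div>
</div>
<div class="formline">spacer</div>
HTML

# extract_pairs is an illustrative helper, not part of HTML::TreeBuilder.
sub extract_pairs {
    my ($markup) = @_;
    my $tree = HTML::TreeBuilder->new;
    $tree->parse_content($markup);    # parses the string and signals end of input

    my @pairs;
    for my $div ( $tree->find_by_attribute( 'class', 'formline' ) ) {
        # look_down yields undef here when the inner div is missing,
        # so skip the div rather than call as_text on undef.
        my $label_div = $div->look_down( 'class', 'formlabel' ) or next;
        my $input_div = $div->look_down( 'class', 'forminput' ) or next;
        push @pairs, [ $label_div->as_text, $input_div->as_text ];
    }
    $tree->delete;                    # release the parse tree
    return @pairs;
}

print "$_->[0] $_->[1]\n" for extract_pairs($html);
```

      Without the two "or next" guards, the second div in this sample would be expected to die with the same "Can't call method" message.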

        Hi tangent,

        Could this code be modified to extract this kind of output?

        hr
        div Relay Status Information
        br
        div div FillDB File Size Limit: div 0.0% ( 0 / 3145728 Bytes )
        div div FillDB File Count Limit: div 0.0% ( 0 / 10000 Files )
        div div Timeout for queries in queue: div 60 minutes
        div div Size of queries in queue: div 0.0% ( 0 / 104857600 Bytes )
        div div Size of results queue: div 0.0% ( 0 / 104857600 Bytes )
        br

        Thanks
