PerlMonks  

Re: RE: embedded table remover

by salvadors (Pilgrim)
on Dec 31, 2000 at 05:41 UTC


in reply to RE: embedded table remover
in thread embedded table remover

HTML::Table is used for creating tables, rather than reading them. I suspect you meant HTML::TableExtract?

I suspect that won't really work either, though, as it discards all the information it doesn't need.

You probably just want to build a handler on top of HTML::Parser:

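#!/usr/bin/perl -w
use strict;
use HTML::Parser;

# Nesting depth of the <table> elements we are currently inside.
my $in_table = 0;

my $p = HTML::Parser->new(
    # Text, comments and so on: print only when outside every table.
    default_h => [ sub { print shift unless $in_table }, 'text' ],
    # <table> raises the depth; any other start tag is printed when outside a table.
    start_h   => [ sub { shift eq 'table' ? $in_table++ : $in_table || print shift },
                   'tagname, text' ],
    # </table> lowers the depth; any other end tag is printed when outside a table.
    end_h     => [ sub { shift eq 'table' ? $in_table-- : $in_table || print shift },
                   'tagname, text' ],
);

$p->parse_file(shift || die "Need a file") || die $!;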
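If you would rather get the stripped HTML back as a string than have it printed, the same idea can be wrapped in a function. This is only a rough sketch: strip_tables is a made-up name (nothing in HTML::Parser is called that), it feeds the parser a string via parse and eof instead of parse_file, and it assumes that silently dropping a stray closing table tag is acceptable.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

# strip_tables is a hypothetical helper name, not part of HTML::Parser.
# It returns the input HTML with every <table>...</table> removed.
sub strip_tables {
    my ($html) = @_;
    my $depth = 0;     # how many <table> elements we are currently inside
    my $out   = '';    # everything seen outside of tables

    my $p = HTML::Parser->new(
        api_version => 3,
        # Text, comments, declarations: keep only when outside every table.
        default_h => [ sub { $out .= $_[0] unless $depth }, 'text' ],
        start_h   => [ sub {
            my ($tag, $text) = @_;
            if    ($tag eq 'table') { $depth++ }
            elsif (!$depth)         { $out .= $text }
        }, 'tagname, text' ],
        end_h     => [ sub {
            my ($tag, $text) = @_;
            # Assumption: a stray </table> is dropped and the depth never goes negative.
            if    ($tag eq 'table') { $depth-- if $depth }
            elsif (!$depth)         { $out .= $text }
        }, 'tagname, text' ],
    );

    $p->parse($html);
    $p->eof;           # flush any text the parser is still buffering
    return $out;
}

my $html = '<p>before</p>'
         . '<table><tr><td>x<table><tr><td>y</td></tr></table></td></tr></table>'
         . '<p>after</p>';
print strip_tables($html), "\n";    # prints <p>before</p><p>after</p>
```

Counting depth rather than using a simple flag is what lets the nested table in the example disappear along with its parent.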

Tony
