PerlMonks  

Re: Formating a HTML document to show certain text.

by Anonymous Monk
on Mar 26, 2011 at 22:57 UTC ( [id://895706]=note )


in reply to Formating a HTML document to show certain text.

  1. $ lwp-download http://www.imreportcard.com/products/the-elevation-group
    Saving to 'the-elevation-group.htm'...
    35.2 KB received
     

  2. $ perl htmltreexpather.pl the-elevation-group.htm 2>NUL | grep -A3 "^Product Description$"
    Product Description
    /html/body/div/div[3]/div/div/div[6]
    //div[@id='leftColTop']/div[6]
    //div[@id='leftColTop']/div[@class='heading']
     

  3. use HTML::TreeBuilder::XPath;
    my $tree = HTML::TreeBuilder::XPath->new;
    $tree->parse_file( "the-elevation-group.htm" );
    for my $n ( $tree->findnodes( q#//div[@id='leftColTop']/div[@class='heading']# ) ) {
        print $n->getValue, "\n";
    }
    __END__
    Product Description
    Detailed Overview
    Reputation
    Domain "Whois"

  4. repeat steps 2 and 3 for each piece of text you want to extract
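
To see how the positional and class-based XPath forms from step 2 differ, here is a minimal sketch that runs both against a small inline document instead of the downloaded file. The markup, the `text` class, and the helper name `texts_for` are made up for illustration; only `leftColTop` and `heading` come from the steps above.

```perl
#!/usr/bin/perl --
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical stand-in for the downloaded page. Only the id 'leftColTop'
# and the class 'heading' are taken from the XPath expressions above.
my $html = <<'HTML';
<html><body>
<div id="leftColTop">
<div class="heading">Product Description</div>
<div class="text">First block of text.</div>
<div class="heading">Detailed Overview</div>
<div class="text">Second block of text.</div>
</div>
</body></html>
HTML

# Return the text of every node that $xpath matches in $html.
sub texts_for {
    my ( $html, $xpath ) = @_;
    my $tree = HTML::TreeBuilder::XPath->new;
    $tree->parse( $html );
    $tree->eof;
    my @texts = map { $_->as_text } $tree->findnodes( $xpath );
    $tree->delete;    # release the tree
    return @texts;
}

# Positional form: selects only the third child div.
print "$_\n" for texts_for( $html, q#//div[@id='leftColTop']/div[3]# );

# Class form: selects every heading div.
print "$_\n" for texts_for( $html, q#//div[@id='leftColTop']/div[@class='heading']# );
```

The positional form breaks as soon as the page gains or loses a div, so the class form is probably the safer one to build on.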

Re^2: Formating a HTML document to show certain text.
by Anonymous Monk on Mar 28, 2011 at 06:52 UTC
    #!/usr/bin/perl --
    use strict;
    use warnings;
    use HTML::TreeBuilder::XPath;

    Main( @ARGV );
    exit( 0 );

    sub Main {
        my $tree = HTML::TreeBuilder::XPath->new;
        $tree->parse_file( "the-elevation-group.htm" );
        my $XpathXpr = join '|',
            q#//div[@id='leftColTop']/div[@class='heading']#,
            q#//div[@id='leftColTop']/div[@class='heading']/following-sibling::node()[1]#,
            ;
        for my $node ( $tree->findnodes_as_strings( $XpathXpr ) ) {
            print "$node\n\n";
        }
    }
    __END__

    Read http://w3schools.com/xpath/default.asp for a gentle introduction to XPath.
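
If you want each heading kept together with the block that follows it, one possible variation is to look up the sibling relative to each heading node rather than joining two expressions with `|`. This is only a sketch: the inline markup, the `text` class, and the name `heading_pairs` are invented for the example, and it uses `following-sibling::*[1]` (elements only) where the script above uses `node()[1]`.

```perl
#!/usr/bin/perl --
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical stand-in for the-elevation-group.htm; the real page's
# markup is assumed to look roughly like this, nothing more.
my $html = <<'HTML';
<html><body>
<div id="leftColTop">
<div class="heading">Product Description</div>
<div class="text">First block of text.</div>
<div class="heading">Detailed Overview</div>
<div class="text">Second block of text.</div>
</div>
</body></html>
HTML

# Return a list of [ heading text, text of the next sibling element ].
sub heading_pairs {
    my ( $html ) = @_;
    my $tree = HTML::TreeBuilder::XPath->new;
    $tree->parse( $html );
    $tree->eof;
    my @pairs;
    for my $h ( $tree->findnodes( q#//div[@id='leftColTop']/div[@class='heading']# ) ) {
        # Relative XPath, evaluated with the heading node as context.
        my ( $next ) = $h->findnodes( q#following-sibling::*[1]# );
        push @pairs, [ $h->as_text, defined $next ? $next->as_text : '' ];
    }
    $tree->delete;    # release the tree
    return @pairs;
}

printf "%s: %s\n", @$_ for heading_pairs( $html );
```

A heading with no following element gets an empty string, so the list always has one entry per heading.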
