
Re: Help with xpath and TreeBuilder

by Gangabass (Vicar)
on Jun 29, 2012 at 11:04 UTC ( [id://979098]=note )


in reply to Help with xpath and TreeBuilder

It's really easy:

#!/usr/bin/perl
use Modern::Perl;
use HTML::TreeBuilder::XPath;
use Data::Dumper;

local $/;
my $html = <DATA>;

my $tree = HTML::TreeBuilder::XPath->new_from_content($html);
my @values = $tree->findvalues('/html/body/p/a');

say Dumper(@values);

__DATA__
<html>
<body>
<p><a href="1.html">Link Text 1</a></p>
<p><a href="2.html">Link Text 2</a></p>
<p><a href="3.html">Link Text 3</a></p>
<p><a href="4.html">Link Text 4</a></p>
</body>
</html>

Replies are listed 'Best First'.
Re^2: Help with xpath and TreeBuilder
by Anonymous Monk on Jun 29, 2012 at 12:09 UTC

    It's even easier :)

    my @xpaths = qw{
        /html/body/p/a
        /html/body/p[2]/a
        /html/body/p[3]/a
        ....
        /html/body/p[66]/a
    };
    # Join the individual paths into one XPath union expression
    my $allXpaths = join ' | ', @xpaths;
    my @values = $tree->findvalues($allXpaths);

    I don't know about other XPath interpreters, but HTML::TreeBuilder::XPath (and xsh) allow this.

    This query is probably faster

    /html/body/p[ position()=1 or position()=2 or position()=3 or position()=66 ]/a

    Or this one

    /html/body/p[ ( ( position() > 0 and position() < 4 ) or ( position()=66 ) ) ]/a
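To compare the two approaches side by side, here is a minimal sketch. It assumes the four-paragraph sample HTML from the parent post, so it picks paragraphs 1, 2 and 4 rather than 66; the variable names are only for illustration, and it says nothing about which form is actually faster.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Sample document modelled on the HTML in the parent post (an assumption).
my $html = <<'HTML';
<html> <body>
<p><a href="1.html">Link Text 1</a></p>
<p><a href="2.html">Link Text 2</a></p>
<p><a href="3.html">Link Text 3</a></p>
<p><a href="4.html">Link Text 4</a></p>
</body> </html>
HTML

my $tree = HTML::TreeBuilder::XPath->new_from_content($html);

# Approach 1: one path per paragraph, joined into a union expression.
my @paths = ('/html/body/p[1]/a', '/html/body/p[2]/a', '/html/body/p[4]/a');
my $union = join ' | ', @paths;
my @union_values = $tree->findvalues($union);

# Approach 2: a single path with a positional predicate on p.
my $predicate = '/html/body/p[ position()=1 or position()=2 or position()=4 ]/a';
my @predicate_values = $tree->findvalues($predicate);

print "union:     $_\n" for @union_values;
print "predicate: $_\n" for @predicate_values;

$tree->delete;    # release the parse tree
```

Both queries should select the same links here; the predicate form just keeps the expression short when the list of positions grows.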

    Though this one won't work with your HTML: //a[ position()=4 ] matches nothing, because each a is the only (first) a child of its parent p, so every a is at //a[ position()=1 ]. I guess now I know how position() works.
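A quick way to check the position() behaviour described above is to run the queries against the sample HTML. This is a rough sketch that again assumes the four-paragraph document from the parent post; the variable names are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Sample document modelled on the HTML in the parent post (an assumption).
my $html = <<'HTML';
<html> <body>
<p><a href="1.html">Link Text 1</a></p>
<p><a href="2.html">Link Text 2</a></p>
<p><a href="3.html">Link Text 3</a></p>
<p><a href="4.html">Link Text 4</a></p>
</body> </html>
HTML

my $tree = HTML::TreeBuilder::XPath->new_from_content($html);

# position() is counted among the a children of each parent,
# and here every a is the first a inside its own p.
my @first_in_parent  = $tree->findvalues('//a[ position()=1 ]');
my @fourth_in_parent = $tree->findvalues('//a[ position()=4 ]');

# Moving the predicate to p selects the link inside the fourth paragraph.
my @in_fourth_p = $tree->findvalues('/html/body/p[ position()=4 ]/a');

printf "//a[ position()=1 ] matched %d node(s)\n", scalar @first_in_parent;
printf "//a[ position()=4 ] matched %d node(s)\n", scalar @fourth_in_parent;
print  "p[ position()=4 ]/a gave: @in_fourth_p\n";

$tree->delete;    # release the parse tree
```

So to pick the Nth link in this layout, the position test belongs on the p step rather than on the a step.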
