PerlMonks  

Re: Data structure question from XML::XPath::XMLParser

by marto (Cardinal)
on Mar 30, 2021 at 08:26 UTC ( [id://11130572]=note )


in reply to Data structure question from XML::XPath::XMLParser

A short Mojo::DOM example to provide an alternative viewpoint:

#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';

use Mojo::DOM;

my $html = '<!doctype html>
<html class="no-focus-outline no-js" lang="en-US" data-modal-active="true">
<head>
<title>test</title>
</head>
<body>
<h1>test&nbsp;heading</h1>
<div>
<p>paragraph one <a href="https://example.com/one/two.html">one</a> example.</p>
<p>paragraph two <a href="https://example.com/two/three.html">another</a> example.</p>
</div>
</body>
</html>';

my $dom = Mojo::DOM->new( $html );

# print the href of every link that is a direct child of a paragraph
foreach my $e ( $dom->find('p > a')->each ){
    say $e->{'href'};
}

# or
$dom->find('p > a')->each(sub { say $_->{'href'} } );

If the HTML is live, you can fetch it with Mojo::UserAgent and do all of the above on the response; see this example.
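A minimal sketch of the live-page case, assuming a reasonably recent Mojolicious is installed. The helper names paragraph_links and fetch_dom are made up for illustration, the URL comes from the command line, and max_redirects => 5 is an arbitrary choice:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';

use Mojo::DOM;
use Mojo::UserAgent;

# Hypothetical helper: collect the href of every link that is a
# direct child of a paragraph, using the same selector as above.
sub paragraph_links {
    my ($dom) = @_;
    return $dom->find('p > a')->map( attr => 'href' )->each;
}

# Hypothetical helper: fetch a page and return its parsed DOM.
# result() dies if the connection itself fails.
sub fetch_dom {
    my ($url) = @_;
    my $ua = Mojo::UserAgent->new( max_redirects => 5 );
    return $ua->get($url)->result->dom;
}

# Only touch the network when a URL is given on the command line.
if ( my $url = shift @ARGV ) {
    say for paragraph_links( fetch_dom($url) );
}
```

Run it as, for example, perl links.pl https://example.com/ (links.pl is a placeholder filename). Keeping the selector logic in its own sub means the same code works on a string parsed with Mojo::DOM->new and on a fetched page.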

Re^2: Data structure question from XML::XPath::XMLParser
by mldvx4 (Friar) on Mar 30, 2021 at 09:52 UTC

    Would Mojo::DOM be able to handle the new HTML5 elements such as <article>, <footer>, <section>, and so on? Their presence is what is choking the old HTML::TreeBuilder::XPath, and it is my reason for using Tidy to convert to valid, well-formed XML.

      Yes, it does.
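A small sketch to illustrate, assuming Mojolicious is installed. The markup and URLs below are invented for the test; the HTML5 elements are selected like any other tag, with no Tidy pass first:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';

use Mojo::DOM;

# Made-up HTML5 fragment using <article>, <section> and <footer>.
my $html = '<!doctype html>
<html>
<body>
<article>
<section>
<p>in a section <a href="https://example.com/a.html">a</a></p>
</section>
<footer>
<p>in a footer <a href="https://example.com/b.html">b</a></p>
</footer>
</article>
</body>
</html>';

my $dom = Mojo::DOM->new($html);

# HTML5 element names work in CSS selectors like any other tag.
my @hrefs = $dom->find('article p > a')->map( attr => 'href' )->each;
say for @hrefs;

# List which of the HTML5 elements were found in the document.
say $_->tag for $dom->find('article, section, footer')->each;
```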
