PerlMonks  

XML::LibXML drives me to drinking

by tunafish (Beadle)
on Oct 22, 2016 at 23:37 UTC

tunafish has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to figure out XML::LibXML. It's rough going. I need to be able to access the text content of a node by name. Here is my code:

#!/usr/bin/perl
use strict;
use XML::LibXML;

my $string = qq~<?xml version="1.0"?>
<ItemLookupResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2013-08-01">
  <Items>
    <Item>
      <ASIN>B01KI4JSQY</ASIN>
    </Item>
  </Items>
</ItemLookupResponse>
~;

my $parser = XML::LibXML->new->load_xml(string => $string, {no_blanks => 1});
my $xml = XML::LibXML::XPathContext->new($parser);
$xml->registerNs('x', 'http://webservices.amazon.com/AWSECommerceService/2013-08-01');

# Parse items
foreach my $item ($xml->findnodes('/x:ItemLookupResponse/x:Items/x:Item', $parser)){
    print $item->firstChild->nodeName, "\n";
    print $item->firstChild->toString, "\n";
    print $item->findvalue('ASIN'), "\n";
    print $item->findvalue('./ASIN'), "\n";
    print $item->findvalue('./ASIN', $item), "\n";
}

Expected result:

ASIN
<ASIN>B01KI4JSQY</ASIN>
B01KI4JSQY
B01KI4JSQY
B01KI4JSQY

Actual result:

ASIN
<ASIN>B01KI4JSQY</ASIN>

Probably I'm just misunderstanding something in the docs. But I don't know what it is. I tried $item->findvalue('x:ASIN'), but that threw an error. Please help. I have a family. If I become an alcoholic, they will suffer.

Replies are listed 'Best First'.
Re: XML::LibXML drives me to drinking
by Your Mother (Archbishop) on Oct 23, 2016 at 00:44 UTC
      Unfortunately not. This gets me to the point at which I am already (and what a struggle THAT was!) but I can already access nodes from the root level. Now, however, I need to access named sub-nodes of a particular node (in my example, ASIN of /ItemLookupResponse/Items/Item). It's not really helpful to access it from root (/ItemLookupResponse/Items/Item/ASIN) because there may be several Item nodes.

        My node had pretty good clues, actually. :P Try this–

        use strict;
        use XML::LibXML;

        my $string = <<"";
        <?xml version="1.0"?>
        <ItemLookupResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2013-08-01">
          <Items>
            <Item>
              <ASIN>B01KI4JSQY</ASIN>
            </Item>
          </Items>
        </ItemLookupResponse>

        my $doc = XML::LibXML->new->load_xml(string => $string, {no_blanks => 1});
        my $xc = XML::LibXML::XPathContext->new($doc);
        $xc->registerNs( x => $doc->getDocumentElement->namespaceURI );

        for my $item ( $xc->findnodes('//x:ItemLookupResponse/x:Items/x:Item') ) {
            print $item->firstChild->nodeName, "\n";
            print $item->firstChild->toString, "\n";
            print $xc->findvalue('x:ASIN', $item), "\n";
        }

Re: XML::LibXML drives me to drinking
by Anonymous Monk on Oct 23, 2016 at 01:47 UTC

    What you're wanting is $xpathcontext->findnodes( $xpathstring, $startnode )

    I much prefer my version $node->F($xpathstr, 'prefix'=>'http...')

    #!/usr/bin/perl --
    use strict;
    use XML::LibXML;

    my $string = qq~<?xml version="1.0"?>
    <ItemLookupResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2013-08-01">
      <Items>
        <Item>
          <ASIN>B01KI4JSQY</ASIN>
        </Item>
      </Items>
    </ItemLookupResponse>
    ~;

    my $dom = XML::LibXML->new(qw/ recover 2 /)->load_xml(string => $string, {no_blanks => 1});

    # *register* namespace
    $dom->F( '/', 'x', 'http://webservices.amazon.com/AWSECommerceService/2013-08-01' );

    foreach my $item ( $dom->F('/x:ItemLookupResponse/x:Items/x:Item') ){
        print $item->firstChild->nodeName, "\n";
        print $item->firstChild->toString, "\n";
        print $item->F('x:ASIN')->shift()->string_value, "\n";
        print $item->F('./x:ASIN/text()'), "\n";
        print $item->F('./x:ASIN')->[0]->textContent, "\n";
    }

    sub XML::LibXML::Node::F {
        my $self   = shift;
        my $xpath  = shift;
        my %prefix = @_;
        our $XPATHCONTEXT;
        $XPATHCONTEXT ||= XML::LibXML::XPathContext->new();
        while( my( $p, $u ) = each %prefix ){
            $XPATHCONTEXT->registerNs( $p, $u );
        }
        $XPATHCONTEXT->findnodes( $xpath, $self );
    }

      Wait, are you saying I need to create a new XML::LibXML::XPathContext object for each node as I loop through them?!

      I modified my code in the following way:

      # Parse items
      foreach my $item ($xml->findnodes('/x:ItemLookupResponse/x:Items/x:Item', $parser)){
          my $item_xml = XML::LibXML::XPathContext->new($item);
          $item_xml->registerNs('x', 'http://webservices.amazon.com/AWSECommerceService/2013-08-01');
          print $item_xml->findvalue('x:ASIN'), "\n";
          print $item_xml->findnodes('x:ASIN')->shift->textContent, "\n";
      }

      And bingo!

      B01KI4JSQY
      B01KI4JSQY

      Oy vey! Wouldn't it be much more convenient if a XML::LibXML::XPathContext object calling the findnodes() method returned another XML::LibXML::XPathContext object, rather than a XML::LibXML::Node or XML::LibXML::Element object?

      I haven't pulled this much hair in one day for a coding-related reason IN YEARS.

        Wait, are you saying I need to create a new XML::LibXML::XPathContext object for each node as I loop through them?!

        No, what gave you that idea?

        All XML::LibXML::XPathContext does is map prefixes to namespaces, and that only needs to be done once.

        See https://metacpan.org/pod/XML::LibXML::Node#findnodes
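
        To illustrate the point about registering the prefix only once, here is a minimal sketch, assuming the sample document from the question with a second Item added. The helper name item_asins and the second ASIN value are made up for the example; only the XML::LibXML calls already used in this thread are relied on.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Namespace URI taken from the sample document in the question.
my $NS = 'http://webservices.amazon.com/AWSECommerceService/2013-08-01';

# Hypothetical helper (not part of XML::LibXML): returns the ASIN of every
# Item, reusing a single XPathContext for all of the relative lookups.
sub item_asins {
    my ($xml_string) = @_;

    my $doc = XML::LibXML->load_xml(string => $xml_string);

    # Register the prefix once; the context is not tied to a single node.
    my $xpc = XML::LibXML::XPathContext->new($doc);
    $xpc->registerNs(x => $NS);

    my @asins;
    for my $item ($xpc->findnodes('/x:ItemLookupResponse/x:Items/x:Item')) {
        # The second argument makes 'x:ASIN' relative to this Item.
        push @asins, $xpc->findvalue('x:ASIN', $item);
    }
    return @asins;
}

# Sample input; the second ASIN is an invented placeholder.
my $sample = <<"XML";
<?xml version="1.0"?>
<ItemLookupResponse xmlns="$NS">
  <Items>
    <Item><ASIN>B01KI4JSQY</ASIN></Item>
    <Item><ASIN>B00EXAMPLE</ASIN></Item>
  </Items>
</ItemLookupResponse>
XML

print "$_\n" for item_asins($sample);
```

        The same $xpc serves every Item because the node to start from is passed per call rather than stored in the context.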

        I haven't pulled this much hair in one day for a coding-related reason IN YEARS.

        Step away from the keyboard: lunch, nap, vacation, whatever it takes.

Re: XML::LibXML drives me to drinking
by ikegami (Patriarch) on Oct 24, 2016 at 18:15 UTC

    You used the proper namespace here:

    $xml->findnodes('/x:ItemLookupResponse/x:Items/x:Item')

    But you forgot here:

    $item->findvalue('ASIN')

    Change the latter to

    $xml->findvalue('x:ASIN', $item)
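
    The difference comes from the default namespace: in XPath 1.0 an unprefixed name such as ASIN only matches elements in no namespace, and a bare node has no record of the x prefix (which is presumably why $item->findvalue('x:ASIN') threw an error). The sketch below compares the lookups side by side on the thread's sample document. The local-name() and getChildrenByTagNameNS variants are alternatives that nobody in this thread proposed, shown only for comparison, and the variable names are illustrative.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Namespace URI taken from the sample document in the question.
my $NS = 'http://webservices.amazon.com/AWSECommerceService/2013-08-01';

my $doc = XML::LibXML->load_xml(string => <<"XML");
<?xml version="1.0"?>
<ItemLookupResponse xmlns="$NS">
  <Items><Item><ASIN>B01KI4JSQY</ASIN></Item></Items>
</ItemLookupResponse>
XML

my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(x => $NS);
my ($item) = $xpc->findnodes('//x:Item');

# Unprefixed name: looks for ASIN in *no* namespace, so nothing matches
# and the string value of the empty node set is ''.
my $unprefixed = $item->findvalue('ASIN');

# Prefixed name, evaluated through the context that knows the 'x' prefix.
my $prefixed = $xpc->findvalue('x:ASIN', $item);

# Namespace-agnostic XPath: match on the local name, no prefix required.
my $by_local_name = $item->findvalue('*[local-name()="ASIN"]');

# DOM-style alternative that avoids XPath entirely.
my ($child) = $item->getChildrenByTagNameNS($NS, 'ASIN');
my $by_dom = $child->textContent;

printf "unprefixed:    '%s'\n", $unprefixed;
printf "prefixed:      '%s'\n", $prefixed;
printf "by local-name: '%s'\n", $by_local_name;
printf "by DOM method: '%s'\n", $by_dom;
```

    Matching on local-name() ignores the namespace altogether, so it is best kept for documents where the same element name cannot appear in two namespaces.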

    Other issues:

    • Your variable names are awful! You call your XML $string while $xml is an XPathContext object for which $xpc is recommended, and you call your document $parser where $doc or $dom make more sense.

    • You call load_xml as an object method on a freshly created parser. That works, but it can be called as a class method like new, which makes the extra ->new unnecessary.

    • XML::LibXML::XPathContext isn't loaded by use XML::LibXML; in older versions of XML::LibXML.


    Fixed:

    #!/usr/bin/perl
    use strict;
    use warnings qw( all );
    use feature qw( say );

    use XML::LibXML               qw( );
    use XML::LibXML::XPathContext qw( );

    my $xml = <<'__EOS__';
    <?xml version="1.0"?>
    <ItemLookupResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2013-08-01">
      <Items>
        <Item>
          <ASIN>B01KI4JSQY</ASIN>
        </Item>
      </Items>
    </ItemLookupResponse>
    __EOS__

    my $doc = XML::LibXML->load_xml(string => $xml, { no_blanks => 1 });

    my $xpc = XML::LibXML::XPathContext->new();
    $xpc->registerNs('x', 'http://webservices.amazon.com/AWSECommerceService/2013-08-01');

    # The context node must be the parsed document ($doc), not the XML string ($xml).
    for my $item ($xpc->findnodes('/x:ItemLookupResponse/x:Items/x:Item', $doc)) {
        say $item->firstChild->nodeName;
        say $item->firstChild->toString;
        say $xpc->findvalue('x:ASIN', $item);
    }

Re: XML::LibXML drives me to drinking
by Jenda (Abbot) on Oct 25, 2016 at 14:28 UTC

    Then why do you insist on this insanely overcomplicated and at the same time both over- and under- documented m[ae]ss of a module?

    Switch to XML::Twig or XML::Rules. For example like this:

    use strict;
    use XML::Rules;

    my $string = qq~<?xml version="1.0"?>
    <ItemLookupResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2013-08-01">
      <Items>
        <Item>
          <ASIN>B01KI4JSQY</ASIN>
        </Item>
      </Items>
    </ItemLookupResponse>
    ~;

    my $parser = XML::Rules->new(
        rules => {
            ASIN => [
                '/ItemLookupResponse/Items/Item' => sub { print "ASIN: $_[1]->{_content}\n"; },
                sub { print "ASIN at an unexpected place\n"; }
            ]
        },
        xmlns => {
            'http://webservices.amazon.com/AWSECommerceService/2013-08-01' => ''
        }
    );

    $parser->parse($string);

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.

Re: XML::LibXML drives me to drinking
by grantm (Parson) on Nov 11, 2016 at 01:43 UTC
