PerlMonks  

XML::LibXML's findnodes is behaving weirdly

by nabeel (Novice)
on May 16, 2008 at 05:16 UTC

nabeel has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I've run into a bit of a problem trying to use XML::LibXML. Here is an XML file:
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:os="http://a9.com/-/spec/opensearch/1.1/">
  <os:totalResults>233</os:totalResults>
  <os:startIndex>1</os:startIndex>
  <os:itemsPerPage>233</os:itemsPerPage>
  <os:Query role="request" searchTerms="test" startPage=""/>
  <entry>
    <id>34</id>
    <channel>1</channel>
    <title>Decision making: Marmite</title>
    <description>A meeting of the 'business group', which is the committee set up to steer the development of the new Squeezy Marmite bottle. They discuss the qualities of the new bottle and agree to test it with consumers.</description>
    <author></author>
    <size>36145156</size>
    <date>2007-09-07</date>
    <uri>assets/asset10000/aaiiaaaaaaafnknn.mpg</uri>
    <keylearningstage></keylearningstage>
    <keywords></keywords>
    <source>C4L Secondary Service</source>
  </entry>
</feed>
and here is some code to parse that XML:
use XML::LibXML;
use lib '/usr/mi/lib';
use Encode;

my $content;
{
    $/ = undef;
    open ( FILE, "<$ARGV[0]" );
    $content = <FILE>;
    close FILE;
}
$content = decode('utf8',$content, Encode::FB_CROAK);

my $base = '/feed/entry';
my $parser = XML::LibXML->new();
my $xp = $parser->parse_string( $content);
@nodeset = $xp->findnodes($base);
And @nodeset is empty. I've tried running this on Debian sarge and Ubuntu 7.10. What is interesting is that if I remove the 'xmlns="http://www.w3.org/2005/Atom"' from the 'feed' element, everything seems to work fine. So I'm not sure if I am missing something. Need your help!

Thanks
Nabeel

Replies are listed 'Best First'.
Re: XML::LibXML's findnodes is behaving weirdly
by Cody Pendant (Prior) on May 16, 2008 at 08:33 UTC
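    You have a default, unprefixed namespace. This is a big annoyance in XML/XSL and may be causing your problem. Every element without a prefix is in the Atom namespace, but that namespace has no prefix to refer to it by, and an unprefixed name in an XPath expression only matches elements that are in no namespace at all.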
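A minimal sketch of that effect, assuming a trimmed-down copy of the feed from the question (the heredoc and the variable names are illustrative, not from the original post). It shows that the root element reports the Atom URI as its namespace, that the plain path matches nothing, and that a namespace-agnostic test on local-name() does match:

```perl
use strict;
use warnings;
use XML::LibXML;

# Trimmed-down copy of the feed from the question (illustrative only).
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><id>34</id></entry>
</feed>
XML

my $dom = XML::LibXML->new->parse_string($xml);

# The root element carries no prefix, yet it belongs to the Atom namespace.
my $root_ns = $dom->documentElement->namespaceURI;
print "root namespace: $root_ns\n";

# An unprefixed XPath step only matches elements in no namespace.
my @plain = $dom->findnodes('/feed/entry');
print "plain XPath matched: ", scalar(@plain), "\n";

# Testing local-name() ignores the namespace altogether.
my @by_local = $dom->findnodes('/*[local-name()="feed"]/*[local-name()="entry"]');
print "local-name() XPath matched: ", scalar(@by_local), "\n";
```

The local-name() form is only a diagnostic here; it also matches same-named elements from any other namespace, so registering a prefix is usually the cleaner route.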

    I know how to get around it using XSL, but not with the module.

    In XSL, you need to make up your own prefix and declare it as an attribute of the stylesheet element, like xmlns:whatever="http://www.w3.org/2005/Atom". As long as the URI matches the one in the input, you can address all the content that has no explicit prefix through the "whatever" prefix.
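As a rough illustration of the XSL approach, here is a sketch that runs such a stylesheet from Perl. It assumes the XML::LibXSLT module is installed, which the thread does not mention; the stylesheet itself and the shortened feed are made up for the example:

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXSLT;

# Shortened feed, for illustration only.
my $source = XML::LibXML->new->parse_string(<<'XML');
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>Decision making: Marmite</title></entry>
</feed>
XML

# The prefix "whatever" is arbitrary; only the URI has to match the input.
my $style_doc = XML::LibXML->new->parse_string(<<'XSL');
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:whatever="http://www.w3.org/2005/Atom">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="/whatever:feed/whatever:entry">
      <xsl:value-of select="whatever:title"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
XSL

# Compile the stylesheet, apply it, and serialise the result.
my $stylesheet = XML::LibXSLT->new->parse_stylesheet($style_doc);
my $result     = $stylesheet->transform($source);
my $titles     = $stylesheet->output_as_bytes($result);
print $titles;
```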

    But like I say, that may not be the problem.



    Nobody says perl looks like line-noise any more
    kids today don't know what line-noise IS ...
      You can use XML::LibXML::XPathContext to register the namespace as follows:
      use strict;
      use warnings;
      use XML::LibXML;
      use XML::LibXML::XPathContext;

      # load the XML doc
      my $p = XML::LibXML->new;
      my $xml_file = do { local $/; <DATA> };
      my $dom = $p->parse_string( $xml_file );

      # register the namespace
      my $xc = XML::LibXML::XPathContext->new( $dom );
      $xc->registerNs('ns', 'http://www.w3.org/2005/Atom');

      # select using XPath
      my @nodes = $xc->findnodes( '/ns:feed/ns:entry');
      print $_->toString for @nodes;

      __DATA__
      <?xml version="1.0" encoding="UTF-8"?>
      <feed xmlns="http://www.w3.org/2005/Atom" xmlns:os="http://a9.com/-/spec/opensearch/1.1/">
        <os:totalResults>233</os:totalResults>
        <os:startIndex>1</os:startIndex>
        <os:itemsPerPage>233</os:itemsPerPage>
        <os:Query role="request" searchTerms="test" startPage=""/>
        <entry>
          <id>34</id>
          <channel>1</channel>
          <title>Decision making: Marmite</title>
          <description>A meeting of the 'business group', which is the committee set up to steer the development of the new Squeezy Marmite bottle. They discuss the qualities of the new bottle and agree to test it with consumers.</description>
          <author></author>
          <size>36145156</size>
          <date>2007-09-07</date>
          <uri>assets/asset10000/aaiiaaaaaaafnknn.mpg</uri>
          <keylearningstage></keylearningstage>
          <keywords></keywords>
          <source>C4L Secondary Service</source>
        </entry>
      </feed>
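Building on the same idea, the following sketch shows one way to pull individual fields out of each entry once the prefixes are registered. The prefix names "atom" and "os", the shortened feed, and the @entries structure are choices made for this example rather than anything from the thread; findvalue is called with the entry as its context node so the inner paths can stay relative:

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Shortened feed, for illustration only.
my $dom = XML::LibXML->new->parse_string(<<'XML');
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:os="http://a9.com/-/spec/opensearch/1.1/">
  <os:totalResults>233</os:totalResults>
  <entry>
    <id>34</id>
    <title>Decision making: Marmite</title>
  </entry>
</feed>
XML

# Register one prefix per namespace URI used in the document.
my $xc = XML::LibXML::XPathContext->new($dom);
$xc->registerNs('atom', 'http://www.w3.org/2005/Atom');
$xc->registerNs('os',   'http://a9.com/-/spec/opensearch/1.1/');

# Elements that already carry a prefix in the source still need
# a registered prefix in the XPath context.
my $total = $xc->findvalue('/atom:feed/os:totalResults');

# Relative paths are evaluated against the entry passed as context node.
my @entries;
for my $entry ($xc->findnodes('/atom:feed/atom:entry')) {
    push @entries, {
        id    => $xc->findvalue('atom:id',    $entry),
        title => $xc->findvalue('atom:title', $entry),
    };
}

print "total: $total\n";
print "$_->{id}: $_->{title}\n" for @entries;
```

The prefixes used in the XPath expressions do not have to match the ones in the document; only the namespace URIs are compared.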
Re: XML::LibXML's findnodes is behaving weirdly
by Anonymous Monk on May 16, 2008 at 06:32 UTC
    perldoc -f binmode
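One possible reading of this hint, offered as an assumption since the reply does not elaborate: rather than slurping the file and decoding it by hand, let the parser see the raw bytes and honour the encoding declaration itself. The sketch below writes a small sample to a temporary file (a stand-in for $ARGV[0]) and parses it two ways. Note that this concerns the file handling only and does not by itself make '/feed/entry' match:

```perl
use strict;
use warnings;
use XML::LibXML;
use File::Temp qw(tempfile);

# Write a small UTF-8 sample to a temporary file (stand-in for $ARGV[0]).
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} qq{<?xml version="1.0" encoding="UTF-8"?>\n},
             qq{<feed xmlns="http://www.w3.org/2005/Atom"><entry><id>34</id></entry></feed>\n};
close $out or die "close failed: $!";

my $parser = XML::LibXML->new;

# Option 1: hand the parser the file name and let it do the reading.
my $dom_from_file = $parser->parse_file($path);

# Option 2: open a lexical handle, switch it to raw bytes with binmode,
# and let the parser decode according to the XML declaration.
open my $in, '<', $path or die "Cannot open $path: $!";
binmode $in;
my $dom_from_fh = $parser->parse_fh($in);
close $in;

print $dom_from_file->documentElement->nodeName, "\n";
print $dom_from_fh->documentElement->nodeName, "\n";
```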
Re: XML::LibXML's findnodes is behaving weirdly
by wfsp (Abbot) on May 16, 2008 at 07:35 UTC
    There's a mistake in the XML. It should be "jar" not "bottle".
