PerlMonks  

Re: XML::LibXML and XML Namespaces (processing OpenOffice documents)

by dakkar (Hermit)
on Mar 11, 2003 at 15:06 UTC (#242059)


in reply to XML::LibXML and XML Namespaces (processing OpenOffice documents)

And now a small lesson on XML namespaces

no, seriously

In namespace-aware XML documents, an element name is a qualified name (QName), composed of an optional prefix and a local name, separated by a colon (:). The prefix is bound to a URI via a namespace declaration. Example:

<first>
  <ns:second>
    <my:third xmlns:my="someURI">
      <fourth xmlns="otherURI">
        <fifth/>
      </fourth>
    </my:third>
  </ns:second>
</first>

Let's read that. The element named first has a local name of first, and belongs to no namespace. The element named ns:second is an error, since namespace-aware parsers require every prefix to be declared, and ns is not. my:third belongs to the someURI namespace, which is bound to the my prefix on that element. fourth belongs to otherURI, the default namespace declared on it, and so does fifth, which inherits that default.
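A minimal sketch of how to check that reading with XML::LibXML, assuming a reasonably recent release (for load_xml) and Perl 5.10 or later (for the // operator). The ns:second element is left out, since it is the one flagged as an error above, and the urn:x-example: URIs are placeholders I swapped in for someURI and otherURI because libxml2 may warn about relative namespace URIs.

```perl
use strict;
use warnings;
use XML::LibXML;

# The example from the post, minus the undeclared ns:second element.
# The urn:x-example: URIs are illustrative placeholders.
my $xml = <<'XML';
<first>
  <my:third xmlns:my="urn:x-example:someURI">
    <fourth xmlns="urn:x-example:otherURI">
      <fifth/>
    </fourth>
  </my:third>
</first>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# Walk every element in document order and report how its name resolves.
for my $el ($doc->findnodes('//*')) {
    printf "%-10s local=%-7s prefix=%-7s uri=%s\n",
        $el->nodeName,
        $el->localname,
        $el->prefix       // '(none)',
        $el->namespaceURI // '(none)';
}
```

Note that fifth reports the same URI as fourth even though it carries no declaration of its own: the default namespace is inherited.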

Hope this is clear enough...

Your problem

XPath has a problem with namespaces: an XPath expression is interpreted relative to a context node (in your program, the invocant of findnodes), so the prefixes in the expression are resolved using the namespace declarations visible from that node. This forces you to know the prefixes used in the document, instead of the URIs, which brings back the problem I mentioned earlier (prefixes are not unique, URIs are).
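One way around the prefix dependency, sketched here under the assumption that XML::LibXML::XPathContext is available (it ships with current XML::LibXML releases; at the time of this thread it was a separate module): register a prefix of your own for the URI and query with that. The prefix u and the urn:x-example: URI are illustrative choices, not anything from the original program.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

my $doc = XML::LibXML->load_xml(string => <<'XML');
<first>
  <my:third xmlns:my="urn:x-example:someURI"/>
</first>
XML

# Bind our own prefix ("u") to the URI. Which prefix the document
# itself uses for that URI no longer matters to the query.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs('u', 'urn:x-example:someURI');

my @hits = $xpc->findnodes('//u:third');
print scalar(@hits), " match(es): ",
    join(', ', map { $_->nodeName } @hits), "\n";
```

The query is now keyed on the URI, which is the unique part, so it keeps working if the document author renames the prefix.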

Anyway, your problem is much simpler: $tree is an XML::LibXML::Document, which has no knowledge of namespaces, since they are declared (at the earliest) on the document element. This is why the second form works. BTW, in the previous example (disregarding the ns:second element), if you did:

$docElem->findnodes('//my:third');
It wouldn't work, since the my prefix is not defined on the document element...
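If you want a query from the document element that does not depend on any prefix being in scope, plain XPath 1.0 can match on the local name and namespace URI directly. A small sketch, again with a placeholder urn:x-example: URI standing in for someURI:

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML->load_xml(string => <<'XML');
<first>
  <my:third xmlns:my="urn:x-example:someURI"/>
</first>
XML

my $docElem = $doc->documentElement;

# No prefix appears in the expression, so nothing has to be resolved
# against the declarations visible from $docElem.
my @hits = $docElem->findnodes(
    q{//*[local-name() = 'third' and namespace-uri() = 'urn:x-example:someURI']}
);

print "found ", scalar(@hits), " element(s)\n";
```

It is more verbose than a prefixed name, but it sidesteps the scoping issue entirely.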

-- 
        dakkar - Mobilis in mobile

Replies are listed 'Best First'.
Re: Re: XML::LibXML and XML Namespaces (processing OpenOffice documents)
by bart (Canon) on Mar 11, 2003 at 19:34 UTC
    Please go on. What is the meaning of the URIs? What should these point to?
      Namespace URIs don't really point anywhere. They are simply a means to create non-conflicting namespaces for tags and attributes in XML documents. If you happen to own the domain mydomain.com and you design your own DTD, you may for example use the URI http://mydomain.com/mydtd for the tags and attributes you use. It is unlikely to conflict with DTDs created by somebody else, as they will have URIs with a different domain part, and you can always choose non-conflicting URIs in your own domain for your other DTDs.
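A short sketch of what "don't point anywhere" means in practice: the URI is only compared as a string, so two documents that pick different prefixes (or none) for http://mydomain.com/mydtd still use the same element name. The element name item is made up for the illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

my $uri = 'http://mydomain.com/mydtd';

# Same namespace, two spellings: an explicit prefix and a default namespace.
my $doc_one = XML::LibXML->load_xml(string => qq{<a:item xmlns:a="$uri"/>});
my $doc_two = XML::LibXML->load_xml(string => qq{<item xmlns="$uri"/>});

my $one = $doc_one->documentElement;
my $two = $doc_two->documentElement;

# An element's identity is the pair (namespace URI, local name).
my $same = $one->namespaceURI eq $two->namespaceURI
        && $one->localname    eq $two->localname;

printf "%s and %s are %s\n",
    $one->nodeName, $two->nodeName,
    $same ? 'the same element type' : 'different element types';
```

Nothing is fetched from mydomain.com at any point; the URI only has to be unique.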

      Read Namespaces in XML for more info.

      Update: Your post reminded me to publish my patches for XML::LibXML, which make XPath queries against documents with namespaces easier. I've just posted them to the perl-xml mailing list; you may find them useful.

      --
      Ilya Martynov, ilya@iponweb.net
      CTO IPonWEB (UK) Ltd
      Quality Perl Programming and Unix Support UK managed @ offshore prices - http://www.iponweb.net
      Personal website - http://martynov.org
