PerlMonks  

Getting started with XML

by Anonymous Monk
on Jul 25, 2005 at 01:16 UTC ( [id://477639]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm working on an application that needs to communicate with an outside system using XML. This is my first foray into XML, so I'm not quite sure how to get started. I've looked through the XML modules on CPAN, but with my limited XML knowledge they all seem pretty intimidating.

I've got DTDs for all of the messages that I will need to parse and construct. I'm thinking about using HTML::Template to generate the XML that I send, but I'm not sure about the best way to parse the incoming XML.

Ideas?

Replies are listed 'Best First'.
Re: Getting started with XML
by saintmike (Vicar) on Jul 25, 2005 at 02:03 UTC
    The easiest way to get started is with XML::Simple. It reads in XML and transforms it into a perl data structure:
        use Data::Dumper;
        use XML::Simple;

        my $ref = XMLin(\*DATA);
        print Dumper($ref);

        __DATA__
        <result>
          <cd serial="001">
            <artists>
              <artist>Smashing Pumpkins</artist>
            </artists>
            <title>I Am One</title>
          </cd>
          <cd serial="002">
            <artists>
              <artist>Foo Fighters</artist>
              <artist>Smashing Pumpkins</artist>
            </artists>
            <title>The Shield Soundtrack</title>
          </cd>
        </result>
    prints out
        $VAR1 = {
          'cd' => [
            {
              'artists' => {
                'artist' => 'Smashing Pumpkins'
              },
              'serial' => '001',
              'title' => 'I Am One'
            },
            {
              'artists' => {
                'artist' => [
                  'Foo Fighters',
                  'Smashing Pumpkins'
                ]
              },
              'serial' => '002',
              'title' => 'The Shield Soundtrack'
            }
          ]
        };
Re: Getting started with XML
by jhourcle (Prior) on Jul 25, 2005 at 01:46 UTC

    I'd suggest reading the XML specs and learning to read DTDs.

    Yes, you can use XML without understanding its intricacies, but there's no real substitute for actually knowing what you're working with.

    Everyone has their own favorite XML modules on CPAN ... search on 'XML' in the Super Search, and you'll find lots of recommendations. Without knowing the intricacies of what you're trying to parse and/or create, I'd be doing you a disservice to suggest any particular one.

Re: Getting started with XML
by srdst13 (Pilgrim) on Jul 25, 2005 at 02:03 UTC
    There are numerous techniques for dealing with XML. If you have perl data structures that you want to pass, try XML::Simple. There are many more in the XML:: namespace. A CPAN search is probably in order. HTML::Template is a possibility, but wouldn't be my first thought (my bias).
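For the "Perl data structures that you want to pass" case, a minimal sketch with XML::Simple's XMLout might look like the following. The payload and its element names are made up for illustration (they are not taken from the original poster's DTDs), and NoAttr is used only so that every hash key becomes a nested element rather than an attribute.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin XMLout);

# Hypothetical outgoing message; the structure is invented for this example.
my $message = {
    cd => [
        { serial => '001', title => 'I Am One' },
        { serial => '002', title => 'The Shield Soundtrack' },
    ],
};

# RootName sets the outer element; NoAttr => 1 writes hash keys as
# child elements instead of attributes.
my $xml = XMLout($message, RootName => 'result', NoAttr => 1);
print $xml;

# Read it back to check that the round trip preserves the data.
my $back = XMLin($xml, ForceArray => ['cd'], KeyAttr => []);
print "First title: $back->{cd}[0]{title}\n";
```

This is fine for simple payloads. If the receiving system validates strictly against a DTD, element order and attribute placement may matter more than XMLout lets you control, so treat it as a starting point only.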

    Sean
Re: Getting started with XML
by planetscape (Chancellor) on Jul 25, 2005 at 05:19 UTC

    Not specific to Perl, but tips for getting the lay of the XML land:

    Altova offers a free "Home Edition" of XMLSpy, which can be a very handy visual aid for those learning about XML, what it looks like, how to edit it, how to get started with "offshoots" like XML Schema, XSL/XSLT, etc.

    refcards.com has a few reference cards on XML and related technologies. (I learned about that site from Perlreref as formatted quick reference.)

    Zvon has many similar materials, freely downloadable, comprehensive, in various formats and numerous languages.

    As far as learning about new modules and/or code snippets, I have found the bundled documentation helpful ;-). But don't forget about utilities such as POD2HTML... Other helpful creatures in this bestiary include podgen, DoxyFilt* (Doxygen for Perl), and Perl::Tidy.

    Some books I have found useful are: XSL Essentials, XSLT, XSLT Cookbook, and of course, The Perl Cookbook, 2nd Ed.

    HTH,


    Update: See also Perl-XML Frequently Asked Questions and Frequently Asked Questions about XML::Simple


    * Update: 2005-12-28 Kudos to both john_oshea and tfrayner for alerting me to the fact that my link above has been rendered useless by the foul creatures known as spammers... I have found what appears to be a good link to obtain DoxyFilt; the most recent version seems to be from August 24, 2005: Doxygen-0.84.tar.gz. Thanks again, guys!

    * Update: 2006-03-11 Like Zvon, deepX has a good collection of "Quick References" on XML-related technologies such as XPath, XSD, XSL and XSLT (and even CSS and MySQL!).

    planetscape
Re: Getting started with XML
by Molt (Chaplain) on Jul 25, 2005 at 09:15 UTC

    I appreciate I'm going against a lot of previous recommendations here, but I really don't think XML::Simple is that simple. Whilst the interface is nice and simple, the actual Perl data structure that comes back can seem amazingly crufty, with things changing from scalars to array refs when there's more than one subnode, and other such trickery. I ended up giving up on it for that reason. I appreciate, though, that others have different opinions on this (Re: Vi vs. Emacs), so just take this as an alternate view.
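The scalar-versus-array-ref flip described above can be tamed with XML::Simple's ForceArray and KeyAttr options. The sketch below reuses the sample data from saintmike's reply; which element names to force into arrays is an assumption about what your messages may repeat.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

my $xml = <<'XML';
<result>
  <cd serial="001">
    <artists><artist>Smashing Pumpkins</artist></artists>
    <title>I Am One</title>
  </cd>
  <cd serial="002">
    <artists><artist>Foo Fighters</artist><artist>Smashing Pumpkins</artist></artists>
    <title>The Shield Soundtrack</title>
  </cd>
</result>
XML

# ForceArray: always give an array ref for these elements, even when
# only one is present. KeyAttr => []: never fold lists into hashes
# keyed on an attribute.
my $ref = XMLin($xml, ForceArray => [ 'cd', 'artist' ], KeyAttr => []);

for my $cd (@{ $ref->{cd} }) {
    my @artists = @{ $cd->{artists}{artist} };   # safe for one or many
    print "$cd->{serial}: $cd->{title} (", join(', ', @artists), ")\n";
}
```

With these options the same dereferencing code works whether a CD has one artist or several, which removes most of the "trickery" at the cost of a slightly more nested structure.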

    My personal recommendation for handling XML, regardless of your initial level of proficiency, is XML::LibXML. This is a very large and complex module, and it can be a nightmare to get working because it has a C library dependency (libxml2, oddly enough). Where it does shine is in providing a simple interface to XPath through the XML::LibXML::Node::findnodes() method. XPath is the only technology I've seen which makes pulling out stuff far down an XML structure easier than pulling teeth. Another, possibly lesser, advantage is that if you should ever need to do some heavier-duty stuff in XML, you already have experience in one of the most adaptable XML modules, so there's less retraining time.
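To give a feel for the XPath interface, here is a minimal sketch using findnodes() and findvalue() on the sample data from saintmike's reply. The XPath expressions are just illustrations for that data, not anything specific to the original poster's messages.

```perl
use strict;
use warnings;
use XML::LibXML;

my $xml = <<'XML';
<result>
  <cd serial="001">
    <artists><artist>Smashing Pumpkins</artist></artists>
    <title>I Am One</title>
  </cd>
  <cd serial="002">
    <artists><artist>Foo Fighters</artist><artist>Smashing Pumpkins</artist></artists>
    <title>The Shield Soundtrack</title>
  </cd>
</result>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# Walk every <cd> and pull values out relative to that node.
for my $cd ($doc->findnodes('/result/cd')) {
    my $serial  = $cd->getAttribute('serial');
    my $title   = $cd->findvalue('title');
    my @artists = map { $_->textContent } $cd->findnodes('artists/artist');
    print "$serial: $title (", join(', ', @artists), ")\n";
}

# A predicate reaches deep into the tree in one expression:
# titles of all CDs that list Foo Fighters as an artist.
my @titles = map { $_->textContent }
    $doc->findnodes('//cd[artists/artist="Foo Fighters"]/title');
print "Foo Fighters appear on: @titles\n";
```

Note that findnodes() always returns a list of nodes in list context, so there is no scalar-versus-array surprise when an element happens to occur only once.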

    You may also want to have a look at the "Perl and XML" book by the shiny O'Reilly people if you think you'll end up doing a lot of XML. It's a thin book, but it does give you an overview of a lot of different approaches.

Re: Getting started with XML
by BaldPenguin (Friar) on Jul 25, 2005 at 19:42 UTC
    What you want will depend a lot on what you will do and what you will store. XML can be a real memory hog with large data sets. I have used XML::Simple, XML::Element and XML::LibXML. XML::Simple is just that: a very simple module for accessing and creating simple XML files. XML::Element is great for creating XML, but I find it more difficult to use if you need to traverse any part of the document after you create it. XML::LibXML, in my opinion, has only one drawback: it does everything, and sometimes the overhead can be more than you need.

    If you need to process large docs, I would recommend XML::Twig. I have found it to be very fast at traversing the DOM, and its memory usage is easy to control. For the longest time, I would use XML::Element to build and XML::Twig to parse. It's not the only way, just the way I used.
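As a rough illustration of how XML::Twig keeps memory in check, the sketch below handles one <cd> element at a time and purges it once processed. The data is the sample from saintmike's reply, and the small inline string stands in for a large file, which you would normally feed to parsefile() instead.

```perl
use strict;
use warnings;
use XML::Twig;

my $xml = <<'XML';
<result>
  <cd serial="001">
    <artists><artist>Smashing Pumpkins</artist></artists>
    <title>I Am One</title>
  </cd>
  <cd serial="002">
    <artists><artist>Foo Fighters</artist><artist>Smashing Pumpkins</artist></artists>
    <title>The Shield Soundtrack</title>
  </cd>
</result>
XML

my @summaries;

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called each time a complete <cd> element has been parsed.
        cd => sub {
            my ($t, $cd) = @_;
            push @summaries,
                $cd->att('serial') . ': ' . $cd->first_child_text('title');
            $t->purge;   # discard what has been parsed so far to free memory
        },
    },
);

$twig->parse($xml);   # for a file on disk: $twig->parsefile($filename)
print "$_\n" for @summaries;
```

Because each element is thrown away after its handler runs, peak memory depends on the size of the largest <cd> rather than on the size of the whole document.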

    One benefit I have found to using XML::LibXML, though, is that the methodology and accessors are very close, if not identical, to those used to access XML in many other languages. If you have to use different languages from time to time, it's nice to have a common ground.

    Just my opinion, but that is what you asked for.

    Don
    WHITEPAGES.COM | INC
    Everything I've learned in life can be summed up in a small perl script!
