PerlMonks  

Choosing the best XML module for a toy system

by fernandes (Monk)
on Sep 11, 2008 at 19:01 UTC ( [id://710715]=perlquestion )

fernandes has asked for the wisdom of the Perl Monks concerning the following question:

CPAN has many XML::* modules, and I have never used any of them. In your experienced opinion, which should I choose for creating and reading XML data in a prototype (toy system) application?

Thanks!

Replies are listed 'Best First'.
Re: Choosing the best XML module for a toy system
by Your Mother (Archbishop) on Sep 11, 2008 at 19:47 UTC

    XML::LibXML and XML::Twig are top two. Can't go wrong with either. The first is more DOMy, the second more DWIM/Perly.

      I used to go with XML::Twig, but for some reason I started to get annoyed with it. I guess it's a matter of taste. These days I always use XML::LibXML. It's maybe a bit harder to get started with, though. For example, the SYNOPSIS in its docs doesn't make it obvious how to use it. You have to know that its classes are (nicely) broken down mostly like the XML DOM, so you sometimes have to search several class pages for the docs you need.

      Here's maybe a quickstart guide, at least how I use it (I usually encapsulate parts of this in functions):

          use XML::LibXML qw(:all);

          my $parser = XML::LibXML->new();

          # this could also be parse_string
          my $xmldom = eval { $parser->parse_file($file) };
          if ($@) {
              # problem parsing, die $@, etc.
          }

          # this is the outermost element
          my $doc = $xmldom->documentElement;

          # the rest might be familiar if you've used
          # the DOM in JavaScript

          # findnodes is very useful, but hard to find
          # (look in XML::LibXML::Node); this assumes there's
          # XML like <asset><story>...</story><story>...</story></asset>
          # the path is evaluated relative to $doc, so use 'story' or
          # '/asset/story' instead if <asset> is itself the root element
          my $xpath = 'asset/story';    # this can be whatever XPath
          my @story_nodes = $doc->findnodes($xpath);

          foreach my $story_node (@story_nodes) {
              my $id = $story_node->getAttribute('ID');

              # this is an example where DOM can be annoying
              my ($uri_node) = $story_node->getChildrenByTagName('URI');
              $uri_node->normalize();
              my $text_node = $uri_node->firstChild;    # should 1st check it's ::Text !
              my $uri = $text_node->getData();
              # s{}{} delimiters, because the pattern itself contains slashes
              $uri =~ s{^(http://)[^.]+(\.example\.)}{${1}new$2};
              $text_node->setData($uri);    # setData belongs to the text node, not the element
              # ....
          }

          # another thing not so obvious, this is in ::Document
          # (so call it on $xmldom, not on the root element);
          # there are several variations, toFH, toString, etc..
          $xmldom->toFile($newfile);
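
The quickstart covers reading and modifying. For the "creating" half of the original question, here is a minimal sketch of building a document from scratch with XML::LibXML and reading it back. The element names and the example URI are made up for illustration, reusing the asset/story shape from the quickstart.

```perl
use strict;
use warnings;
use XML::LibXML;

# Build <asset><story ID="1"><URI>...</URI></story></asset> from scratch.
my $dom  = XML::LibXML::Document->new('1.0', 'UTF-8');
my $root = $dom->createElement('asset');
$dom->setDocumentElement($root);

my $story = $dom->createElement('story');
$story->setAttribute(ID => 1);
$story->appendTextChild(URI => 'http://old.example.com/a');
$root->appendChild($story);

# toString(1) pretty-prints; toFile($path, 1) would write to disk instead.
my $xml = $dom->toString(1);
print $xml;

# Round trip: parse the string back and read the values out again.
my $reparsed = XML::LibXML->new->parse_string($xml);
my ($node)   = $reparsed->findnodes('/asset/story');
my $id       = $node->getAttribute('ID');
my $uri      = $node->findvalue('URI');
print "story $id -> $uri\n";
```

The creation methods live on XML::LibXML::Document and XML::LibXML::Element, which is the same "search by DOM class" pattern described above.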

        Sure. Part of what I like about XML::LibXML is that my JavaScript chops improved as a side-effect of working with it.

Re: Choosing the best XML module for a toy system
by toolic (Bishop) on Sep 11, 2008 at 19:54 UTC

    I admittedly have little XML experience, but ever since I discovered XML::Twig, I have used it exclusively for XML parsing (reading and modifying). Here are some reasons:

Re: Choosing the best XML module for a toy system
by dHarry (Abbot) on Sep 12, 2008 at 07:02 UTC

    It depends on what you are going to do; you have to plan a little ahead. Choosing one approach and regretting it later is best avoided ;-) I suggest you make a design first and identify what you really need from the XML arsenal. You only mention creating and reading XML data, but you will probably also want to update and delete parts of the XML documents.

    It very much depends on the complexity of your application. What are the required XML features? Do you want to use XPath expressions or apply transformations with XSLT? Will you need a validating parser, i.e. are you going to validate your XML documents against some schema (DTD, XMLSchema, RelaxNG, Schematron etc.)? What is the expected size of the XML documents? And so on.

    I started out using XML::Simple but soon ran into trouble. It’s only suitable for simple XML stuff (as the name suggests of course!). So for a simple prototype it might be a good choice but don’t expect miracles from it.

    As a Windows user my recent experience with XML::LibXML is not too good. In fact I submitted a bug to ActiveState:-( My experience on other platforms is outdated, i.e. 5 years old. At that time some of the standards were only partially implemented which caused us some headaches. I can’t really give a verdict on the current status.

    Lately I have been using XML::Twig a lot. However beware, it has a * huge * number of methods and it takes some time to figure out how to use it effectively. A good starting point is http://xmltwig.com/.

    Maybe in your situation it is worth considering a proper/native XML database. One of those is eXist. My experience with it is mainly from a Java environment, but it seems you can use it from Perl as well.

    Hope this helps.

Re: Choosing the best XML module for a toy system
by Devanchya (Beadle) on Sep 11, 2008 at 23:12 UTC
    I have been using XML::LibXML almost exclusively lately. It seems to be the most commonly installed XML parser I found. I played with Twig, it was nice... but on a lot of web hosts it is not installed. Do not know why.
    Even smart people are dumb in most things...
      It seems to be the most commonly installed XML parser I found.
      FYI, the most common is XML::Parser :)
Re: Choosing the best XML module for a toy system
by graff (Chancellor) on Sep 12, 2008 at 05:03 UTC
    I've had (rare) occasion to try both XML::Twig and XML::Simple, but not enough to judge either with any authority. Still, based on what keeps popping up here at the monastery about Twig, I will return to it for another try the next chance I get; meanwhile, the man page for XML::Simple, all by its ponderous self, is enough to convince me that its name is not very apt, and I won't be trying it again.

    But I've noticed that these modules, like many others, are built as layers on top of XML::Parser -- and wouldn't you know, it's pretty handy, too. In fact, most of the XML apps I've written so far have used just that. It's maybe not the easiest thing to get your head around, but it's certainly easier than XML::Simple. And for those of us who grew up programming relatively "close to the hardware", it's kind of nice to be using the fundamental tools on the fundamental objects. It's also nice, I think, for getting a good understanding of what goes on "under the hood" with those extra-layer modules.

    For that matter, depending on what you actually need to do with your XML, the basic module might lead you to a simpler, more direct, and more efficient solution. It's definitely worth a look.
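
To give a feel for what working directly with XML::Parser looks like, here is a minimal sketch using its Start/Char/End handlers to collect one value per element. The sample document and the variable names are invented for illustration; real handlers would need more state tracking for nested or mixed content.

```perl
use strict;
use warnings;
use XML::Parser;

my $xml = <<'XML';
<asset>
  <story ID="1"><URI>http://a.example.com/</URI></story>
  <story ID="2"><URI>http://b.example.com/</URI></story>
</asset>
XML

my (%uri_for, $current_id, $in_uri);

my $parser = XML::Parser->new(
    Handlers => {
        # Start receives the Expat object, the element name, then attribute pairs.
        Start => sub {
            my ($expat, $element, %attr) = @_;
            if ($element eq 'story') {
                $current_id = $attr{ID};
                $uri_for{$current_id} = '';
            }
            $in_uri = 1 if $element eq 'URI';
        },
        # Char may fire more than once for a single run of text, so append.
        Char => sub {
            my ($expat, $text) = @_;
            $uri_for{$current_id} .= $text if $in_uri;
        },
        End => sub {
            my ($expat, $element) = @_;
            $in_uri = 0 if $element eq 'URI';
        },
    },
);
$parser->parse($xml);

print "$_ => $uri_for{$_}\n" for sort keys %uri_for;
```

Nothing is held in memory except what the handlers choose to keep, which is where the "more direct and more efficient" potential comes from, at the cost of doing the bookkeeping yourself.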

Re: Choosing the best XML module for a toy system
by LesleyB (Friar) on Sep 11, 2008 at 23:05 UTC

    I found XML::Simple easy for converting existing hashed data into XML format, but that is all I used it for.

    I haven't used either of the above mentioned modules yet.

      XML::Simple generates more questions here than any other single module. It is simple if you are doing exactly what the defaults expect you to do, but no-one ever does and they all get into trouble as a result. Better to spend the time to learn how to use XML::Twig and use it for the simple stuff and later on for the complicated stuff.
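
One common way the defaults bite, sketched under the assumption of a made-up asset/story document: with default options, XMLin returns a different data shape depending on how many repeated elements the input happens to contain. Passing ForceArray and KeyAttr explicitly is the usual way to make the shape predictable. The hash passed to XMLout at the end is likewise only illustrative.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin XMLout);

my $one  = '<asset><story ID="1"><URI>http://a.example.com/</URI></story></asset>';
my $many = '<asset><story ID="1"><URI>http://a.example.com/</URI></story>'
         . '<story ID="2"><URI>http://b.example.com/</URI></story></asset>';

# Defaults: the result shape depends on how many <story> elements exist.
my $single = XMLin($one);
my $multi  = XMLin($many);
printf "one story:   %s\n", ref $single->{story};
printf "two stories: %s\n", ref $multi->{story};

# Explicit options: <story> is always an array ref, and no key folding happens.
my $fixed = XMLin($one, ForceArray => ['story'], KeyAttr => []);
printf "forced:      %s\n", ref $fixed->{story};
print  "first URI:   $fixed->{story}[0]{URI}\n";

# The hash-to-XML direction, with nested elements instead of attributes.
my $out = XMLout({ host => 'localhost', port => 8080 },
                 RootName => 'config', NoAttr => 1);
print $out;
```

Code that walks the single-story result as a hash breaks as soon as a second story shows up, which is the kind of surprise behind many of those questions.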


      Perl reduces RSI - it saves typing

        Thank you for that information.

        I only used it to convert some hashed data in pre-existing code into XML format and didn't go any further than that; otherwise I am sure I would have discovered its limitations.

Re: Choosing the best XML module for a toy system
by BaldManTom (Friar) on Sep 12, 2008 at 05:12 UTC
    I'm going to also give my vote to XML::Twig. It's pretty easy to do easy stuff with, but with a lot of room to grow, once you decide you'd like to turn your "toy" into something less toy-ish. I've used it for a number of years for a variety of purposes and have found that I can pretty much do whatever I need with it.
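
For a sense of what "easy stuff" looks like in XML::Twig, here is a minimal sketch that rewrites the text of one child element per record using a twig handler. The document, the element names, and the URL rewrite are invented for illustration.

```perl
use strict;
use warnings;
use XML::Twig;

my $xml = <<'XML';
<asset>
  <story ID="1"><URI>http://old.example.com/a</URI></story>
  <story ID="2"><URI>http://old.example.com/b</URI></story>
</asset>
XML

my @seen;
my $twig = XML::Twig->new(
    pretty_print  => 'indented',
    twig_handlers => {
        # Called once per <story>, after the element has been fully parsed.
        story => sub {
            my ($t, $story) = @_;
            my $uri_elt = $story->first_child('URI') or return;
            (my $uri = $uri_elt->text) =~ s{^(http://)[^.]+(\.example\.)}{${1}new$2};
            $uri_elt->set_text($uri);
            push @seen, $story->att('ID');
        },
    },
);
$twig->parse($xml);

my $output = $twig->sprint;
print $output;
```

The same handler style scales to large files, because a handler can flush or purge each record once it is done with it; that is one of the places the room to grow shows up.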
Re: Choosing the best XML module for a toy system
by runrig (Abbot) on Sep 12, 2008 at 16:13 UTC
    I've come to like XML::Rules. I also still like XML::Twig. But when I just needed to alter some attribute values in some not-especially-large (<200KB) files, I tried both of those modules, but ended up going with XML::LibXML because XPath made finding the desired attributes that I wanted to change incredibly simple (Update: And I didn't know about XML::Twig::XPath...though it's still worthwhile learning XML::LibXML).
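
A minimal sketch of that kind of attribute edit with XML::LibXML, assuming an invented catalog/item document: an XPath expression ending in @name selects the attribute nodes themselves, so each match can be changed in place.

```perl
use strict;
use warnings;
use XML::LibXML;

my $xml = <<'XML';
<catalog>
  <item sku="A1" currency="usd"/>
  <item sku="B2" currency="eur"/>
  <item sku="C3" currency="usd"/>
</catalog>
XML

my $dom = XML::LibXML->new->parse_string($xml);

# Select only the currency attributes whose value is "usd" and rewrite them.
my $changed = 0;
for my $attr ($dom->findnodes('//item[@currency="usd"]/@currency')) {
    $attr->setValue('USD');
    $changed++;
}

my $result = $dom->toString;
print "changed $changed attribute(s)\n";
print $result;
```

The predicate does the filtering, so the loop body stays a single call; for a real file, parse_file and toFile would replace the string round trip.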

      Note that XML::Twig::XPath, which comes with XML::Twig, gives you a proper XPath engine (it's the one from XML::XPath).

Re: Choosing the best XML module for a toy system
by AZed (Monk) on Sep 13, 2008 at 17:10 UTC

    One of the things that I haven't seen mentioned yet in the XML::LibXML vs XML::Twig discussion is that XML::Twig is available via the standard ActiveState Perl distribution, but XML::LibXML isn't (though you can get it by adding the uwinnipeg repository). This means that if you're planning to use LibXML once you leave the prototype stage, but your end product is supposed to be usable on MSWindows boxes later, you'll have to add an extra hoop for end-users to jump through to get it working.

Re: Choosing the best XML module for a toy system
by djp (Hermit) on Sep 16, 2008 at 05:52 UTC
    I investigated several XML modules a couple of years ago, including the ones mentioned above, and ended up choosing XML::Smart. I recommend it.

      I must strongly disagree that XML::Smart is a good idea, unless your needs are trivial — it is one of the buggiest Perl XML handlers I have had the misfortune to attempt to use. The mere act of setting one element equal to another caused some of the methods to stop working after the assignment. I wish I could provide a more concrete example, but I nuked the code I was testing with after I gave up on making it work reliably.

Re: Choosing the best XML module for a toy system
by fernandes (Monk) on Sep 17, 2008 at 01:39 UTC
    Thanks a lot for so many kind suggestions.

    By the way, if you don't mind a slightly naïve meditation here: as I started to use Twig, I thought CPAN users would benefit from a competition between modules that perform similar tasks. Although CPAN (somewhere) recommends submitting only modules whose aims are not already covered by an existing module, I think people tend, for several reasons, to skip the step of sending suggestions to the earlier authors. I don't think this is bad, but an objective weighting method would make choosing the better module easier.
    Kwalitee cares only about, let's say, formal requirements: syntactic as opposed to semantic behavior. The *.t framework only guarantees that the module will not fail on your machine. So we don't have a real rating system. (If I'm totally wrong, please let me know.)
    A simplistic approach to such a semantic competition between modules able to do the “same” job would be to benchmark each module on the same tasks. Several input sizes could be used to build a detailed picture. Some kind of metadata, such as W3C OWL, would need to be attached to each module's documentation to make automatic creation of inputs and scripts possible. Etc. What do you think?
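
As a rough sketch of the same-task benchmark idea, the core Benchmark module can already time several parsers on one input. Everything here is an assumption made for illustration: the synthetic document, its size, the iteration count, and the choice of "parse a string" as the task. It measures speed only, not correctness or convenience, and it skips any module that is not installed.

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

# Synthetic input; the element names and the size are arbitrary choices.
my $xml = '<asset>'
        . join('', map { qq{<story ID="$_"><URI>http://example.com/$_</URI></story>} } 1 .. 500)
        . '</asset>';

# One parse routine per candidate module.
my %candidates = (
    'XML::LibXML' => sub { XML::LibXML->new->parse_string($xml) },
    'XML::Twig'   => sub { XML::Twig->new->parse($xml) },
    'XML::Parser' => sub { XML::Parser->new->parse($xml) },
);

# Keep only the modules that are actually installed on this machine.
my %tests;
for my $module (sort keys %candidates) {
    $tests{$module} = $candidates{$module} if eval "require $module; 1";
}

my $results;
if (%tests) {
    $results = timethese(50, \%tests, 'none');    # 'none' suppresses per-test output
    cmpthese($results);                           # prints a relative-speed table
}
else {
    print "None of the candidate modules is installed.\n";
}
```

Repeating the run with different values in place of 500 would give the several input sizes mentioned above.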

Node Type: perlquestion [id://710715]
Approved by planetscape
Front-paged by Corion