You can also have a look at the Module Reviews for XML modules and Ways to Rome, an article that solves the same problem using various XML modules.
The problem is that there is a lot of overlap between the various modules. Some cannot be used in certain circumstances, but for any particular problem there are at least two modules that will work. Basically it boils down to how much you like the interface of each module.
A quick overview would be:
- XML::Parser: the basic one, most of the other modules are built on top of it; fast, low-level (can be a pain to use),
- XML::Simple: quite simple, robust, widely-used, tree-based (hence can be slow on big files and cannot deal with huge ones), does not work for document-oriented XML,
- XML::DOM: ugly, tree-oriented, widely used, not actively maintained at the moment, follows a W3C standard, can be a pain to install (BTW, if you are interested in the DOM I have started writing a little helper module for it, named... XML::DOM::Twig),
- XML::PYX: line-oriented, fast, not convenient for complex transformations,
- XML::XPath: powerful, getting faster and faster, very well supported (by Matt Sergeant, the most prolific XML developer around),
- XML::Twig: Perlish, DWIMy, can deal with huge documents, you know what I think of it ;--)
There are others too: XML::RAX for record-oriented XML, XML::Dt, XML::SimpleObjects...
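To make the tree-based versus chunk-based difference a bit more concrete, here is a minimal sketch that reads the same tiny document with XML::Simple and with XML::Twig. The sample data and element names (catalog, book, title) are made up for the illustration, and the options shown are just one reasonable choice, not the only way to use either module.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);
use XML::Twig;

# Hypothetical sample data, invented for this illustration.
my $xml = <<'XML';
<catalog>
  <book id="b1"><title>Programming Perl</title></book>
  <book id="b2"><title>Learning Perl</title></book>
</catalog>
XML

# Tree style with XML::Simple: the whole document is loaded into one
# Perl data structure. KeyAttr => [] turns off folding on the id
# attribute, ForceArray keeps <book> an array even with a single entry.
my $tree = XMLin($xml, KeyAttr => [], ForceArray => ['book']);
print "Simple: $_->{id} $_->{title}\n" for @{ $tree->{book} };

# Chunk style with XML::Twig: the handler is called each time a <book>
# element has been completely parsed, and purge frees what has been
# parsed so far, so memory use does not grow with the document.
my $twig = XML::Twig->new(
    twig_handlers => {
        book => sub {
            my ($t, $book) = @_;
            printf "Twig: %s %s\n",
                $book->att('id'), $book->first_child_text('title');
            $t->purge;
        },
    },
);
$twig->parse($xml);
```

On a document this small both approaches behave the same; the difference only matters once the file no longer fits comfortably in memory.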
In any case I think we're heading towards big changes in the XML module landscape. XML::Parser is not a SAX-based parser (it actually predates SAX), which is a pain, and it is also a pain to install (it is based on expat, an external library). I think we will see new modules based either on a pure Perl SAX parser (there is one in SOAP::Lite) or on libXML, the Gnome XML library, plus existing modules being ported to interface with those two kinds of SAX parsers.
So I guess it will always be very difficult to give a "decision-tree" to choose a module, and in any case it is too early...