Re^3: Why oh why is working with XML so bloomin' difficult in Perl?

by wilsond (Scribe)
on Jan 21, 2009 at 08:50 UTC


in reply to Re^2: Why oh why is working with XML so bloomin' difficult in Perl?
in thread Modified title: The structures created by many of the XML parsers in Perl appear unnecessarily deep in levels...

If XML is difficult for you to parse, maybe your XML isn't structured properly. I don't know your application or where your XML comes from, so that makes it difficult to judge. XML is just a way to structure data. It can be as flexible or strict as you need it to be (within your app) and shouldn't be more flexible than it needs to be. Parsing XML using DOM methods will be much more difficult than just using something like XML::Simple or the like, but that's the way many languages do it. There are a lot of modules on CPAN that try to make it easy for you. I feel that they do a very good job of that.

Again, if your parsing seems difficult, maybe the XML that you're parsing is just a bit too flexible for your needs. And I've yet to hear what exactly is difficult about it in Perl. I hear you repeating the same thing about it being difficult, but haven't heard any examples of why you believe it to be difficult.

On a side note, I'd recommend modifying your title to something a little less irritated, otherwise you'll get even more of these defensive replies.


While I ask a lot of Win32 questions, I hate Windows with a passion. That's the problem with writing a cross-platform program. I'm a Linux user myself. I wish more people were.
If you want to do evil, science provides the most powerful weapons to do evil; but equally, if you want to do good, science puts into your hands the most powerful tools to do so.
- Richard Dawkins
Re^4: Why oh why is working with XML so bloomin' difficult in Perl?
by jfroebe (Parson) on Jan 21, 2009 at 14:41 UTC

    Good points. As Jeffa pointed out, it isn't just Perl that can end up with complex structures (hashes of hashes, etc.) created by the XML parsers. The frustrating part is that in several of the XML parser modules, the structures can become unnecessarily complex. Given the unknown data coming in, most of the parsers create the memory structures in a generic way. Not a bad thing, it just tends to be cumbersome.

    One problem that can arise is that the information you are looking for may not be at a consistent location within the structure, which would rule out any XML Path tools. If the data is properly tagged, then you can use rule-based or similar tools. These were mentioned in earlier replies, so I'm not going to rehash them.
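To make the "inconsistent location" point concrete, here is a minimal sketch. It is written in Python with only the standard library, so it runs without any CPAN modules; the same idea applies to the Perl parsers discussed in this thread. The two sample documents and the function names `find_prices` and `find_price_fixed` are made up for illustration. It contrasts a lookup at one fixed location with a walk over the whole tree, which is one possible way to cope when the element can turn up at different depths.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents carrying the same <price> at different depths.
FLAT = "<order><price>9.50</price></order>"
NESTED = "<order><items><item><price>9.50</price></item></items></order>"


def find_price_fixed(xml_text):
    """Look only at the fixed location order/price; None if it is not there."""
    root = ET.fromstring(xml_text)
    el = root.find("price")  # matches direct children of the root only
    return el.text if el is not None else None


def find_prices(xml_text):
    """Return the text of every <price> element, wherever it sits."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so no fixed path is assumed.
    return [el.text for el in root.iter("price")]


print(find_price_fixed(FLAT))    # 9.50
print(find_price_fixed(NESTED))  # None
print(find_prices(NESTED))       # ['9.50']
```

The whole-tree walk only helps when the tag name alone identifies what you want; if the same tag means different things in different places, you are back to needing knowledge of the structure.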

    How is this the fault of Perl? It's not. Perl, itself, knows nothing about XML. The wide variety of XML parsers on CPAN does give us a clue that the problem is really with how flexible XML can be and how some XML data sources can make parsing it very difficult. The XML parsers do their best at parsing it, but there is no single best XML parser module for the majority of XML data sources. In some simple cases, such as Flickr's REST web service, XML::Simple works just fine and produces data structures that are well suited to the simple XML data. In others..

    Is this a bit more clear?

    The title is now "Modified title: The structures created by many of the XML parsers in Perl appear unnecessarily deep in levels..." which I hope is a bit less inflammatory. Otherwise, we'll have to get a bucket full of valium for some perlmonks ;-) (Just teasing)

    Jason L. Froebe

      The reason that the "memory structures (are created) in a generic way" is that the parser is handling generic data. You will never find a bit of code on this planet that will take your totally unknown XML and turn it into a non-generic data structure. The only way it could be possible is if you provide a schema of some sort or write your own parser (or subclass one).
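A rough sketch of why a schema-free parser has to be generic, again in standard-library Python rather than Perl so it runs as-is. Everything here is hypothetical: the sample document, the function names `to_generic` and `to_hinted`, and the `repeatable` set, which stands in for "a schema of some sort". Attributes and mixed content are ignored to keep it short. The hint plays a role similar in spirit to XML::Simple's `ForceArray` option.

```python
import xml.etree.ElementTree as ET


def to_generic(elem):
    """Schema-free conversion: every child tag maps to a list, because
    without a schema the parser cannot know whether a tag may repeat."""
    children = list(elem)
    if not children:
        return elem.text
    out = {}
    for child in children:
        out.setdefault(child.tag, []).append(to_generic(child))
    return out


def to_hinted(elem, repeatable):
    """Same walk, but `repeatable` names the tags that may occur more
    than once; all other tags are stored as plain values."""
    children = list(elem)
    if not children:
        return elem.text
    out = {}
    for child in children:
        value = to_hinted(child, repeatable)
        if child.tag in repeatable:
            out.setdefault(child.tag, []).append(value)
        else:
            out[child.tag] = value
    return out


DOC = "<user><name>alice</name><tag>perl</tag><tag>xml</tag></user>"
root = ET.fromstring(DOC)
generic = to_generic(root)
hinted = to_hinted(root, repeatable={"tag"})
print(generic)  # {'name': ['alice'], 'tag': ['perl', 'xml']}
print(hinted)   # {'name': 'alice', 'tag': ['perl', 'xml']}
```

Without the hint, `name` has to be wrapped in a list just in case a second one shows up; with it, the structure loses a level. That extra level is the "unnecessarily deep" feel being described, and it only goes away when the code is told something about the data.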

      What I gather from your comments is that your data being parsed isn't something you generated yourself, but something some other app out there generated. Can you give an example of the data that you're parsing? It might help to give you a pointer in the right direction.



        Agreed. I should have put in an "I'm venting here" message to take the bite off.

        Most of the XML data I'm dealing with is internal XML at work (sometimes I receive a DTD, but sometimes I receive nothing more than raw output). I'm using different parsers depending on what I'm receiving and what I'm doing with it.

        XML::Simple is going to be fine for simple web services like Flickr REST though.

        Jason L. Froebe

      "Given the unknown data coming in, most of the parsers create the memory structures in a generic way. Not a bad thing, it just tends to be cumbersome."

      That's the job, pal. If you don't like it, switch careers. Or even better, come up with a replacement for XML that isn't so "hard to use." Sounds like you are the one who needs that Valium. (Just teasing)
