Beefy Boxes and Bandwidth Generously Provided by pair Networks
laziness, impatience, and hubris

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??
For example, "ways to rome" just does it wrong. Of course you shouldn't do it that way!

By any means, send an alternate way. No kidding. If the article can show a safer way to do it with regexps, then it should.

As a matter of fact I use regexps to process XML in very specific cases. First, when I am processing "not-quite-XML", which would make a parser choke, I use regexps to turn it into real XML. Then when I need to do things that XML modules (even XML::Twig!) can't do AND when I have created the XML myself. That is, it uses no entities/comment/weird stuff at all, it has no recursive tags and I know the order of attributes in tags. Then I use regexps (or I upgrade XML::Twig, see the new wrap_children method ;--)

The problem is people who don't know XML besides "it's HTML with my own tags" (and very few people actually know HTML that well), who use regexps for recuring processes, where the likelyhood of the XML changing in the future in ways that will break their code is quite high (<peet_peeve>mind you, the DOM has very similar issues</peet_peeve>). It is not a problem of regexps=bad, it's just a problem of knowing your tools, knowing the problem space (and there are indications that Tim Bray knows the problem space ;--) and knowing the limitations of the tools in the context in which you are using them. When you don't quite know the environment, better to play it safe and to use a parser than regexp.

In reply to Re: Re: Are you looking at XML processing the right way? (merge) by mirod
in thread is XML too hard? by thraxil

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?

    What's my password?
    Create A New User
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others cooling their heels in the Monastery: (4)
    As of 2021-04-19 21:51 GMT
    Find Nodes?
      Voting Booth?

      No recent polls found