by spm (Acolyte)
on Sep 04, 2001 at 16:23 UTC ( [id://110017] : CUFP )

I needed a little script to convert my HTML to XHTML-compliant markup.
I wrote this one to do it for me, but it doesn't do everything I want yet. At the moment, it just speeds up the process a bit. Could someone give me pointers on how to implement the last few features?
#!/usr/bin/perl -w
use strict;

while(<>) {
    s/<(\S*)(.*)>/<\L$1\E$2>/g;
    s/<\/(\S*)>/<\/\L$1\E>/g;
    # s/(\w+)=([^\s\"]*)/$1=\"$2\"/g;
    # XHTML enforces quotes around parameter values. (needs fixing)
    # XHTML enforces closing slashes, i.e. <br/> or <td></td>
    print;
}

Replies are listed 'Best First'.
by mirod (Canon) on Sep 04, 2001 at 16:35 UTC

    Once again... ;--( Please do not use regexps to process HTML or XML. There is more to it than you think.

    So use modules like HTML::Parser or tools like tidy. Believe me, it will save you a _lot_ of trouble. I think you have better things to do with your time than to write a proper parser! (If you don't, then please read the specs... and reconsider your decision ;--)
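
For illustration only, here is a minimal sketch of what the parser-based approach might look like for the three features in the original script (lowercase tags, quoted attribute values, closed empty elements). It is not code from this thread. It assumes a reasonably recent HTML::Parser (version 3 API with `empty_element_tags`); the `to_xhtmlish` helper and the `%empty` list are hypothetical names chosen for the example, and the result is not guaranteed to be valid XHTML (nothing here fixes bad nesting or adds a DOCTYPE).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser ();
use HTML::Entities qw(encode_entities);

# Elements that HTML declares empty; XHTML wants them written as <br />.
# (Hypothetical helper table, not taken from the original post.)
my %empty = map { $_ => 1 } qw(area base br col hr img input link meta param);

# Hypothetical helper: returns the input with lowercased tag and attribute
# names, quoted attribute values, and self-closed empty elements.
sub to_xhtmlish {
    my ($html) = @_;
    my $out = '';

    my $p = HTML::Parser->new(
        api_version => 3,
        start_h => [
            sub {
                my ($tag, $attr, $attrseq) = @_;   # names arrive lowercased
                my $s = "<$tag";
                for my $name (@$attrseq) {
                    next if $name eq '/';          # stray slash from "<br />"
                    # Values arrive entity-decoded, so re-encode them.
                    my $value = encode_entities($attr->{$name}, '<>&"');
                    $s .= qq{ $name="$value"};
                }
                $s .= $empty{$tag} ? ' />' : '>';
                $out .= $s;
            },
            'tagname, attr, attrseq',
        ],
        end_h => [
            # Empty elements were already closed in the start handler.
            sub { my ($tag) = @_; $out .= "</$tag>" unless $empty{$tag} },
            'tagname',
        ],
        # Text, comments, and declarations pass through unchanged.
        default_h => [ sub { $out .= $_[0] }, 'text' ],
    );
    $p->empty_element_tags(1);   # treat an existing <br/> as start + end

    $p->parse($html);
    $p->eof;
    return $out;
}

print to_xhtmlish(qq{<P ALIGN=center>Hello<BR><IMG SRC=a.gif ALT="x"></P>\n});
```

Because the parser hands over tag names and attributes already separated, cases that trip up the regexes (several tags on one line, `>` inside a quoted value, unquoted values) need no special handling. If a command-line tool is acceptable, tidy's `-asxhtml` option covers the same ground and also repairs nesting.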

by Hofmator (Curate) on Sep 04, 2001 at 16:30 UTC

    You might consider a proper way of doing it... regexes just get more complicated and still don't get everything right in this case. mirod had a very recent post about this topic: Re: HTML and XML.

    -- Hofmator

      Ohh, cool. I never thought of checking whether tidy could do it. Thanks.