PerlMonks
Re: Regex to encode entities in XML
by mirod (Canon) on Jun 11, 2001 at 10:22 UTC ( [id://87409]=note )
Generating valid XML for the CB might actually be harder than it looks, as I am not sure how easy it is to figure out the encoding of the messages. The problem you have might be a bug in XML::Parser: if I use the regexp and then HTML::Entities, I get the proper result with XML::Parser 2.27 but the wrong one with XML::Parser 2.30 (it looks like characters lose their UTF-8'edness with the latter). The solution is either to use Text::Iconv or the Unicode modules, as described in my first post about encodings, or to go module lifting once again and grab code from XML::DOM:
This will encode all non-ASCII characters as &#nnn; where nnn is the character's Unicode code point in decimal. This seems to display properly, at least in Opera on Linux. Let me know if this solves your problem.
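For illustration, here is a minimal sketch of that kind of encoding. It is not the XML::DOM code itself: the name encode_non_ascii is hypothetical, and it assumes a Perl with character-string semantics where the input is already a decoded string rather than raw UTF-8 bytes.

```perl
use strict;
use warnings;

# Hypothetical helper (not from XML::DOM): replace every character above
# 0x7F with a decimal numeric character reference, &#nnn;.
# Assumption: $text is a decoded character string, not raw UTF-8 bytes.
sub encode_non_ascii {
    my ($text) = @_;
    # ord() returns the code point of the matched character
    $text =~ s/([^\x00-\x7F])/'&#' . ord($1) . ';'/ge;
    return $text;
}

my $encoded = encode_non_ascii("caf\x{e9} \x{263A}");
print "$encoded\n";   # prints: caf&#233; &#9786;
```

If the message arrives as raw UTF-8 bytes, it would need decoding first (for example with Encode::decode('UTF-8', $bytes)), otherwise each byte of a multi-byte sequence gets its own reference. The sketch also leaves &, < and > alone, so those still have to be escaped separately.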