Re: Cleaning up non 7-bit Ascii Chars for XML-processing
by ikegami (Patriarch) on Nov 11, 2010 at 19:06 UTC ( [id://870917]=note )
You're receiving the data as cp1252 ("’" is byte 0x92 in cp1252), but you're outputting it as is in a document you claim is UTF-8. Always decode your inputs. Always encode your outputs. You are apparently doing neither. Note that the quote is character U+2019, so the proper escape is &#x2019; or &#8217;, not &#92;. If you pass properly decoded text to the following function, it will produce 7-bit clean UTF-8 (aka US-ASCII) XML text and XML attribute values.
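A minimal sketch of such a function, assuming the input has already been decoded into Perl characters. The name `text_to_ascii_xml` is hypothetical and this is an illustration of the approach, not necessarily the code originally posted. It escapes the XML metacharacters and then replaces every character above 0x7F with a numeric character reference.

```perl
use strict;
use warnings;
use Encode qw( decode );

# Hypothetical helper: escape decoded text so it can be used as XML text
# or as a quoted XML attribute value, using only US-ASCII characters.
# It does not filter control characters that XML 1.0 forbids outright.
sub text_to_ascii_xml {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;    # must run first so later escapes are not re-escaped
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    $s =~ s/'/&apos;/g;
    # Every character above 0x7F becomes a hexadecimal character reference.
    $s =~ s/([^\x00-\x7F])/sprintf('&#x%X;', ord($1))/eg;
    return $s;
}

# Decode the input first: byte 0x92 in cp1252 is character U+2019.
my $text = decode('cp1252', "It\x92s <fine> & dandy");
print text_to_ascii_xml($text), "\n";
# prints: It&#x2019;s &lt;fine&gt; &amp; dandy
```

Because the result contains only ASCII characters, it is valid whether the document is declared as UTF-8, US-ASCII, or cp1252; you would still encode on output as a matter of habit.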