PerlMonks
Re^3: How to Encode/Decode double encoded string.
by haj (Curate) on Sep 22, 2020 at 08:59 UTC (#11122067)
Thanks for the clarifications! It is relevant that the data comes from a Postgres database: a lot of encoding happens behind the scenes once a database is involved. Postgres has a configurable server encoding and a configurable client encoding, and either or both might have changed between the legacy and the current application.

The string � is the UTF-8-encoded form of the Unicode replacement character, U+FFFD (bytes EF BF BD, which show up as ï¿½ when viewed as Latin-1). You get it when software tries to decode as UTF-8 a string containing bytes that are not valid UTF-8, and then encodes the result as UTF-8 again. I guess that the decoding step gets fed plain ISO-Latin àáâä.

There is a chance that the bogus decoding happens in Perl's Postgres database driver. You can check that by setting the DBH option pg_enable_utf8 to zero when connecting. Your application will then be able to examine the "raw" contents and decode them accordingly.

A convenient way to examine strings is printf with the "v" format flag, which prints the ordinal of each character in hex, joined by dots:

    printf "%vx", $string

From there you can decide how to proceed. Probably you need to rebuild the data with a consistent encoding.
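To make the suspected mechanism concrete, here is a minimal sketch using the core Encode module. It assumes that the stored bytes really are ISO-8859-1 for àáâä, which is only my guess; the variable names are made up for illustration. It decodes those bytes as UTF-8, encodes the result again, and dumps each stage with "%vx".

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption: these are the raw ISO-8859-1 bytes for "àáâä".
my $latin1_bytes = "\xE0\xE1\xE2\xE4";

# Bogus step: decode as UTF-8. Without a CHECK argument, Encode
# substitutes U+FFFD for input that is not valid UTF-8.
my $chars = decode('UTF-8', $latin1_bytes);

# Encoding that result as UTF-8 turns each U+FFFD into EF BF BD.
my $mangled = encode('UTF-8', $chars);

printf "raw:     %vx\n", $latin1_bytes;   # e0.e1.e2.e4
printf "decoded: %vx\n", $chars;          # contains fffd
printf "mangled: %vx\n", $mangled;        # contains ef.bf.bd

# For comparison: decoding with the encoding the bytes actually have.
my $correct = decode('ISO-8859-1', $latin1_bytes);
printf "fixed:   %vx\n", encode('UTF-8', $correct);   # c3.a0.c3.a1.c3.a2.c3.a4
```

If the dump of your raw data looks like the "raw" line, the original bytes are still intact and only need to be decoded with the right encoding. If it already looks like the "mangled" line, the original characters are lost and have to be restored from another source. With DBD::Pg, the raw bytes would come from a handle opened with pg_enable_utf8 => 0 in the attribute hash passed to DBI->connect.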