Re^2: Encoding horridness

Re^3: Encoding horridness by Corion (Patriarch) on Jul 12, 2017 at 14:20 UTC
No, because high-bit characters/octets in Latin-1 encode differently as octets in UTF-8, and Perl doesn't know what to do with high-bit characters when writing them.
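To make the octet difference concrete, here is a minimal sketch using the core `Encode` module. The sample octet `0xE1` and the helper name `hex_octets` are my own choices for illustration, not something from the original post.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# One Latin-1 octet: 0xE1 is LATIN SMALL LETTER A WITH ACUTE.
my $latin1_octets = "\xE1";

# Decode the octet into a character, then encode that character as UTF-8.
my $char        = decode('ISO-8859-1', $latin1_octets);
my $utf8_octets = encode('UTF-8', $char);

# Hypothetical helper: render a string of octets as space-separated hex.
sub hex_octets {
    my ($octets) = @_;
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $octets;
}

print 'Latin-1: ', hex_octets($latin1_octets), "\n";   # E1
print 'UTF-8:   ', hex_octets($utf8_octets),   "\n";   # C3 A1
```

The same character is one octet in Latin-1 and two octets in UTF-8, so the data has to be decoded and re-encoded rather than passed through unchanged.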
Re^4: Encoding horridness by Anonymous Monk on Jul 12, 2017 at 14:26 UTC
What I'm wondering, though, is if there's ever a situation where `encode('utf8', decode('Latin-1', $_))` produces different output from `encode('utf8', $_)`.
Re^5: Encoding horridness by choroba (Cardinal) on Jul 12, 2017 at 16:51 UTC
Yes, for example:
`$_ = decode('utf-8', "\N{LATIN SMALL LETTER A WITH ACUTE}");`
`say encode('utf8', $_); # Replacement character EFBFBD.`
`say encode('utf8', decode('Latin-1', $_)); # Dies.`
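A side-by-side sketch of the two expressions, in case it helps. The helper name `compare` and the two sample inputs are assumptions made for illustration. `eval` is used only so that the failing call can be reported instead of ending the script.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: run both expressions from the thread on one input and
# return the resulting octets as hex, or 'died' if the call threw an exception.
sub compare {
    my ($input) = @_;
    my $direct = eval { encode('utf8', $input) };
    my $via    = eval { encode('utf8', decode('Latin-1', $input)) };
    return map { defined($_) ? uc(unpack('H*', $_)) : 'died' } $direct, $via;
}

# Every character fits in one octet: both paths give the same result.
my @octet_case = compare("\xE1");

# A character above 0xFF (here U+FFFD): decode() refuses the input.
my @wide_case = compare("\x{FFFD}");

print "octets: @octet_case\n";   # octets: C3A1 C3A1
print "wide:   @wide_case\n";    # wide:   EFBFBD died
```

So the two expressions agree as long as `$_` holds only code points up to 0xFF, and diverge once it holds anything wider, as in the example above.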
Re^6: Encoding horridness by Anonymous Monk on Jul 12, 2017 at 17:55 UTC
Re^3: Encoding horridness by hippo (Bishop) on Jul 12, 2017 at 14:16 UTC
The OP wants to move from Latin-1 to UTF-8. Latin-1 is not a subset of UTF-8.
Re^4: Encoding horridness by Anonymous Monk on Jul 12, 2017 at 14:20 UTC
Yes, and `encode('utf8', decode('Latin-1', $_))` isn't a no-op.