Re^5: Why does Encode::Repair only correctly fix one of these two tandem characters?

by ikegami (Patriarch)
on Aug 11, 2014 at 01:49 UTC


in reply to Re^4: Why does Encode::Repair only correctly fix one of these two tandem characters?
in thread Why does Encode::Repair only correctly fix one of these two tandem characters?

The most common garbage produced by Perl code is a mix of UTF-8 and latin-1. It happens when you forget to specify the output encoding.

print "\N{LATIN CAPITAL LETTER E WITH ACUTE}";
print "\N{BLACK SPADE SUIT}";

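Every character in the first string has a code point below 256, so Perl writes each one out as a single byte and has no way to know you did something wrong. The second string contains a character above 255, which can't be written as a single byte, so Perl guesses you meant to encode it using UTF-8 (and warns "Wide character in print" if warnings are enabled). You end up with a mix of raw code points (effectively latin-1) and UTF-8.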
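To make the byte-level difference concrete, here is a minimal sketch that uses the core Encode module to reproduce what the two prints emit and to compare that with a consistent encoding. The variable names and the hex() helper are made up for illustration, and the sketch assumes Perl 5.16 or later so that \N{...} names load automatically.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $e_acute = "\N{LATIN CAPITAL LETTER E WITH ACUTE}";  # U+00C9
my $spade   = "\N{BLACK SPADE SUIT}";                   # U+2660

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes { join ' ', map { sprintf '%02X', ord } split //, $_[0] }

# What the two prints produce on a handle with no encoding layer:
# the first string goes out as latin-1, the second as UTF-8.
my $mixed = encode('ISO-8859-1', $e_acute) . encode('UTF-8', $spade);

# What you get when everything is encoded the same way.
my $consistent = encode('UTF-8', $e_acute . $spade);

print hex_bytes($mixed), "\n";       # C9 E2 99 A0
print hex_bytes($consistent), "\n";  # C3 89 E2 99 A0
```

In a real script you would normally avoid the problem at the source by declaring the output encoding once, for example with binmode(STDOUT, ':encoding(UTF-8)') or use open ':std', ':encoding(UTF-8)', rather than calling encode by hand.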

Output that has already been mangled this way can be fixed using Encoding::FixLatin.
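As a rough sketch of what that looks like, assuming Encoding::FixLatin is installed from CPAN (it is not a core module): its fix_latin function takes a byte string that may mix latin-1 and UTF-8 and returns a decoded character string. The $garbage value below is just the byte sequence the two prints above would produce, written out by hand.

```perl
use strict;
use warnings;
use Encoding::FixLatin qw(fix_latin);

# Mixed input: a latin-1 byte (C9) followed by a UTF-8 sequence (E2 99 A0).
my $garbage = "\xC9\xE2\x99\xA0";

# fix_latin keeps well-formed UTF-8 sequences and treats stray
# high bytes as latin-1, returning a Perl character string.
my $fixed = fix_latin($garbage);

# Declare the output encoding this time so the result prints cleanly.
binmode(STDOUT, ':encoding(UTF-8)');
print $fixed, "\n";
```

Note that this is a heuristic repair: it works here because a lone C9 byte followed by E2 is not valid UTF-8, so the module can tell the two encodings apart.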
