PerlMonks  

Re^5: How to sanely handle unicode in perl?

by Your Mother (Archbishop)
on Mar 20, 2015 at 19:10 UTC


in reply to Re^4: How to sanely handle unicode in perl?
in thread How to sanely handle unicode in perl?

\xc3\xb6 is not the right byte sequence for an ö from a Latin-1 terminal; it is the UTF-8 encoding. That means it can only come from a UTF-8 encoded source (and still mean ö). So what you are asking to do sanely strikes me as… strange. If it were coming from a Latin-1 source it would be the single byte \xf6. To do encoding properly you have to know what you are receiving and decode it as that, then know what your output layer expects and encode to that. It’s not easy, but it’s not magical either. Without the right steps at the right layers it’s literally guesswork, and impossible to do robustly.
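A minimal sketch of that decode-then-encode discipline, using the core Encode module. The variable names and the hex_bytes helper are made up for illustration and are not taken from the code under discussion; the last step shows what the "guesswork" case looks like when UTF-8 bytes are treated as Latin-1.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: show a byte string as space-separated hex.
sub hex_bytes { join ' ', map { sprintf '%02x', ord($_) } split //, $_[0] }

# The same character, ö (U+00F6), as two different sources would send it.
my $from_utf8   = "\xc3\xb6";   # two bytes: the UTF-8 encoding
my $from_latin1 = "\xf6";       # one byte: the Latin-1 (ISO-8859-1) encoding

# Step 1: decode with the encoding the source really uses.
my $char_a = decode('UTF-8',      $from_utf8);
my $char_b = decode('ISO-8859-1', $from_latin1);

printf "same character: %s\n", ($char_a eq $char_b ? 'yes' : 'no');
printf "U+%04X, length %d\n", ord($char_a), length($char_a);

# Step 2: encode for whatever the output layer expects.
my $for_utf8_term   = encode('UTF-8',      $char_a);
my $for_latin1_term = encode('ISO-8859-1', $char_a);

printf "to a UTF-8 terminal:   %s\n", hex_bytes($for_utf8_term);
printf "to a Latin-1 terminal: %s\n", hex_bytes($for_latin1_term);

# Guessing wrong: UTF-8 bytes decoded as Latin-1 become two characters.
my $wrong = decode('ISO-8859-1', $from_utf8);
printf "wrong guess: %s\n",
    join ' ', map { sprintf 'U+%04X', ord($_) } split //, $wrong;
```

The last line prints U+00C3 U+00B6, the familiar "Ã¶" mojibake, which is what a Latin-1 terminal shows when it is handed the UTF-8 bytes unchanged.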

Replies are listed 'Best First'.
Re^6: How to sanely handle unicode in perl?
by Sec (Monk) on Mar 23, 2015 at 10:26 UTC
    Please check the source. I explicitly state that the pipe that produces \xc3\xb6 is UTF-8, so what you wrote does not apply to my code.

    In fact, choroba found out that it works as intended if I prepend ":raw" to the encoding. (Which is unintuitive to me, but kind of makes sense in retrospect.)
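    For readers who have not seen that layer spelling, here is a rough sketch of what ":raw" in front of the encoding looks like. It is not the code from this thread: it reads a scratch file instead of a pipe, and the file and variable names are made up. The idea is that ":raw" asks for a binary-clean handle and ":encoding(UTF-8)" is then pushed on top as the only decoding step.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write the two UTF-8 bytes for ö to a scratch file as plain bytes.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "\xc3\xb6\n";
close $out or die "close: $!";

# ':raw' first, then exactly one decoding layer on top of it.
open my $in, '<:raw:encoding(UTF-8)', $path or die "open $path: $!";
my $line = <$in>;
close $in;
chomp $line;

# The two bytes should arrive as one decoded character.
printf "length %d, code point U+%04X\n", length($line), ord($line);
```

    This prints "length 1, code point U+00F6". Why the explicit ":raw" made the difference in the original pipe code depends on which layers were already in effect there, which this sketch does not try to reproduce.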

      Maybe you misunderstand my point. If you run that code in a Latin-1 terminal you are sending UTF-8 and expecting it to act properly. It makes no sense and can’t work without goofy and unrealistic hoops.
