PerlMonks  

Re^2: UTF-8 issues with Perl in general and with Spreadsheet::WriteExcel

by elef (Friar)
on Jul 16, 2010 at 15:20 UTC ( [id://849993] )


in reply to Re: UTF-8 issues with Perl in general and with Spreadsheet::WriteExcel
in thread UTF-8 issues with Perl in general and with Spreadsheet::WriteExcel

Well, I only use STDIN to get user input, which is always just one line, which I store in a variable and then use for whatever purpose later... So I don't see how a while loop would be useful. Anyway, the more I learn about this stuff, the less I understand it. I tried just adding binmode STDIN, ':encoding(UTF-8)'; to the script above, and now I get a different problem: error messages of this sort: utf8 "\xFB" does not map to Unicode at [script] line 8. The output file contains the character codes instead of the characters: \xFB\x{32CB8E1}\x82\xA0

Maybe I should be using encode() and decode() but I just don't know how they relate to "use utf8", and "binmode :encoding(UTF-8)". This is a huge mess and I feel like I'm having to fight a hundred dragons just to get some damned characters to display correctly. Why everything isn't in UTF-8 in the first place is beyond me, it's 2010 for God's sake!

Anyway, I ran the test from your link ( http://perlgeek.de/en/article/encodings-and-unicode ) as well. The results are not good: all 4 lines are mojibake. The dragons are clearly winning.

Re^3: UTF-8 issues with Perl in general and with Spreadsheet::WriteExcel
by moritz (Cardinal) on Jul 16, 2010 at 16:52 UTC
    So I don't see how a while loop would be useful

    It was an example, with the purpose of demonstrating that you need to set the IO layer only once, and not before every reading operation. Of course you are welcome to deviate from the example.
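A minimal sketch of that point: the layer is set once on the handle, and every later read is decoded without further calls. An in-memory handle with made-up sample bytes stands in for STDIN here so the snippet runs unattended; for real user input the equivalent call would be binmode STDIN, ':encoding(UTF-8)'; near the top of the script.

```perl
use strict;
use warnings;

# Raw UTF-8 bytes standing in for two lines typed by a user
# (hypothetical sample data: "Zoë" and "Jürgen").
my $bytes = "Zo\xC3\xAB\nJ\xC3\xBCrgen\n";

# The in-memory handle plays the role of STDIN in this sketch.
open my $in, '<', \$bytes or die "cannot open in-memory handle: $!";

# Set the IO layer ONCE, right after the handle is available.
binmode $in, ':encoding(UTF-8)';

# Each later read is decoded automatically; no binmode before every read.
my $first  = <$in>;
my $second = <$in>;
chomp($first, $second);
close $in;

# length() now counts characters rather than bytes.
my @lengths = (length $first, length $second);
print "characters per line: @lengths\n";
```

A single one-line read works the same way: one binmode call, then one readline.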

    utf8 "\xFB" does not map to Unicode at script line 8.

    That means that your input is not in UTF-8. Find out which character encoding it is, and use the name in the :encoding($encoding_name) IO layer.
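One way to narrow down the encoding is to decode the same bytes under a few candidate names and look at which result makes sense. The sketch below uses the core Encode module; the sample bytes and the list of candidates are assumptions chosen for illustration, not a diagnosis of the input in this thread.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical sample input: the byte 0xFB between two ASCII letters.
my $bytes = "s\xFBr";

# Strict UTF-8 decoding fails on this byte. decode() may modify its
# argument when a CHECK value is given, so work on a copy.
my $copy    = $bytes;
my $decoded = eval { decode('UTF-8', $copy, Encode::FB_CROAK) };
my $utf8_ok = defined $decoded ? 1 : 0;
print "valid UTF-8: ", ($utf8_ok ? "yes" : "no"), "\n";

# Try some single-byte candidates and report the code point that the
# byte 0xFB turns into under each of them.
my %code_point;
for my $name (qw(iso-8859-1 cp1252 cp850)) {
    my $text = decode($name, $bytes);
    $code_point{$name} = sprintf 'U+%04X', ord substr($text, 1, 1);
    print "$name: $code_point{$name}\n";
}

# Once the right name is known, put it into the IO layer.
my $encoding_name = 'iso-8859-1';    # assumption for this sketch only
open my $in, "<:encoding($encoding_name)", \$bytes
    or die "cannot open in-memory handle: $!";
my $line = <$in>;
close $in;
```

For STDIN the last step becomes binmode STDIN, ":encoding($encoding_name)"; with whatever name turns out to match the actual input.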

    Maybe I should be using encode() and decode() but I just don't know how they relate to "use utf8", and "binmode :encoding(UTF-8)".

    use utf8; has the same effect as adding a decode_utf8 before every string literal in your program. The :encoding(UTF-8) IO layer has the same effect as wrapping input operations in decode calls and output operations in encode calls.
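Both equivalences can be checked side by side. This is a rough sketch, assuming the script file itself is saved as UTF-8; the variable names and sample strings are made up, and in-memory handles replace real files so that it runs anywhere.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8);

# Assumption: this source file is saved as UTF-8.
my ($with_pragma, $decoded_by_hand);
{
    use utf8;                              # literal is decoded at compile time
    $with_pragma = "é";
}
{
    no utf8;                               # literal stays as two raw bytes
    $decoded_by_hand = decode_utf8("é");
}
print "literals match\n" if $with_pragma eq $decoded_by_hand;

# Input: the layer versus an explicit decode on raw bytes.
my $bytes = "caf\xC3\xA9\n";
open my $in, '<:encoding(UTF-8)', \$bytes or die "open failed: $!";
my $via_layer = <$in>;
close $in;

open my $raw, '<:raw', \$bytes or die "open failed: $!";
my $via_decode = decode_utf8(scalar <$raw>);
close $raw;
print "input matches\n" if $via_layer eq $via_decode;

# Output: the layer versus an explicit encode before printing.
my ($out_layer, $out_manual) = ('', '');
open my $o1, '>:encoding(UTF-8)', \$out_layer or die "open failed: $!";
print {$o1} $via_layer;
close $o1;

open my $o2, '>:raw', \$out_manual or die "open failed: $!";
print {$o2} encode_utf8($via_decode);
close $o2;
print "output matches\n" if $out_layer eq $out_manual;
```

In practice the layer is the less error-prone choice, since a forgotten decode or encode call is easy to miss while a layer applies to every read and write on the handle.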

    The results are not good: all 4 lines are mojibake.

    Then your next step should be either to find out which character encoding your terminal works with, or set it up to use UTF-8.
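On a POSIX-like system, one place to look is the locale, which usually (though not necessarily) matches what the terminal expects. The sketch below asks for the locale's character set through the core I18N::Langinfo module and uses it for STDIN and STDOUT; the fallback to UTF-8 is a choice made for this sketch only, and the locale can differ from what the terminal really does.

```perl
use strict;
use warnings;
use I18N::Langinfo qw(langinfo CODESET);
use Encode qw(find_encoding);

# Ask the C library which character set the current locale uses.
# Assumes a POSIX-like system with a sensible locale in the environment.
my $codeset = langinfo(CODESET);

# Fall back to UTF-8 if Encode does not recognise the reported name
# (a choice made for this sketch, not a general rule).
my $name = (defined $codeset && find_encoding($codeset)) ? $codeset : 'UTF-8';

# Use the same encoding for what the user types and what the script prints.
binmode STDIN,  ":encoding($name)";
binmode STDOUT, ":encoding($name)";

print "locale reports: ", (defined $codeset ? $codeset : 'nothing'), "\n";
print "using encoding: $name\n";
```

If the reported name does not match what actually appears on screen, the terminal's own settings are the next thing to check.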

