How to read a Unicode file?

by ibanix (Hermit)
on Jan 23, 2004 at 23:03 UTC

ibanix has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks,

I've got a simple question. I have a file in Unicode that I want to read as input. Before I knew it was Unicode, I tried reading it in, but when I printed it back, I got spaces between each letter:

L i k e  t h i s  s e n t a n c e .

I looked at perlunicode, but my head is swimming with locales, encodings, and character sets. Can anyone help?


$ echo '$0 & $0 &' > foo; chmod a+x foo; foo;

Replies are listed 'Best First'.
Re: How to read a Unicode file?
by Zaxo (Archbishop) on Jan 23, 2004 at 23:14 UTC

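    Chances are that if ASCII text is turning up with interposed zero bytes, your file is in UTF-16. That is what Windows tools usually mean by "Unicode", normally in little-endian byte order.

    Perl 5.8 is pretty smart about Unicode: you can attach an encoding layer when you open the file and read decoded characters from then on.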
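
To see where the "spaces" come from, here is a minimal sketch using the core Encode module. The helper name guess_utf16 is made up for this illustration, and the zero-byte check is only a heuristic that works for plain ASCII text; it is not a general encoding detector.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: guess the UTF-16 flavour of a byte string.
# Looks for a byte order mark first, then for the zero-byte pattern
# that ASCII text produces in UTF-16. Returns undef if neither matches.
sub guess_utf16 {
    my ($bytes) = @_;
    return 'UTF-16LE' if $bytes =~ /\A\xFF\xFE/;            # little-endian BOM
    return 'UTF-16BE' if $bytes =~ /\A\xFE\xFF/;            # big-endian BOM
    return 'UTF-16LE' if $bytes =~ /\A(?:[^\0]\0)+\z/;      # char, zero, char, zero
    return 'UTF-16BE' if $bytes =~ /\A(?:\0[^\0])+\z/;      # zero, char, zero, char
    return;
}

# Each ASCII character becomes two bytes in UTF-16LE; the second is zero.
my $bytes = encode('UTF-16LE', 'Like');
printf "%vd\n", $bytes;                    # prints 76.0.105.0.107.0.101.0
print guess_utf16($bytes), "\n";           # prints UTF-16LE
print decode('UTF-16LE', $bytes), "\n";    # prints Like
```

Printed to a terminal without decoding, those zero bytes typically show up as blanks, which matches the symptom in the question.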

    After Compline,

      Thanks for the UTF-16 tip. I found that

      open(FILE, "<:encoding(UTF-16LE)", $file)

      did the magic for me.
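
For anyone finding this later, a slightly fuller sketch of the same open call. The temporary file is only a stand-in for the real input so the example runs on its own; the lexical filehandle and the error check are additions, not part of the line above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small UTF-16LE sample file (a stand-in for the real input).
my ($tmp, $file) = tempfile(UNLINK => 1);
binmode($tmp, ':raw:encoding(UTF-16LE)');
print {$tmp} "Like this sentence.\n";
close($tmp) or die "Cannot close $file: $!";

# Read it back through the decoding layer and check that open succeeded.
open(my $fh, '<:encoding(UTF-16LE)', $file)
    or die "Cannot open $file: $!";
my $line = <$fh>;
close($fh);

chomp $line;
print "$line\n";
```

One caveat: with UTF-16LE named explicitly, a byte order mark at the start of the file is not stripped and arrives as the character U+FEFF at the front of the first line.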

Re: How to read a Unicode file?
by Aragorn (Curate) on Jan 23, 2004 at 23:20 UTC
    perluniintro is a gentler introduction to Perl and Unicode. I'm by no means an expert, but I think that, depending on the encoding, you can use open(my $fh, "<:utf8", "file") or something like open(my $fh, "<:encoding(ucs2)", "file").
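
A rough sketch of choosing the layer by encoding. It swaps in :encoding(UTF-8) and :encoding(UTF-16) for the two layers named above, on the assumption that the UTF-16 file starts with a byte order mark (the plain UTF-16 decoder uses it to pick the byte order). The helpers write_sample and slurp_as are invented for the example.

```perl
use strict;
use warnings;
use Encode qw(encode);
use File::Temp qw(tempfile);

# Hypothetical helper: store $text in a temp file as raw bytes in the
# given encoding and return the file name.
sub write_sample {
    my ($encoding, $text) = @_;
    my ($out, $name) = tempfile(UNLINK => 1);
    binmode($out, ':raw');
    print {$out} encode($encoding, $text);
    close($out) or die "Cannot close $name: $!";
    return $name;
}

# Hypothetical helper: read a whole file through a decoding layer.
sub slurp_as {
    my ($encoding, $name) = @_;
    open(my $in, "<:encoding($encoding)", $name)
        or die "Cannot open $name: $!";
    local $/;                       # slurp mode: read everything at once
    my $text = <$in>;
    close($in);
    return $text;
}

my $sample     = "caf\x{e9}\n";                        # contains U+00E9
my $utf8_file  = write_sample('UTF-8',  $sample);
my $utf16_file = write_sample('UTF-16', $sample);      # encode() prepends a BOM

my $from_utf8  = slurp_as('UTF-8',  $utf8_file);
my $from_utf16 = slurp_as('UTF-16', $utf16_file);      # BOM selects the byte order

print "same text from both files\n" if $from_utf8 eq $from_utf16;
```

If the file has no byte order mark, name the byte order explicitly (UTF-16LE or UTF-16BE) instead.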


for more detail ...
by g00n (Hermit) on Jan 25, 2004 at 03:54 UTC
