PerlMonks  

Reading a file with utf8 data

by Anonymous Monk
on Jul 20, 2007 at 05:46 UTC

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,
I am using Perl version 5.6.1.
I have a problem reading a file that contains Unicode characters.
Please let me know how I can handle this data.

Is there any way to open the file as UTF-8, the way we can in Perl 5.8+ versions, like
open(IN,"<:utf8","sample.txt");

Thanks

Replies are listed 'Best First'.
Re: Reading a file with utf8 data
by Zaxo (Archbishop) on Jul 20, 2007 at 06:33 UTC

    If UTF-8 data is important to you, you need to insist on having perl 5.8+. 5.6.1 was almost there, but not good enough.

    Your open statement would be fine in 5.8, aside from not checking the return status. The difficulty most likely comes from the PerlIO layer in the mode argument: PerlIO wasn't the default until 5.8.

    After Compline,
    Zaxo
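
For readers on Perl 5.8 or later, the return-status check mentioned above could look like the following. This is only a minimal sketch, not the poster's code: the sub name read_utf8_lines is hypothetical, and it uses a lexical filehandle in place of the bareword IN from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: open a file with the :utf8 layer (Perl 5.8+),
# check every system call, and return the lines without newlines.
sub read_utf8_lines {
    my ($path) = @_;

    # open returns false on failure; $! holds the OS error message.
    open(my $in, '<:utf8', $path)
        or die "Cannot open '$path': $!\n";

    my @lines = <$in>;

    close($in)
        or die "Cannot close '$path': $!\n";

    chomp @lines;
    return @lines;
}
```

A call such as `my @lines = read_utf8_lines('sample.txt');` would then die with a readable message if the file is missing, instead of silently reading nothing.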

Re: Reading a file with utf8 data
by Juerd (Abbot) on Jul 28, 2007 at 12:07 UTC

    Use a newer Perl, really. 5.6.1 is 6 years old, and in software years, that's very old :)

    Also, use :encoding(UTF-8) when reading a file. If you use :utf8, perl assumes that everything is valid UTF-8 data and won't detect it when the data happens to be invalid, for whatever reason. That causes internal corruption, which can theoretically lead to security problems and, if you're lucky, crashes.

    :utf8 is safe for *writing* only.

    Juerd # { site => 'juerd.nl', do_not_use => 'spamtrap', perl6_server => 'feather' }
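
The advice above (the checking layer for input, :utf8 only for output) might be sketched like this on Perl 5.8 or later. The sub names slurp_utf8 and spew_utf8 are hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read a whole file through :encoding(UTF-8),
# which validates the input while decoding it.
sub slurp_utf8 {
    my ($path) = @_;
    open(my $in, '<:encoding(UTF-8)', $path)
        or die "Cannot open '$path' for reading: $!\n";
    local $/;                    # slurp mode: read everything at once
    my $text = <$in>;
    close($in)
        or die "Cannot close '$path': $!\n";
    return $text;
}

# Hypothetical helper: write a character string through :utf8,
# which the reply above describes as safe for output.
sub spew_utf8 {
    my ($path, $text) = @_;
    open(my $out, '>:utf8', $path)
        or die "Cannot open '$path' for writing: $!\n";
    print {$out} $text
        or die "Cannot write to '$path': $!\n";
    close($out)
        or die "Cannot close '$path': $!\n";
    return;
}
```

With its default settings, the :encoding(UTF-8) layer should emit a warning for malformed input rather than passing it through unchecked, so running with warnings enabled is worthwhile.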
