http://qs321.pair.com?node_id=627718

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,
I am using Perl version 5.6.1.
I have a problem reading a file that contains Unicode characters.
Please let me know how I can handle this data.

Is there any way to open the file in UTF-8 mode, as we do in Perl 5.8+ versions, like
open(IN,"<:utf8","sample.txt");

Thanks

Replies are listed 'Best First'.
Re: Reading a file with utf8 data
by Zaxo (Archbishop) on Jul 20, 2007 at 06:33 UTC

    If UTF-8 data is important to you, you need to insist on having perl 5.8+. 5.6.1 was almost there, but not good enough.

    Your open statement would be fine in 5.8, aside from not checking the return status. The difficulty is most likely coming from the PerlIO mode argument. PerlIO wasn't the default until 5.8.
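A minimal sketch of the same open with the return status checked, assuming Perl 5.8 or later (it will not work on 5.6.1). The temporary file and the sample string are illustrative assumptions, used only so the snippet is self-contained; in the question the file would be sample.txt.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Assumption: a temporary file stands in for sample.txt from the question.
my (undef, $file) = tempfile(UNLINK => 1);

# Write one line of UTF-8 sample data so there is something to read back.
open(my $out, '>:utf8', $file) or die "Cannot write $file: $!";
print {$out} "caf\x{e9}\n";
close($out) or die "Cannot close $file: $!";

# Three-argument open with a PerlIO layer, checking the return value.
open(my $in, '<:utf8', $file) or die "Cannot open $file: $!";
my $line = <$in>;
close($in);
chomp $line;

# length() counts characters here, not bytes, because the layer decoded the data.
my $chars = length $line;
```

Using a lexical filehandle instead of the bareword IN is a style choice, not a requirement of the layer syntax.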

    After Compline,
    Zaxo

Re: Reading a file with utf8 data
by Juerd (Abbot) on Jul 28, 2007 at 12:07 UTC

    Use a newer Perl, really. 5.6.1 is 6 years old, and in software years, that's very old :)

    Also, use :encoding(UTF-8) when reading a file. If you use :utf8, perl assumes that everything is valid UTF-8 data and does not check it. When the data happens to be invalid, for whatever reason, perl won't detect it. That causes internal corruption, which can theoretically lead to security problems and, if you're lucky, crashes.

    :utf8 is safe for *writing* only.
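A rough sketch of that advice, assuming Perl 5.8 or later: write with :utf8, read back with :encoding(UTF-8). The temporary file and the sample strings are made up for illustration. The last step uses Encode's decode with FB_CROAK, which is a different technique from the layer itself, to show strict validation rejecting a malformed byte sequence.

```perl
use strict;
use warnings;
use Encode qw(decode);
use File::Temp qw(tempfile);

# Assumption: a temporary file is used so the example is self-contained.
my (undef, $file) = tempfile(UNLINK => 1);

# Writing: :utf8 is fine, because perl itself produces the bytes.
open(my $out, '>:utf8', $file) or die "Cannot write $file: $!";
print {$out} "na\x{ef}ve \x{263A}\n";
close($out) or die "Cannot close $file: $!";

# Reading: :encoding(UTF-8) validates the input while decoding it.
open(my $in, '<:encoding(UTF-8)', $file) or die "Cannot open $file: $!";
my $text = <$in>;
close($in);
chomp $text;

# Strict check of raw bytes: "\xFF" can never appear in valid UTF-8,
# so decode() dies under FB_CROAK and the eval returns false.
my $bad = "abc\xFFdef";
my $ok = eval { decode('UTF-8', $bad, Encode::FB_CROAK()); 1 };
my $rejected = $ok ? 0 : 1;
```

Checking untrusted bytes explicitly like this is one option when the program should stop on bad input rather than continue with it.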

    Juerd # { site => 'juerd.nl', do_not_use => 'spamtrap', perl6_server => 'feather' }