http://qs321.pair.com?node_id=178356


in reply to files saved in unicode are not being read correctly

Try Unicode::Map:
NAME
    Unicode::Map V0.112 - maps charsets from and to utf16 unicode

SYNOPSIS
    use Unicode::Map();

    $Map = new Unicode::Map("ISO-8859-1");

    $utf16 = $Map -> to_unicode ("Hello world!");
      => $utf16 == "\0H\0e\0l\0l\0o\0 \0w\0o\0r\0l\0d\0!"

    $locale = $Map -> from_unicode ($utf16);
      => $locale == "Hello world!"

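You could then write something like this to read your Unicode file (if it is UTF-16):

    use strict;
    use Unicode::Map;

    # Mapping between UTF-16 and ISO-8859-1
    my $Map = Unicode::Map->new({ ID => "ISO-8859-1" });

    # Convert each input line from UTF-16 to ISO-8859-1 and print it
    while (<>) {
        print $Map->from_unicode($_);
    }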

---- kurt

Re: Re: files saved in unicode are not being read correctly
by Anonymous Monk on Jun 30, 2002 at 14:21 UTC

    Thanks for your help, kurt. I need help with one more thing.

    In my application I get the list of files to open from the glob function.

    So before opening a file I need to know whether to use Unicode::Map or the normal mode, depending on whether the file is saved in ASCII or Unicode.

    Is it possible to detect that as well? What will the impact be if I use Unicode::Map for all files returned by the glob function?

    Thanks for your help.

    regards,
    Abhishek.
      There's no 100% solution to your question, because even a Unicode file may be treated as binary or even as ASCII.

      AFAIK a Unicode (UTF-16) file usually starts with a byte order mark, "\xFE\xFF" (big-endian) or "\xFF\xFE" (little-endian), but this is not always true.
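
      A minimal sketch of that check, assuming the files do carry one of those two-byte marks. The helper name sniff_bom is made up for illustration and is not part of Unicode::Map. A file without a mark is reported as 'none', which does not prove it is ASCII.

```perl
use strict;
use warnings;

# Hypothetical helper (not from Unicode::Map): look at the first two bytes
# of a file and report which UTF-16 byte order mark, if any, is present.
sub sniff_bom {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;                          # raw bytes, no newline translation
    my $got = read($fh, my $head, 2);     # may return fewer than 2 bytes
    close $fh;
    return 'none' unless defined $got && $got == 2;
    return 'utf16-be' if $head eq "\xFE\xFF";
    return 'utf16-le' if $head eq "\xFF\xFE";
    return 'none';                        # no mark found; could still be Unicode
}

# Classify every plain file returned by glob before deciding how to read it.
for my $file (grep { -f } glob '*.txt') {
    printf "%s: %s\n", $file, sniff_bom($file);
}
```

      Files reported as 'utf16-be' or 'utf16-le' could then go through the Unicode::Map path, and the rest through the normal one.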

      If the files are created by your own program, you could use a naming convention instead: for example, give Unicode text files the extension ".utxt" and ASCII files just ".txt".
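
      A short sketch of how such a convention could be applied to a glob result. The function name is_unicode_name is hypothetical, and the ".utxt" extension is just the example convention from this post.

```perl
use strict;
use warnings;

# Hypothetical convention: ".utxt" marks a Unicode text file,
# any other name is treated as plain ASCII.
sub is_unicode_name {
    my ($path) = @_;
    return $path =~ /\.utxt\z/i ? 1 : 0;
}

# glob accepts several whitespace-separated patterns in one string.
for my $file (glob '*.txt *.utxt') {
    my $kind = is_unicode_name($file) ? 'unicode' : 'ascii';
    print "$file: $kind\n";
}
```

      This only works if every program writing the files follows the same convention.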

      Courage, the Cowardly Dog.

      Sorry, I cannot answer this. Anyone else?

      ---- kurt