in reply to Re: Re: reading unicode files
in thread reading unicode files
The easiest way (for me) to decide whether your file is utf-16 or ucs-2 (see below) is to look at it, using something like:
    C:\> more < thefile
If you see something (such as smileys or whitespace) between each Latin letter, it is one of the two encodings above; otherwise it isn't. (This assumes you have Latin letters in your file.)
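The same eyeball test can be roughly automated. What follows is only a sketch: looks_like_utf16 is a made-up helper name, the 0.3 threshold is an arbitrary guess, and the whole thing is a heuristic. It checks for a byte-order mark and otherwise counts the NUL bytes that show up next to Latin letters in these encodings.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): guess whether a string
# of raw bytes looks like UTF-16/UCS-2 text containing Latin letters.
sub looks_like_utf16 {
    my ($bytes) = @_;

    # A byte-order mark at the very start is the strongest hint.
    return 1 if $bytes =~ /\A(?:\xFF\xFE|\xFE\xFF)/;
    return 0 unless length $bytes;

    # Otherwise count NUL bytes: each ASCII letter carries one in UTF-16.
    # The 0.3 cutoff is an arbitrary assumption, not a standard value.
    my $nuls = ($bytes =~ tr/\0//);
    return $nuls / length($bytes) > 0.3 ? 1 : 0;
}
```

You would feed it the first few hundred bytes of the file, read through a filehandle opened with the '<:raw' layer so nothing gets decoded first. It cannot tell utf-16 from ucs-2, and a utf-32 file would fool it too.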
To read them: (I was not very clear)
    open FILE,'<:encoding(utf-16)','filename';
or whatever encoding you want. The :utf8 spec is a sort of shorthand for the full :encoding() spec...
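A slightly fuller sketch of the same open call, with a lexical filehandle and error checking. read_utf16_lines is a name invented for this example. It assumes the file starts with a byte-order mark, which plain UTF-16 needs when decoding; for a file without one you would have to name the byte order yourself, e.g. :encoding(UTF-16LE).

```perl
use strict;
use warnings;

# Hypothetical helper: read a UTF-16 file (with BOM) into decoded lines.
sub read_utf16_lines {
    my ($path) = @_;

    # The :encoding() layer turns the bytes into Perl characters on read.
    open my $fh, '<:encoding(UTF-16)', $path
        or die "Cannot open $path: $!";
    my @lines = <$fh>;
    close $fh;

    return @lines;    # line endings are left in place
}
```

Something like `my @lines = read_utf16_lines('filename');` then gives you ordinary Perl strings to work with.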
ucs-2 is a degenerate form of Unicode encoding, since it cannot represent characters beyond the first 2^16. It is more-or-less compatible with utf-16 for those first 2^16 characters, so you might not notice the difference. Anyway, don't use it to write new files (please ;-) )
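To make the difference concrete, here is a small sketch using the Encode module. The two sample characters are my own picks, and it assumes your Encode knows the names UTF-16BE and UCS-2BE.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $bmp    = "\x{20AC}";     # EURO SIGN, inside the first 2^16 code points
my $astral = "\x{1F600}";    # GRINNING FACE, beyond the first 2^16

# Inside the first 2^16 both encodings should give the same two bytes.
my $same = encode('UTF-16BE', $bmp) eq encode('UCS-2BE', $bmp);

# utf-16 stores the higher character as a surrogate pair: four bytes.
my $utf16_len = length encode('UTF-16BE', $astral);

# ucs-2 has no surrogate pairs, so a strict encode is expected to die here.
my $ucs2_ok = eval { encode('UCS-2BE', $astral, Encode::FB_CROAK); 1 } ? 1 : 0;

print "same bytes inside the first 2^16: ", ($same ? "yes" : "no"), "\n";
print "utf-16 length of the higher character: $utf16_len bytes\n";
print "ucs-2 could encode it: ", ($ucs2_ok ? "yes" : "no"), "\n";
```

Without the FB_CROAK flag you should expect a substitution character instead of an error, which is exactly the kind of silent damage that makes ucs-2 a poor choice for new files.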
-- dakkar - Mobilis in mobile