John M. Dlugosz has asked for the wisdom of the Perl Monks concerning the following question:
In perlrun, it states that "The value 0777 will cause Perl to slurp files whole because there is no legal character with that value."
Well, with the advent of Unicode, there is indeed a character with octal value 777: U+01FF, LATIN SMALL LETTER O WITH STROKE AND ACUTE.
So, does the -0 flag on the command line behave like \x or \x{} in strings? If the former, it doesn't let me set the input record separator to a sequence of bytes, which would be needed for a multi-byte character.
Files are opened in 8-bit mode by default, for compatibility with older versions of Perl. But when using one-liners with files opened automatically by while(<>) or piped into standard input, how do I specify a different (e.g. UTF-8) encoding?
—John
Replies are listed 'Best First'.
Re: perlrun -0777 option
by fglock (Vicar) on Jun 26, 2003 at 19:08 UTC
by John M. Dlugosz (Monsignor) on Jun 29, 2003 at 18:23 UTC
by fglock (Vicar) on Jun 29, 2003 at 21:21 UTC
Re: perlrun -0777 option
by crazyinsomniac (Prior) on Jun 27, 2003 at 03:55 UTC
by John M. Dlugosz (Monsignor) on Jun 29, 2003 at 18:24 UTC
Re: perlrun -0777 option
by Anonymous Monk on Jun 26, 2003 at 16:28 UTC
by John M. Dlugosz (Monsignor) on Jun 26, 2003 at 16:53 UTC
by nevyn (Monk) on Jun 26, 2003 at 17:25 UTC
by dataDrone (Acolyte) on Jun 26, 2003 at 16:29 UTC