http://qs321.pair.com?node_id=269283

John M. Dlugosz has asked for the wisdom of the Perl Monks concerning the following question:

In perlrun, it states that "The value 0777 will cause Perl to slurp files whole because there is no legal character with that value."

Well, with the advent of Unicode, there is indeed a character with octal value 777: U+01FF, LATIN SMALL LETTER O WITH STROKE AND ACUTE.
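
The arithmetic can be checked with a short sketch. It assumes a Perl with the standard charnames module available; the variable names are only illustrative.

```perl
use strict;
use warnings;
use charnames ();

# 0777 is an octal literal; convert it to see which code point it names.
my $code = 0777;
my $name = charnames::viacode($code);

printf "octal 777 = decimal %d = U+%04X\n", $code, $code;
print "Unicode name: $name\n";
```

This only shows that the code point exists; it says nothing about how -0 interprets the digits.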

So, does the -0 flag on the command line behave like \x or like \x{} in strings? If the former, it can only name a single byte, so it doesn't let me set the input record separator to the sequence of bytes needed for a multi-byte character.
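
To make the \x versus \x{} distinction concrete, here is a small sketch of how the two escapes differ in ordinary double-quoted strings. It illustrates string escapes only; whether -0 follows either behavior is exactly the open question.

```perl
use strict;
use warnings;

# Unbraced \x consumes at most two hex digits, so this is chr(0x1F) . "F".
my $two_digit = "\x1FF";

# Braced \x{} takes the whole number, giving the single character U+01FF.
my $braced = "\x{1FF}";

for my $str ($two_digit, $braced) {
    my @codes = map { sprintf '%04X', ord } split //, $str;
    printf "length %d: %s\n", length $str, join ' ', @codes;
}
```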

Files are opened in 8-bit mode by default, for compatibility with older versions of Perl. But in one-liners, where files are opened automatically by while(<>) or data is piped into standard input, how do I specify a different encoding (e.g. UTF-8)?
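
The 8-bit default can be seen in a sketch like the following, which assumes a Perl with PerlIO layers and in-memory filehandles (5.8 or later). It opens the handles explicitly, so it shows the default behavior rather than a way to change it for while(<>) or standard input.

```perl
use strict;
use warnings;

# The two bytes C7 BF are the UTF-8 encoding of U+01FF, followed by a newline.
my $bytes = "\xC7\xBF\n";

# Default open: each byte is read as a separate character.
open my $raw, '<', \$bytes or die "open failed: $!";
my $line_raw = <$raw>;
chomp $line_raw;
close $raw;

# Explicit layer on the open: the two bytes decode to one character.
open my $utf8, '<:encoding(UTF-8)', \$bytes or die "open failed: $!";
my $line_dec = <$utf8>;
chomp $line_dec;
close $utf8;

printf "default:  %d character(s)\n", length $line_raw;
printf "decoded:  %d character(s), U+%04X\n", length $line_dec, ord $line_dec;
```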

—John