Re^3: Regular expressions across multiple lines
by afoken (Canon) on Apr 24, 2016 at 19:04 UTC
Well, it may look so, but what really happens is different. See chomp:
This safer version of chop removes any trailing string that corresponds to the current value of $/ (also known as $INPUT_RECORD_SEPARATOR in the English module).
Note: Not a single word about the CR or LF control characters, the CR-LF pair, or NL (newline).
The input record separator $/ is documented as well. It defaults to an abstract "newline" character:
The input record separator, newline by default. This influences Perl's idea of what a "line" is. [...] See also Newlines in perlport.
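To make that concrete, here is a minimal sketch of chomp following $/ rather than any fixed control character. The helper name chomp_with is made up for this illustration; it only sets $/ locally and calls the ordinary chomp.

```perl
use strict;
use warnings;

# Hypothetical helper for this illustration: run chomp on a copy of $text
# while $/ is temporarily set to $separator.
sub chomp_with {
    my ($separator, $text) = @_;
    local $/ = $separator;      # restored automatically when the sub returns
    chomp(my $copy = $text);    # removes one trailing copy of $/, if present
    return $copy;
}

# Default separator: the logical newline is removed.
print chomp_with("\n", "line\n") eq "line" ? "ok\n" : "not ok\n";

# Any other separator works the same way; chomp does not care about CR or LF.
print chomp_with("END", "recordEND") eq "record" ? "ok\n" : "not ok\n";

# A CR-LF ending with $/ set to "\n": only the LF goes, the CR stays behind.
print chomp_with("\n", "line\015\012") eq "line\015" ? "ok\n" : "not ok\n";
```

The last case is what a Unix perl sees after reading a line from a file with CR-LF line endings.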
Now, "newlines". Perl has inherited them from C, by using two modes for accessing files, text mode and binary mode. In text mode, the system's native line ending, whatever that may be, is translated from or to a logical newline, also known as "\n". In binary mode, file content is not modified during read or write. C has been defined in a way that the logical newline is identical with the native line ending on Unix, LF. So, there is no difference between text mode and binary mode ON Unix.
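The translation can be made visible with PerlIO layers. This is only a sketch: it forces the :crlf layer explicitly, so it mimics text mode on a CR-LF platform regardless of where it runs. The helper slurp_with and the temporary file are made up for this illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write one logical line through an explicit :crlf layer, so the "\n"
# is translated to CR-LF on its way to the file.
my ($out, $path) = tempfile(UNLINK => 1);
binmode($out, ':raw:crlf');
print {$out} "hello\n";
close $out or die "close failed: $!";

# Hypothetical helper: read the whole file through the given layers.
sub slurp_with {
    my ($layers) = @_;
    open my $in, "<$layers", $path or die "open failed: $!";
    local $/;                 # undef $/ means: read everything at once
    my $content = <$in>;
    close $in;
    return $content;
}

my $raw  = slurp_with(':raw');        # binary mode: bytes exactly as stored
my $text = slurp_with(':raw:crlf');   # text mode with CR-LF translation

printf "binary mode: %d bytes\n", length $raw;
printf "text mode:   %d bytes\n", length $text;
```

In binary mode the CR is still there. In the translated read, the program only ever sees the logical "\n", and that is the only thing the default chomp knows how to remove.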
Quoting Newlines in perlport:
What happens here is that Perl has reasonable defaults for text handling, so it opens files (including STDIN, STDOUT, STDERR) in text mode by default, $/ defaults to a single logical newline ("\n"), and so native line endings are translated before chomp just removes that "\n", on any platform.
When reading text files using a non-native line ending, things will usually go wrong:
Of course, it depends on the system you are using:
So, chomp is NOT cross-platform. It can handle input from native text files on all platforms out of the box. But if you have to work with ASCII files with mixed line endings (CR, LF, CR-LF, LF-CR), chomp can't work reliably. This is not chomp's fault, nor is it perl's fault.
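If you are stuck with such files, one possible workaround is to read them in binary mode (binmode) and strip or split the line endings yourself. The following is a sketch, not a general solution: strip_eol and split_lines are names made up for this illustration, and the regex simply assumes that a CR and an LF next to each other always form one line ending, which a file with truly mixed endings cannot guarantee.

```perl
use strict;
use warnings;

# Any of the four line endings, the two-character pairs first, so that
# CR-LF is not taken for a CR followed by a separate LF.
my $EOL = qr/\015\012|\012\015|\015|\012/;

# Hypothetical helper: remove at most one trailing line ending of any kind.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/(?:$EOL)\z//;
    return $line;
}

# Hypothetical helper: split a buffer that was read in binary mode into lines.
sub split_lines {
    my ($buffer) = @_;
    return split /$EOL/, $buffer;
}

my @lines = split_lines("one\015\012two\012three\015four");
print join('|', @lines), "\n";    # prints one|two|three|four
```

The octal escapes \015 and \012 are used instead of "\r" and "\n" because, as perlport explains, the latter are logical characters whose values are not the same on every platform.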
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)