PerlMonks  

Re: Re: Handling Mac, Unix, Win/DOS newlines at readtime...

by graff (Chancellor)
on Sep 16, 2002 at 04:12 UTC ([id://198146])


in reply to Re: Handling Mac, Unix, Win/DOS newlines at readtime...
in thread Handling Mac, Unix, Win/DOS newlines at readtime...

/\r\n?/ will fail to split lines that were created on unix systems. Eliminating blank lines might not be so bad, but if it's an issue, then:
split(/\r\n|\r|\n/);
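Just doing /[\r\n]{1,2}/ will lose some blank lines on unix or mac input, because two consecutive line endings ("\n\n" or "\r\r") get matched as a single separator. It's also important to list the longer pattern ("\r\n") first, so that it is tried before the single-character alternatives.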
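To make the difference concrete, here is a small sketch comparing the two patterns. The sample string is made up for illustration; it mixes all three conventions and includes blank lines in the mac and unix styles.

```perl
use strict;
use warnings;

# Made-up sample: a DOS ending, then a mac blank line ("\r\r"),
# then a unix blank line ("\n\n").
my $text = "a\r\nb\r\rc\n\nd";

# Longest alternative first, so "\r\n" counts as ONE line ending.
my @exact = split /\r\n|\r|\n/, $text;

# Character class version: any run of one or two CR/LF characters is
# one separator, so "\r\r" and "\n\n" each swallow a blank line.
my @lossy = split /[\r\n]{1,2}/, $text;

print scalar(@exact), " fields: ", join('|', @exact), "\n";   # 6 fields: a|b||c||d
print scalar(@lossy), " fields: ", join('|', @lossy), "\n";   # 4 fields: a|b|c|d
```

Note that split drops trailing empty fields by default, so blank lines at the very end of the input disappear with either pattern unless a negative limit is passed as the third argument.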

Re: Re: Re: Handling Mac, Unix, Win/DOS newlines at readtime...
by jkahn (Friar) on Sep 16, 2002 at 04:17 UTC
    but what if a file was created on a Windows machine, but this code was being run on a Mac?

    I remember reading somewhere in this thread that \r and \n have reversed semantics on the Mac (vs. *nix, Windows).

    So maybe we really want the following: split(/ \r\n | \n\r | \r | \n /x); # (yoicks!)

    My $0.02,

    -- jkahn

      but what if a file was created on a Windows machine, but this code was being run on a Mac?

      It wouldn't matter which type of system was running the perl code.

      I remember reading somewhere in this thread that \r and \n have reversed semantics on the Mac (vs. *nix, Windows).

      Um, no, that statement hasn't been made on this thread. My own experience has been that MS systems use "\r\n", all .n.x systems use "\n" and (older) Mac systems use "\r". Nobody uses "\n\r".

      And now that MacOS-X is out with a unix foundation, maybe the number of variants will reduce to just two instead of three.

        Nobody uses "\n\r".

        This reminds me of the days when I had just started using Perl. One of my first real scripts was supposed to handle both Unix and Win32 line endings in input text files. However, one day it broke because an input file used "\n\r". I have no idea where such a line ending came from, though.

        --
        Ilya Martynov (http://martynov.org/)

        I remember reading somewhere in this thread that \r and \n have reversed semantics on the Mac (vs. *nix, Windows).

        Um, no, that statement hasn't been made on this thread.

        Oh yes it has. On a Mac, "\n" is "\015" (the native end-of-line), and "\r" is the other one, "\012". The sequences you can encounter, in ASCII, are matched by /\015\012|\015|\012/.

        In addition, "\015\015\012" can occur too, in HTML pages you download from the web: an erroneous binary-mode FTP upload from Windows to Unix leaves the CRLF intact, and a later text-mode download adds another CR.
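A minimal sketch of the octal approach, extended to cover the CR CR LF case. The helper name normalize_eol and the sample data are made up for illustration, and the sketch assumes the data was read without any newline translation (for example after binmode on the filehandle).

```perl
use strict;
use warnings;

# Octal escapes name the bytes directly, so the pattern means the same
# thing no matter how the local perl maps "\r" and "\n".
my $CR = "\015";
my $LF = "\012";

# Hypothetical helper: rewrite every known line ending as a bare LF.
# CR CR LF is listed first so the double-converted ending counts as one
# line ending rather than a line ending plus a blank line.
sub normalize_eol {
    my ($text) = @_;
    $text =~ s/\015\015\012|\015\012|\015|\012/\012/g;
    return $text;
}

# Made-up sample mixing DOS, old mac, unix and double-converted endings.
my $mixed = "one${CR}${LF}two${CR}three${LF}four${CR}${CR}${LF}five";
my @lines = split /\012/, normalize_eol($mixed);
print join('|', @lines), "\n";   # one|two|three|four|five
```

Treating CR CR LF as a single ending is a judgment call: in a file that really mixes conventions, the same bytes could also be an old mac blank line followed by a DOS ending, and there is no way to tell the two apart from the bytes alone.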

      Yes, Macs have a backwards notion of what \r and \n are in ASCII (was this changed in OS X?). However, if the original poster is running the Perl script on a *nix or Windows box, it shouldn't matter.

      BTW--My favorite way of dealing with the Mac's reversed notion of CR and LF is to use the octal ASCII value instead. \r = \015, \n = \012 (IIRC). You'll probably have issues with Unicode, though.

        Yes, this was changed in OS X.
