Re: chomp() problems
by SparkeyG (Curate) on Sep 24, 2002 at 17:31 UTC
|
chomp will only remove the system's $/ ($INPUT_RECORD_SEPARATOR). In most Unix it is just \n, in DOS it's \r\n.
I guess you could fix the problem by:
{
local $/ = "\r\n";
chomp ($line);
}
Edited to correct a typo, then edited again to correct another typo, and then again to fix yet another. A sick infant does wonders for your sleep habits, and therefore your typing skills ;) Note to self: do not post anything after cleaning up baby vomit. ;)
|
In most Unix it is just \n, in DOS it's \r\n.
No, that's not correct. On DOS and Win32, it's also "\n". You see, the trick is that when reading from a text file in text mode (that is, without binmode), "\015\012" (AKA CRLF, "\r\n") is converted into a bare "\012" (LF, "\n"). Therefore, chomp() doesn't have to remove "\r\n", because there commonly will be, and should be, only a bare "\n". And as chomp and $/ only use fixed strings, not regexes, for their workings, you can't have it both ways at the same time.
And that, boys and girls, is why it doesn't work here. Access does store line endings as CRLF pairs. And that isn't very Perl compatible. Therefore, when reading data from Access in a Perl script, you should always turn CRLF into "\n", and vice versa when storing data back into Access.
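A minimal sketch of that conversion, assuming the Access field has already been fetched into a plain scalar. The helper names from_access and to_access are made up for illustration and are not part of any module:

```perl
use strict;
use warnings;

# Hypothetical helpers. CRLF is written with explicit octal escapes so
# the pattern does not depend on what "\r" and "\n" mean locally.
sub from_access {
    my ($text) = @_;
    $text =~ s/\015\012/\n/g;    # CRLF pair -> Perl's logical newline
    return $text;
}

# Assumes $text was normalized by from_access, so it holds no CRLF pairs.
sub to_access {
    my ($text) = @_;
    $text =~ s/\n/\015\012/g;    # logical newline -> CRLF pair
    return $text;
}

my $field = from_access("first\015\012second\015\012");
chomp $field;                    # the default $/ of "\n" is enough now
my $stored = to_access("$field\n");
```

With the data normalized on the way in, a plain chomp works and $/ never has to be touched.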
Re: chomp() problems
by fruiture (Curate) on Sep 24, 2002 at 17:39 UTC
|
As `perldoc -f chomp` tells us, chomp "removes any trailing string that corresponds to the current value of $/", so look at the value of $/. Because the way the "end of line" portability problem was solved is quite tricky (`man perlport`), I'd recommend using the explicit character escapes \015 and \012 instead of \r and \n:
{
    local $/ = "\015\012";    # CR LF as explicit octal escapes
    chomp;                    # with no argument, chomp works on $_
}
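For a named variable, the same idea might look like the sketch below. The sample string and the variable name $line are assumptions for illustration:

```perl
use strict;
use warnings;

my $line = "some text\015\012";    # sample record ending in CR LF

{
    local $/ = "\015\012";         # chomp now strips exactly this sequence
    chomp $line;
}
# Outside the block, $/ is back to its previous value.

print "[$line]\n";                 # prints "[some text]"
```

Keeping the local inside a bare block limits the changed $/ to that one chomp, so later reads are unaffected.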
--
http://fruiture.de
|
++fruiture! Twice, if I could.
I can't believe how entrenched \r\n is in some minds when dealing with foreign systems' line terminators. In the last few days, this is the fourth post on that topic... It seems as if I stumbled into a crusade against \r\n (which, across systems, ah.., isn't).
Brothers and Sisters, this is harmful! Oh well, I might be overstating this, but it does spoil portability! It was Aristotle who asked me to beat him to it, just a short while ago... ;)
In this case, if you ported local $/ = "\r\n" to an EBCDIC-US system you'd chomp on CR followed by chr(37) (the code point of '%' in ASCII)! You need to obey the origin's encoding, and for DOS that means that line endings are \015\012 and not your system's \r\n.
So long,
Flexx
PS: SparkeyG, best wishes for your baby!
$happy_baby = not reverse(@food) and sleep($calmly) # ;)
Re: chomp() problems
by BrowserUk (Patriarch) on Sep 24, 2002 at 17:32 UTC
|
Not quite sure why chomp isn't working, but your tr// isn't very wise as you are replacing the \r\n with 2 null (ascii 0) bytes which could bite (chomp:) back later.
I think tr/\r\n//; would probably be better.
If you posted the relevant bit of your code, it might be easier to see why chomp isn't doing what you expect.
Cor! Like yer ring! ... HALO dammit! ... 'Ave it yer way! Hal-lo, Mister la-de-da. ... Like yer ring!
|
Personally, if it's not working but you have a workaround, I'd go with the workaround, although I agree with the above comment and would not use the NULs. TMTOWTDI.
mojobozo
word (wûrd)
interj. Slang. Used to express approval or an affirmative response to something. Sometimes used with up. Source
|
OTOH, this could be a symptom of a larger problem which could surface again later, but might not be as easy to find. I suggest figuring out what is going on. Besides, having a better understanding of the code can't hurt, right? :-)
-disciple
Re: chomp() problems
by charnos (Friar) on Sep 24, 2002 at 17:52 UTC
|
$line =~ tr/\r\n//d; should do what BrowserUk suggested. IIRC, the /d is required to delete characters. Checking the value of $/ would help ascertain why chomp() isn't working the way you expected.
Also, $line =~ s/\r\n$//; would probably more accurately replicate chomp()'s functionality.
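A small sketch of how the variants differ, using a made-up sample string and the octal escapes recommended elsewhere in this thread in place of \r\n:

```perl
use strict;
use warnings;

my $sample = "one\015\012two\015\012";

# tr without /d only counts the listed characters; the string is unchanged.
my $copy  = $sample;
my $count = ($copy =~ tr/\015\012//);

# tr with /d deletes every CR and LF, including the embedded pair.
my $all = $sample;
$all =~ tr/\015\012//d;

# The anchored substitution removes only the trailing CRLF, like chomp.
my $tail = $sample;
$tail =~ s/\015\012$//;
```

Here $count ends up as 4, $all as "onetwo", and $tail still has the CRLF between "one" and "two".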
Update: Thanks to bart, who reminded me that the $INPUT_RECORD_SEPARATOR is $/, not $\. There still appear to be quite a few incorrect vars floating around this thread.
|
I just ran into this recently with files being FTP'd from Windows to Linux, and I have to agree with Flexx: \r and \n are system-specific, where this is a byte-specific problem.
What I used was similar to charnos' substitution, but fixed to ASCII:
$line =~ s/\xD|\xA//g;
This removes every stray CR and LF, whichever ASCII system the file came from and whichever one it ends up on. (Sorry, no EBCDIC support.) As long as Perl can figure out where the line breaks are, this will get rid of the odd bits.
-- Spring: Forces, Coiled Again!
|
Yup. Some like it hex... ;)
Update: Actually, what you wrote is EBCDIC compatible. It'll substitute DOS CRLF's on any system, be it an ASCII or an EBCDIC one. You're using discrete ordinals (hexadecimal ones, in your case) instead of the infamous logical symbols \r\n, and that's what makes your statement portable.
So long,
Flexx
PS: Someone downvoted this node (it's at -1 at the time of this writing). Why? What did I do wrong here?