
Re: The future of Text::CSV_XS - TODO

by Tux (Canon)
on May 30, 2007 at 08:17 UTC ( [id://618129] )


in reply to The future of Text::CSV_XS - TODO

Small updates

I now understand the fuss people make about embedded newlines. Text::CSV_XS has always been able to deal with them (well, maybe not always, but at least for a long time already). The problem people might have is in reading the lines in the Perl script. Obviously,

    my $csv = Text::CSV_XS->new ({ binary => 1 });
    while (<>) {
        $csv->parse ($_);
        my @fields = $csv->fields ();
    }

will fail horribly, as <> breaks the input too early: it stops at the first newline, even when that newline sits inside a quoted field.
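To make the failure concrete, here is a minimal sketch that uses only core Perl and no CSV module. The sample data and variable names are made up for illustration. A single CSV record whose quoted field contains a newline is read through an in-memory filehandle, and the readline operator hands it back as two separate pieces:

```perl
use strict;
use warnings;

# Hypothetical sample data: ONE CSV record whose second field holds a newline.
my $data = qq{1,"first line\nsecond line",3\n};

# Read it the naive way, line by line, through an in-memory filehandle.
open my $fh, "<", \$data or die "in-memory open failed: $!";
my @chunks = <$fh>;
close $fh;

# readline splits on $/ ("\n" by default) and knows nothing about CSV quoting,
# so the single record arrives as two incomplete pieces.
printf "chunks read: %d\n", scalar @chunks;
print  "chunk: $_" for @chunks;
```

Each piece on its own has an unbalanced quote, which is why handing them to parse () one at a time cannot work.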

The most recent snapshot now contains t/45_eol.t, which tests all possible eol combinations. Have a look: it shows that CSV with embedded newlines should be parsed along these lines:

    use IO::Handle;
    my $csv = Text::CSV_XS->new ({ binary => 1, eol => $/ });
    while (my $row = $csv->getline (*ARGV)) {
        my @fields = @$row;
    }

or, if you open files yourself, like:

    my $csv = Text::CSV_XS->new ({ binary => 1, eol => $/ });
    open my $io, "<", $file or die "$file: $!";
    while (my $row = $csv->getline ($io)) {
        my @fields = @$row;
    }

I'm still thinking about the best way to add this to the docs.
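For anyone who wants to try the getline approach without a data file at hand, here is a self-contained sketch. It assumes a reasonably recent Text::CSV_XS is installed from CPAN. The sample data and the in-memory filehandle are my own additions for illustration and are not taken from t/45_eol.t:

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical sample data: two records, the first with an embedded newline.
my $data = qq{1,"first line\nsecond line",3\n4,5,6\n};

# An in-memory filehandle stands in for a real file here (an assumption made
# for the sake of a runnable example).
open my $io, "<", \$data or die "in-memory open failed: $!";
my $csv = Text::CSV_XS->new ({ binary => 1, eol => $/ });

# getline reads as much input as one complete record needs, so the newline
# inside the quoted field stays part of that field.
my @rows;
while (my $row = $csv->getline ($io)) {
    push @rows, $row;
}
close $io;

printf "records read: %d\n", scalar @rows;
```

The point of the design is that the parser, not the readline operator, decides where a record ends.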


Enjoy, Have FUN! H.Merijn

Replies are listed 'Best First'.
Re^2: The future of Text::CSV_XS - TODO
by tfrayner (Curate) on May 30, 2007 at 18:51 UTC
    Thanks very much for looking at this. I've tried 0.27, and it shows the same problem (see my original node, updated, for a test of sorts). I looked through your eol tests, and checked on exactly what was being written out to the temporary file in the \r cases. It appears that in those cases the file is terminated with a \r\n, rather than just \r. I think this may be why your tests pass but mine doesn't?

    Cheers,

    Tim

      New snapshot just uploaded, in which eol => $/ is permitted for "\r". That extends successful parsing to line endings in the set undef, "\n", "\r\n", and "\r".


      Enjoy, Have FUN! H.Merijn
        Thanks, that's a huge improvement. I'm afraid I've discovered another slight wrinkle, though:
        use strict;
        use Text::CSV_XS;
        use IO::File;

        $/ = "\r";
        my $f = IO::File->new_tmpfile;
        print $f ('a,b,c', $/, '"d","e","f"', $/);
        seek($f,0,0);
        my $c = Text::CSV_XS->new({ eol => $/ });
        for(0..1){
            print join("|",@{ $c->getline($f) })."\n"
        }
        The first getline works, but the second fails. It looks as though the quote characters are blocking recognition of \r as eol (again, the code here works if $/="\n").

        Cheers,

        Tim
