http://qs321.pair.com?node_id=246265

choeppner has asked for the wisdom of the Perl Monks concerning the following question:

Oh Wise Monks,

I am using the Text::CSV_XS module to parse a record from an external system. I think that I might have discovered a bug in the module, but wanted to check here first, to see if I am doing something wrong.

The suspected bug occurs when I pass the following string into the module.

0005:A:A2:\\\\string 04\\\\

The fields are delimited with ':' and the escape character is '\'.

I set up the parser for this record with the following code.

use Text::CSV_XS;

my $csv = Text::CSV_XS->new({
    'quote_char'  => '',
    'escape_char' => '\\',
    'sep_char'    => ':',
    'binary'      => 1
});

if ($csv->parse($Row)) {
    my @fields = $csv->fields();
    print "$fields[3]\n";
}

The result in $fields[3] is \string 04\\, but I was expecting to get \\string 04\\.
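
As a point of comparison, here is a minimal pure-Perl sketch of the decoding the question expects: split on each unescaped ':' and treat the character after each '\' as a literal. The helper name split_escaped is hypothetical and is not part of Text::CSV_XS; it only illustrates one reading of the escape rule, not how the module is implemented.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::CSV_XS): split $line on $sep,
# treating the character that follows $esc as a literal.
sub split_escaped {
    my ($line, $sep, $esc) = @_;
    my @fields = ('');
    my @chars  = split //, $line;
    while (@chars) {
        my $c = shift @chars;
        if ($c eq $esc && @chars) {
            $fields[-1] .= shift @chars;    # escaped character, kept as-is
        }
        elsif ($c eq $sep) {
            push @fields, '';               # unescaped separator starts a new field
        }
        else {
            $fields[-1] .= $c;
        }
    }
    return @fields;
}

# A <<'END' here-document keeps every backslash literal.
chomp(my $row = <<'END');
0005:A:A2:\\\\string 04\\\\
END

my @fields = split_escaped($row, ':', '\\');
print "$fields[3]\n";    # prints \\string 04\\
```

Under this reading each pair of backslashes in the input yields one backslash in the field, which is the result the question expected.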

TIA for your thoughts.

Charles
(The quiet Monk)

Re: Text::CSV_XS -- Bug or wrong usage?
by BrowserUk (Patriarch) on Mar 27, 2003 at 16:29 UTC

    I'd have to agree with you that this seems like a bug in Text::CSV_XS. Maybe an email to the author's CPAN address is called for.

    #! perl -slw
    use strict;
    use Text::CSV_XS;

    while (my $Row = <DATA>) {
        my $csv = Text::CSV_XS->new({
            'quote_char'  => '',
            'escape_char' => chr(92),
            'sep_char'    => ':',
            'binary'      => 1
        });
        if ($csv->parse($Row)) {
            my @fields = $csv->fields();
            print "@fields";
        }
    }
    __DATA__
    0005:A:A2:\\\\string 04\\\\
    0005:A:A2:\\\string 04\\\\
    0005:A:A2:\\string 04\\\\
    0005:A:A2:\string 04\\\\

    Output

    C:\test>246265
    0005 A A2 \string 04\\
    0005 A A2 \string 04\\
    0005 A A2 string 04\\
    0005 A A2 string 04\\

    Examine what is said, not who speaks.
    1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
    2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
    3) Any sufficiently advanced technology is indistinguishable from magic.
    Arthur C. Clarke.
Re: Text::CSV_XS -- Bug or wrong usage?
by jmcnamara (Monsignor) on Mar 27, 2003 at 17:02 UTC

    This definitely looks like a bug.

    You should report it, but in the meantime you can work around it by omitting the escape character and doing your own escaping. The following example replaces "\:" with $; (the subscript separator, "\034" by default) before parsing and then converts it back:

    #!/usr/bin/perl -wl
    use strict;
    use Text::CSV_XS;

    my $row = '0005:A:A2:\\\\string 04\\\\';

    # Translate escaped : "\:"
    $row =~ s/\\:/$;/eg;

    my $csv = Text::CSV_XS->new({
        'quote_char'  => '',
        'escape_char' => '',
        'sep_char'    => ':',
        'binary'      => 1
    });

    if ($csv->parse($row)) {
        my @fields = $csv->fields();

        # Replace escaped chars
        s/$;/\\:/g for @fields;

        print "$fields[3]\n";
    }
    else {
        print "Parse failed ", $csv->error_input();
    }
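
One detail worth checking when reproducing this: inside a single-quoted Perl string, \\ denotes a single backslash, so the literal assigned to $row holds two backslashes on each side of the text rather than the four in the original record. A minimal sketch that counts them (count_backslashes is a hypothetical helper introduced only for this illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: count the backslashes in a string.
sub count_backslashes {
    my ($text) = @_;
    return ($text =~ tr/\\//);    # tr with an empty replacement only counts
}

my $quoted = '\\\\string 04\\\\';    # single quotes: each \\ is one backslash

# A <<'END' here-document leaves backslashes untouched.
chomp(my $literal = <<'END');
\\\\string 04\\\\
END

print count_backslashes($quoted),  "\n";    # prints 4
print count_backslashes($literal), "\n";    # prints 8
```

Reading the record from a file or from __DATA__, as the other examples in this thread do, avoids the difference.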

    --
    John.

      Thanks for the suggestion.

      I have sent an email to the listed author.

      The author has indicated that the module's current maintainer is Jeff Zucker, jzucker@cpan.org.

      Charles
      (The quiet Monk)

(jeffa) Re: Text::CSV_XS -- Bug or wrong usage?
by jeffa (Bishop) on Mar 28, 2003 at 17:06 UTC
    I thought i would post an example of tilly's Text::xSV, but i am not sure that i am actually escaping the slashes like you want. Oh well, i'll post it anyway in hopes that someone can help correct it if it is indeed incorrect. ;)
    use strict;
    use warnings;
    use Text::xSV;

    my $csv = Text::xSV->new(
        sep    => ':',
        fh     => *DATA,
        filter => sub {($_ = shift) =~ s/\\{2}/\\/g; $_}
    );
    $csv->bind_fields(qw(foo bar baz qux));

    while ($csv->get_row()) {
        print $csv->extract('qux'), "\n";
    }
    __DATA__
    0005:A:A2:\\\\string 04\\\\
    0005:A:A2:\\\string 04\\\\
    0005:A:A2:\\string 04\\\\
    0005:A:A2:\string 04\\\\
    When run, this script outputs:
    
    \\string 04\\
    \\string 04\\
    \string 04\\
    \string 04\\
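
To see what the filter's substitution does on its own, here is a minimal sketch that applies s/\\{2}/\\/g to the four sample fields without Text::xSV. The helper name collapse_pairs is hypothetical. It collapses each pair of backslashes into one and leaves an unpaired backslash in place, which is why the first two rows, and likewise the last two, come out the same.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the substitution used in the filter.
sub collapse_pairs {
    my ($text) = @_;
    (my $out = $text) =~ s/\\{2}/\\/g;    # work on a copy, return the result
    return $out;
}

# A <<'END' here-document keeps every backslash literal.
my @samples = split /\n/, <<'END';
\\\\string 04\\\\
\\\string 04\\\\
\\string 04\\\\
\string 04\\\\
END

print collapse_pairs($_), "\n" for @samples;
```

Whether a lone backslash should be kept or dropped depends on how strictly the escape rule is meant to apply; this sketch keeps it, as the filter does.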
    

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)
    
      This looks good to me...
      I will give it a try.

      Thanks for the response.

      Charles
      (The quiet Monk)

Re: Text::CSV_XS -- Bug or wrong usage?
by diotalevi (Canon) on Mar 27, 2003 at 17:43 UTC

      Could you detail the other Text::CSV_XS bugs you know about please?

      I was just beginning to get comfortable with using this module rather than regexes for parsing CSV data, given its superior performance over the other modules available, when I saw the OP's bug. It's a pain, but most of the stuff I am playing with doesn't use escapes, never mind escaped escapes, so I wasn't too bothered. If there are more bugs, though, it would be nice to know what they are.

      It would also be useful to know a) whether you have reported them to the author, and b) what sort of response you received, if any.



        I tried Text::CSV_XS again and had zero problems. All I can figure is that maybe somewhere along the line it had something to do with seemingly escaped metacharacters, so things like "\\\t" (a backslash followed by a tab) might have confused it. I do recall that at the time, identical input was in some cases handled by Text::CSV and not by Text::CSV_XS, though I can't seem to find what the failure mode was just now.

      For what definition of works? Text::CSV is unable to handle returns embedded in fields. See Text::xSV instead.

        My operational definition of works for CSV data has never had to include records with internal newlines. Also, a test on one batch of sample data just works now so either I was doing something incorrect in the past or this sample is somehow more correct than the previous samples. The second scenario is entirely possible since my current data only covers a few portions of Minneapolis while previous data has been constrained to Hennepin county (Minneapolis and some suburbs), just Minneapolis or all of Minnesota.

        So... Text::CSV_XS is working flawlessly right now for me though past experience has had me switch to the plain-perl Text::CSV.

        And as an additional plus, it was written by our missing monk. And I can state from personal experience using it, that it is a well put together module and that it works.

        TStanley
        --------
        It is God's job to forgive Osama Bin Laden. It is our job to arrange the meeting -- General Norman Schwartzkopf