PerlMonks  

Text::CSV and escaped quotes

by mldvx4 (Friar)
on Sep 28, 2023 at 23:37 UTC ( [id://11154737] : perlquestion )

mldvx4 has asked for the wisdom of the Perl Monks concerning the following question:

Thanks, Tux, for Text::CSV. It keeps coming in handy both directly and indirectly.

I have a question about escaped quotes in incoming files. I would have expected the output for the three data lines below to be identical, but the second line misses the escaped single quote. Is an escaped single quote some kind of faux pas, or even a mistake, within the data, or is the problem in something that Text::CSV draws on?

#!/usr/bin/perl

use Text::CSV qw(csv);
use strict;
use warnings;

my $csv = Text::CSV->new ({ binary => 1, auto_diag => 1 });
my $o = \*STDOUT;

# Read/parse CSV
while (my $row = $csv->getline (\*DATA)) {
    my @selected = ( splice(@{$row}, 0, 2), splice(@{$row}, -6, 2) );
    $csv->say($o, \@selected);
}

exit(0);

__DATA__
8, 9, NULL, 'Filler', '555-999', '77:88', 0, 0, 0, 0
8, 9, NULL, 'Filler', '555-999', '77:88', 'A \' B , C', 0, 0, 0
8, 9, NULL, 'Filler', '555-999', '77:88', 0, 0, 0, 0

This is based on something found in the wild.

Replies are listed 'Best First'.
Re: Text::CSV and escaped quotes
by swl (Parson) on Sep 29, 2023 at 00:00 UTC

    Check the escape_char setting.

    Adding this line after the loop shows that the escape_char is " (a double quote), not a backslash.

    say $csv->escape_char;

    Getting it to import also needs a few other settings, so perhaps the auto_diag process needs work, or this is just difficult data to auto diagnose.

    I also re-arranged the code to work on my machine and added use 5.010; so that say is available.

    use strict;
    use warnings;
    use 5.010;

    use Text::CSV qw(csv);

    my $csv = Text::CSV->new ({
        binary           => 1,
        auto_diag        => 1,
        escape_char      => '\\',
        quote_char       => "'",
        allow_whitespace => 1
    });
    my $o = \*STDOUT;

    # Read/parse CSV
    while (my $row = $csv->getline (\*DATA)) {
        my @selected = ( splice(@{$row}, 0, 2), splice(@{$row}, -6, 2) );
        $csv->say($o, \@selected);
    }

    say $csv->escape_char;

    exit(0);

    __DATA__
    8, 9, NULL, 'Filler', '555-999', '77:88', 0, 0, 0, 0
    8, 9, NULL, 'Filler', '555-999', '77:88', 'A \' B , C', 0, 0, 0
    8, 9, NULL, 'Filler', '555-999', '77:88', 0, 0, 0, 0

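
For anyone wanting to see why the default settings trip over this data, here is a minimal sketch that parses only the problem line twice: once with the defaults and once with the three settings used above. The helper name parse_fields is made up for this illustration and is not part of Text::CSV.

```perl
use strict;
use warnings;
use Text::CSV;

# The problem line from the question, with its backslash-escaped single quote.
my $line = q{8, 9, NULL, 'Filler', '555-999', '77:88', 'A \' B , C', 0, 0, 0};

# Hypothetical helper: parse $line with the given options, return the fields.
sub parse_fields {
    my (%opts) = @_;
    my $csv = Text::CSV->new({ binary => 1, %opts })
        or die "Cannot create parser: " . Text::CSV->error_diag;
    $csv->parse($line) or die "Parse failed: " . $csv->error_diag;
    return $csv->fields;
}

# Defaults: quote_char and escape_char are both a double quote, so the
# single quotes and the backslash are treated as ordinary data.
my @default = parse_fields();

# Settings from the reply above: single-quoted fields, backslash escapes,
# and whitespace around separators ignored.
my @adjusted = parse_fields(
    quote_char       => "'",
    escape_char      => "\\",
    allow_whitespace => 1,
);

printf "defaults: %d fields\n", scalar @default;
printf "adjusted: %d fields, field 7 is <%s>\n", scalar @adjusted, $adjusted[6];
```

With the defaults, the comma inside 'A \' B , C' should act as a separator, so the line yields one field more than its neighbours and the splice offsets pick different columns. With the adjusted settings the line should come back as ten fields like the other two.
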
        The one I keep running into is where they don't bother escaping anything, and just hope the quotes work out.

        There's a setting for that too!
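
The post does not name the setting. One likely candidate is allow_loose_quotes, so the sketch below assumes that is what was meant. It compares a default parser with a loose one on a line whose unquoted field contains bare double quotes.

```perl
use strict;
use warnings;
use Text::CSV;

# An unquoted field that contains bare, unescaped double quotes.
my $line = q{1,foo "bar" baz,42};

my $strict = Text::CSV->new({ binary => 1 });
my $loose  = Text::CSV->new({ binary => 1, allow_loose_quotes => 1 });

# The default parser is expected to reject the line.
print "strict: ", ($strict->parse($line) ? "parsed" : "rejected"), "\n";

# The loose parser should keep the quotes as ordinary data.
if ($loose->parse($line)) {
    print "loose:  <$_>\n" for $loose->fields;
}
```

According to the Text::CSV documentation, stray quotes inside a field that is itself quoted additionally need an escape_char that differs from quote_char, so this option alone may not be enough for every file found in the wild.
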
