PerlMonks  

Printing to file from Text::CSV_XS

by edimusrex (Monk)
on Jan 28, 2016 at 16:47 UTC ( [id://1153896]=perlquestion )

edimusrex has asked for the wisdom of the Perl Monks concerning the following question:

I am having a little bit of an issue using the Text::CSV_XS module. I've only ever used it to read CSV files, never to create them. When I print the output to a file, everything gets dumped onto one line, and the entire string gets wrapped in quotes, so each row is treated as a single column instead of as individual values. Here is the code I am using. I am sure this is probably something simple to resolve, but it's been driving me crazy for a few hours and I need to move on. Thanks in advance for your help.

#!/usr/bin/perl
use warnings;
use strict;
use utf8;
use HTML::Strip;
use Text::CSV_XS qw(csv);

my $csv = Text::CSV_XS->new({ sep_char => "\t" });
my $hs = HTML::Strip->new();
my $file = 'test.html';
my $out = 'out.csv';

open my $fh ,"<", $file or die "Failed to open $file!: $!\n";
open my $io ,">", $out or die "Failed to open $out!: $!\n";

my $flag = 0;
while(my $line = <$fh>) {
    chomp $line;
    if ($line =~ /\<\/Table\>\<Pre\>/){ $flag = 1;}
    elsif ($line =~ /\<A Name\=Footnotes\>\<\/A\>/){ $flag = 0;}
    next if $line =~ /&rarr;/;
    if ($flag) {
        my @data = &cleanThis($line);
        $csv->print($io, \@data);
    }
}
close($fh);
close($io);

sub cleanThis {
    my $string = shift;
    my $clean_text = $hs->parse($string);
    if ($clean_text =~ /^.+(\d+\:.+?[M|K])$/){
        $clean_text = "$clean_text\tNone listed";
    }
    my ($asOf, $filer, $filing, $forOn, $docsSize, $agent) = $clean_text =~ m/(\d+\/\d+\/\d+)\s+(.+)\s+?(\w.+)\s+(\d+\/\d+\/\d+).+?(\d+\:\d.+?)\s+(.+)/;
    my @formated = ($asOf,$filer,$filing,$agent);
    foreach my $trim(@formated){
        $trim =~ s/^\s+|\s+$//g;
    }
    return join("\t",@formated)
}

Here is a sample of the output:

"1/27/16 Advanced Series Trust 497K Prudential Moneymar..Inc""1/27/16 Advisors Series Trust 497K US Bancorp Fund Svcs LLC""1/27/16 Advisors Series Trust 497K US Bancorp Fund Svcs LLC""1/27/16 Advisors Series Trust 497K US Bancorp Fund Svcs LLC""1/27/16 Ark ETF Trust 497K Vintage/FA""1/27/16 Ark ETF Trust 497K Vintage/FA""1/27/16 Delaware Group Cash Reserve 485BPOS DG3/FA""1/27/16 Federated Equity Income Fund Inc N-CSR Federated Admin..Svcs/FA""1/27/16 Federated Inv Series Funds Inc N-CSR Federated Admin..Svcs/FA""1/27/16 Fidelity Advisor Series I N-CSR Publishing Data...Inc/FA""1/27/16 Fidelity Commonwealth Trust N-CSR Publishing Data...Inc/FA""1/27/16 Fidelity Court Street Trust N-CSR Publishing Data...Inc/FA""1/27/16 Fidelity Court Street Trust II N-CSR Publishing Data...Inc/FA""1/27/16 Fidelity Financial Trust N-CSR Publishing Data...Inc/FA""1/27/16 Fidelity MT Vernon Street Trust N-CSR None listed""1/27/16 Fidelity Phillips Street Trust N-CSR Fidelity Aberdeen St..Tr""1/27/16 Fidelity Rutland Square Trust II 497K Fidelity Aberdeen St..Tr""1/27/16 Fidelity Rutland Square Trust II 497K Fidelity MT Vernon S..Tr""1/27/16 Fidelity Salem Street Trust N-CSR Publishing Data...Inc/FA""1/27/16 John Hancock ETF Trust 497K Data Communique Inc./FA"

As you can see, each record is tab-separated, but the entire record gets treated as a single quoted value.


Update
Changing my subroutine to return (@formated) instead of the joined string did the trick. Thanks for the help.

Replies are listed 'Best First'.
Re: Printing to file from Text::CSV_XS
by hippo (Bishop) on Jan 28, 2016 at 16:58 UTC

    If you specify an appropriate eol value in the constructor, print will terminate each record with it, e.g. for Unix-style line endings:

    my $csv = Text::CSV_XS->new({ sep_char => "\t", eol => "\n" });

    Update: The documentation is actually very clear on this:

    When not passed in a generating instance, records are not terminated at all, so it is probably wise to pass something you expect. A safe choice for eol on output is either $/ or \r\n
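To make the effect of eol concrete, here is a minimal sketch that prints the same two rows with and without it. The sample rows and the helper name render_rows are made up for illustration, and an in-memory filehandle stands in for out.csv.

```perl
use strict;
use warnings;
use IO::Handle;
use Text::CSV_XS;

# Hypothetical sample rows, loosely modelled on the data in the question.
my @rows = (
    [ "1/27/16", "497K",  "Vintage/FA" ],
    [ "1/27/16", "N-CSR", "DG3/FA" ],
);

# Illustrative helper (not part of Text::CSV_XS): print all rows through a
# CSV object built with the given extra attributes and return what was written.
sub render_rows {
    my (%extra) = @_;
    my $csv = Text::CSV_XS->new({ sep_char => "\t", %extra })
        or die "Cannot create CSV object: " . Text::CSV_XS->error_diag;
    my $buffer = "";
    open my $fh, ">", \$buffer or die "Cannot open in-memory handle: $!";
    $csv->print($fh, $_) for @rows;
    close $fh;
    return $buffer;
}

my $without_eol = render_rows();             # records run together
my $with_eol    = render_rows(eol => "\n");  # one record per line

print "without eol: ", ($without_eol =~ tr/\n//), " newline(s)\n";
print "with eol:    ", ($with_eol    =~ tr/\n//), " newline(s)\n";
```

Without eol the two records end up on one physical line, which matches the "everything gets dumped to 1 line" symptom in the question.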

      That did work for the line issue yet it is still wrapping the entire line in quotes

        Your subroutine cleanThis joins all the fields into one string. Return a list instead and let Text::CSV_XS handle the joining.
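A small sketch of the difference, using made-up field values: passing one pre-joined string gives print a single field that contains the separator, so the whole field gets quoted, while passing the list gives one field per column. In-memory filehandles are used here only so the two results can be compared side by side.

```perl
use strict;
use warnings;
use IO::Handle;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new({ sep_char => "\t", eol => "\n" })
    or die "Cannot create CSV object: " . Text::CSV_XS->error_diag;

# Hypothetical values standing in for ($asOf, $filing, $agent).
my @fields = ("1/27/16", "497K", "Vintage/FA");

my ($joined_out, $list_out) = ("", "");

# What the original sub effectively did: one field holding embedded tabs.
open my $joined_fh, ">", \$joined_out or die "Cannot open buffer: $!";
$csv->print($joined_fh, [ join("\t", @fields) ]);
close $joined_fh;

# The suggested fix: hand over the fields and let the module join them.
open my $list_fh, ">", \$list_out or die "Cannot open buffer: $!";
$csv->print($list_fh, \@fields);
close $list_fh;

print "joined: $joined_out";
print "list:   $list_out";
```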

        That's because your pre-processing is only passing a single field to the print method. If you don't want the quotes, don't use them. See quote_char.

        I do encourage you to read the documentation. It's very comprehensive.
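As a rough illustration of the quote_char suggestion: even with one field per column, Text::CSV_XS quotes fields that contain a space by default (the quote_space attribute). The sketch below compares the default with quote_char => undef. The sample values and the helper name render_row are assumptions, and dropping the quote character is only reasonable when no field can contain the separator.

```perl
use strict;
use warnings;
use IO::Handle;
use Text::CSV_XS;

# Hypothetical record; the middle field contains spaces.
my @fields = ("1/27/16", "Ark ETF Trust", "497K");

# Illustrative helper (not part of Text::CSV_XS): print one record with the
# given extra attributes and return the resulting line.
sub render_row {
    my (%extra) = @_;
    my $csv = Text::CSV_XS->new({ sep_char => "\t", eol => "\n", %extra })
        or die "Cannot create CSV object: " . Text::CSV_XS->error_diag;
    my $buffer = "";
    open my $fh, ">", \$buffer or die "Cannot open buffer: $!";
    $csv->print($fh, \@fields);
    close $fh;
    return $buffer;
}

my $default_out  = render_row();                     # space triggers quoting
my $no_quote_out = render_row(quote_char => undef);  # quoting suppressed

print "default:    $default_out";
print "no quoting: $no_quote_out";
```

Setting quote_space => 0 is another documented way to stop spaces from triggering quotes while keeping quoting available for fields that really need it.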

Node Type: perlquestion [id://1153896]
Approved by Old_Gray_Bear