PerlMonks  

Re^3: Help parsing a complicated csv

by rmfin730 (Initiate)
on May 16, 2011 at 14:23 UTC


in reply to Re^2: Help parsing a complicated csv
in thread Help parsing a complicated csv

I may have spoken too soon. It appears that this is pulling my data and structuring it correctly when I output to a txt file.

However, I need to be able to pull specific columns and I'm not sure how to do that. I want to be able to do something like:

print hash->{header_name} and have it give me all of the values under that column.

So again, I have columns in a csv file that are kind of "stacked" on top of each other. There is no single header row at the top of the file; the headers for the different column groups appear on different lines of the file.

The headers are always enclosed in <>, so I want to scan through my csv, pull out the headers, and then put the corresponding values from each column into a hash. Reading something like hash->{header} would then give me all of the values in that column.

Sometimes there are 100 rows under a header and sometimes there is just 1. Thanks again for your help, and sorry if this doesn't make sense; it is kind of confusing.

Let me try to explain again what the csv looks like...

<header>, <header>, <header>
value, value, value
value, value
value, value,
<header>, <header>
value, value
value, value
value,

This goes on like this randomly throughout the csv file. I hope that this little picture makes some more sense. I think that we are on the right track but not there yet! Thanks again for everyone's help!

Replies are listed 'Best First'.
Re^4: Help parsing a complicated csv
by linuxer (Curate) on Jul 05, 2011 at 21:32 UTC

    Hi,

    I just read this thread again and saw your reply.

    Assuming that an empty string is not a valid value, I came up with this:

    #! /usr/bin/perl
    use strict;
    use warnings;

    use Text::CSV_XS;

    my $csv = Text::CSV_XS->new({
        binary           => 1,
        allow_whitespace => 1,
    }) or die "Cannot use CSV: " . Text::CSV_XS->error_diag();

    # for testing; in real world, open file and use that handle
    my $fh = \*DATA;

    my (%hash, @hdr);

    while ( my $row = $csv->getline( $fh ) ) {

        # header not yet defined? or 1st cell starts with '<' ==> use row as header
        if ( !@hdr || $row->[0] =~ m/^</ ) {
            @hdr = @{$row};
            next;
        }
        # otherwise try to process data
        else {
            for my $i ( 0 .. $#hdr ) {
                # only add those values which contain at least one character
                # so: no "undef"s or empty strings in result
                # if empty strings are OK or wanted, try to replace length() with defined()
                push @{ $hash{$hdr[$i]} }, ( length $row->[$i] ? $row->[$i] : () );
            }
        }
    }

    # check created data structure
    require Data::Dumper;
    $Data::Dumper::Sortkeys = 1;
    print Data::Dumper::Dumper( \%hash );

    __DATA__
    <A1>, <A2>, <A3>
    a1, aa1, aaa1
    a2, , aaa2
    a3, aa3
    a4
    <B1>, <B2>
    b1, bb1
    b2, bb2
    b3

    That produced a result like this:

    $VAR1 = {
              '<A1>' => [
                          'a1',
                          'a2',
                          'a3',
                          'a4'
                        ],
              '<A2>' => [
                          'aa1',
                          'aa3'
                        ],
              '<A3>' => [
                          'aaa1',
                          'aaa2'
                        ],
              '<B1>' => [
                          'b1',
                          'b2',
                          'b3'
                        ],
              '<B2>' => [
                          'bb1',
                          'bb2'
                        ]
            };
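
    To read a single column back out of that structure, something along these lines might do. This is only a minimal sketch: %hash is hard-coded here with part of the sample result so the snippet runs by itself, and column() is a hypothetical helper name, not part of the script above. Note that the keys keep their angle brackets.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: %hash has the shape of the result shown above.
# It is hard-coded here so that the snippet runs on its own.
my %hash = (
    '<A1>' => [ 'a1', 'a2', 'a3', 'a4' ],
    '<A2>' => [ 'aa1', 'aa3' ],
    '<B1>' => [ 'b1', 'b2', 'b3' ],
);

# Hypothetical helper: return all values stored under a header,
# or an empty list if that header was never seen.
sub column {
    my ( $data, $header ) = @_;
    return @{ $data->{$header} || [] };
}

# All values of one column, joined for printing
print join( ', ', column( \%hash, '<A1>' ) ), "\n";

# Number of values collected under a header
print scalar( @{ $hash{'<B1>'} } ), "\n";

# Walk over every column in sorted header order
for my $header ( sort keys %hash ) {
    print "$header: @{ $hash{$header} }\n";
}
```

    Each hash value is an array reference, so @{ $hash{'<A1>'} } gives the whole column and $hash{'<A1>'}[0] gives its first value.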
