Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister
 
PerlMonks  

Parsing a csv file with multiple data records in it (generated from Window's IOMeter)

by TASdvlper (Monk)
on May 25, 2005 at 19:34 UTC ( [id://460429]=perlquestion: print w/replies, xml ) Need Help??

TASdvlper has asked for the wisdom of the Perl Monks concerning the following question:

All,

This might be a little hard to explain, but I'll try anyway. I have a xls/csv file, that has multiple data records in it. The problem is the records are not grouped together. For example (where '-' is being used to signify a new column), (the numbers are obviously made up)

# - size - reads - writes - read IOPs - write IOPs 1 - 512 - 1000 - 0 - 2000 - 0 2 - 1024 - 0 - 1000 - 0 - 2000 3 - 2048 - 500 - 500 - 1000 - 1000 # - Time Stamp 1 - 2005-05-02 18:16:36:589 2 - 2005-05-03 18:16:36:589 3 - 2005-05-04 18:16:36:589 # - Comments 1 - This test failed 2 - This test passed 3 - This test needs more data.
So, in this case, there are 3 records that I want to import into my db (all the lines that start with 1, 2 and 3). But, as you can see, they are on different lines. Each column name (size, reads, writes, etc) are attributes in my database table.

So, my thought was to make a array of hash references where the array size would be how many records are in the file (because it may vary), and the hash ref would contain all the data for that record. So, my array would look something like:

$myarray[0] = { 'size' => '512', 'reads' => '1000', 'writes' => '0', 'read IOPs' => '2000', 'write IOPS' => '0', 'Time Stamp' => '2005-05-02 18:16:36:589', 'Comments' => 'Thie tst failed' }
I'm just not sure how to parse that file where the data is all on different lines. There is one thing, each line that starts with a "#" is the column name, so that might help in the parsing.

As always - thanks all.

Replies are listed 'Best First'.
Re: Parsing a csv file with multiple data records in it (generated from Window's IOMeter)
by holli (Abbot) on May 25, 2005 at 19:55 UTC
    use strict; use warnings; use Data::Dumper; my @fields; my @data; while (<DATA>) { chomp; if ( /^#/ ) { @fields = split / +- +/; shift @fields; next; } next if /^$/; my @values = split / +- +/; my $index = shift @values; for my $i (0..$#fields) { $data[$index-1]->{$fields[$i]} = $values[$i]; } } print Dumper (\@data); __DATA__ # - size - reads - writes - read IOPs - write IOPs 1 - 512 - 1000 - 0 - 2000 - 0 2 - 1024 - 0 - 1000 - 0 - 2000 3 - 2048 - 500 - 500 - 1000 - 1000 # - Time Stamp 1 - 2005-05-02 18:16:36:589 2 - 2005-05-03 18:16:36:589 3 - 2005-05-04 18:16:36:589 # - Comments 1 - This test failed 2 - This test passed 3 - This test needs more data.
    Output:
    $VAR1 = [ { 'Comments' => 'This test failed', 'read IOPs' => '2000', 'reads' => '1000', 'write IOPs' => '0', 'Time Stamp' => '2005-05-02 18:16:36:589', 'writes' => '0', 'size' => '512' }, { 'Comments' => 'This test passed', 'read IOPs' => '0', 'reads' => '0', 'write IOPs' => '2000', 'Time Stamp' => '2005-05-03 18:16:36:589', 'writes' => '1000', 'size' => '1024' }, { 'Comments' => 'This test needs more data.', 'read IOPs' => '1000', 'reads' => '500', 'write IOPs' => '1000', 'Time Stamp' => '2005-05-04 18:16:36:589', 'writes' => '500', 'size' => '2048' } ];


    holli, /regexed monk/

      for my $i (0..$#fields) { $data[$index-1]->{$fields[$i]} = $values[$i]; }

      I have slices in the brain:

      @{ $data[ $index-1 ] }{ @fields } = @values;

      the lowliest monk

Re: Parsing a csv file with multiple data records in it (generated from Window's IOMeter)
by shemp (Deacon) on May 25, 2005 at 20:02 UTC
    You've already got it 99% percent figured out!
    when you get to a line that starts with a '#', its a list of fields. Split it on the '-' and store the results, those are your field names.
    Then when you parse the lines that dont begin with a '#', also spilt them on the '-'. The data and the columns will match up by position. Then enter the results into your structure, i.e.:
    $myarray[$row_number]->{$column_name} = $data;
      The missing 1% is to skip empty lines, as shown in my code above.


      holli, /regexed monk/

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://460429]
Approved by holli
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others chanting in the Monastery: (3)
As of 2024-04-20 03:38 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found