Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses

Re: Advances SQL calculations in CSV file

by Roger (Parson)
on Aug 31, 2005 at 17:12 UTC ( #488153=note: print w/replies, xml ) Need Help??

in reply to Advances SQL calculations in CSV file

So what have you tried so far? Where is your Perl question? This post is just a "work request", it is not a genuine Perl question.

You can use split to split your record if you want to do it the easy way, or you can use Text::CSV_XS module to do a more capable splitting, or you can use DBD::CSV to read the CSV file via SQL. There are many many ways to do it.

  • Comment on Re: Advances SQL calculations in CSV file

Replies are listed 'Best First'.
Re^2: Advances SQL calculations in CSV file
by DrAxeman (Scribe) on Aug 31, 2005 at 17:53 UTC
    Sorry. I didn't mean for that to sound like a job request. I was more concerned with properly describing my needs that I was at formating my request for suggestions.

    I have other scripts that calculate averages for these columns, and I will include that code here. I would like to use DBD::CSV on this, but lack the SQL experience to properly script this with out looping over and over in the file.
    # Connect to the database, (the directory containing our csv file( +s)) my $dbh = DBI->connect("DBI:CSV:f_dir=.;csv_eol=\n;"); # Associate our csv file with the table name 'results' $dbh->{'csv_tables'}->{'results'} = { 'file' => "$CWD/$name"}; my $sth = $dbh->prepare("SELECT * FROM results WHERE 1=0"); $sth->execute; my @origcols = @{$sth->{NAME}}; my @cols; # = @origcols; shift @origcols; # Eliminate First Column foreach ( @origcols ) { push ( @cols , $_ ) unless /Bandwidth.*|MSTCPLoop.*/ ; # Remove + unwanted columns }; my $avgSQL = 'SELECT ' . join(', ', map { "avg($_) \n" } @cols ) . ' FROM results'; my @avgs = $dbh->selectrow_array($avgSQL); my %avgsHash; @avgsHash{ @cols } = @avgs; # uses a hash slice to populate %avgs +Hash for ( @cols ) { (my $changed = $_) =~ s/_/","/; printf STATS ("\"%s\",%f\n", $changed, $avgsHash{$_}); }

    Honestly, I have only made a couple of attempts at this. I'm really stumped on how to format the SQL query. I think that is someone could point me in the direction the SQL query should look like, I'd be able to create it.
      SQL is the unnecessary complexity you don't want in such a simple task. Besides DBD::CSV implements a very limited subset of SQL.

      What you should do is just to have a single pass over your file reading a line at the time, build your hash table of statistics along the way while splitting the records, and print out the stats at the end.

      my %stats; open MYFILE, "<data.txt" or die "Can not open file: $!"; chomp(my $heading = <MYFILE>); my @col = split /,/, $heading; while (my $line = <MYFILE>) { # build a hash for the input record chomp($line); my @rec = split /,/, $line; my %rec; @rec{@col} = @rec; # collect stats $stats{$rec{rec_id}}{BAL} += $rec{bal}; $stats{$rec{rec_id}}{NUM} ++; } # print the stats here for my $id (keys %rec) { if ($rec{$id}{NUM}) { print "$id => ", $rec{$id}{BAL}/$rec{$id}{NUM}, "\n" } } close MYFILE;
        I'm getting the following error, but everything looks initialized to me..

        Use of uninitialized value in hash element at ./ line 19, <MYFIL +E> line 994. Use of uninitialized value in addition (+) at ./ line 19, <MYFIL +E> line 994. Use of uninitialized value in hash element at ./ line 20, <MYFIL +E> line 994.

Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://488153]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others surveying the Monastery: (4)
As of 2022-10-05 00:03 GMT
Find Nodes?
    Voting Booth?
    My preferred way to holiday/vacation is:

    Results (20 votes). Check out past polls.