http://qs321.pair.com?node_id=479084

Tatnall has asked for the wisdom of the Perl Monks concerning the following question:

Updated below: I've got over 1 million records to go through as I normalize tables, which means I need to use DBI's fetchall_arrayref $max_size to keep memory usage in check unless there is a better way.

In my loop, I'm getting this error: Can't coerce array into hash at line 44. and can't figure out why since the same code worked minus the while statement for $max_size.

# Assuming: # # Table1 # +-----+-----------+ # | id | Name | # +-----+-----------+ # | 1 | Bill | # +-----+-----------+ # | 2 | Bob | # +-----+-----------+ # | 3 | Mike | # +-----+-----------+ # # Table2 before # +-----+-----------+ # | id | Name | # +-----+-----------+ # | ? | Mike | # +-----+-----------+ # | ? | Bob | # +-----+-----------+ # | ? | Bill | # +-----+-----------+ # # Table2 after # +-----+-----------+ # | id | Name | # +-----+-----------+ # | ? | 3 | # +-----+-----------+ # | ? | 2 | # +-----+-----------+ # | ? | 1 | # +-----+-----------+
#!/usr/local/bin/perl -w # print "Content-type: text/html\n\n"; use lib "/home/champ1/public_html/cgi-bin"; use warnings; use CGI::Carp qw(fatalsToBrowser); use strict; use CGI qw/:cgi/; use Data::Dumper; use Common; use Validate; use HTML::Template; use HTML::FillInForm; $|=1; #------------------------- declare -------------------------- my $query = new CGI; my ($sth, $stmt, $template, $hdrflg, $fvalues); #$hdrflg = + is set when a header is first written my ($dbh, $keylock, $basehref) = dbconnect(); #f $stmt = "SELECT id, name FROM table1"; $sth = execute_it($dbh,$stmt); my $table1 = $sth->fetchall_arrayref({}); my $max_rows = 2; my ($i, $j); $stmt = "SELECT id, name FROM table2"; $sth = $dbh->prepare($stmt); $sth->execute(); while (my $table2 = $sth->fetchall_arrayref(undef, $max_rows)) { for $i ( 0 .. $#$table2 ) { for $j ( 0 .. $#$table1 ) { ###line 44### if ( $table2->[$i]{'name'} eq $table1->[$j]{'name'} ) { $stmt = ("UPDATE table2 SET name = ? WHERE id = '$ +table2->[$i]{'id'}'"); $sth = $dbh->prepare($stmt); $sth->execute ( $table1->[$j]{'id'}); } } } }; close_db(); exit();

I used the super search and found one node, which wasn't very helpful. http://www.perlmonks.com/index.pl?node_id=320547

Update:

Thanks to gmax here is the code I ended up using.

It's currently running through the rows. :)

$stmt = "SELECT id, name FROM table1"; $sth = execute_it($dbh,$stmt); my $table1 = $sth->fetchall_arrayref({}); my ($i, $j, $sth2); $stmt = "SELECT id, name FROM table2"; $sth = $dbh->prepare($stmt); $sth->execute(); print "Content-type: text/plain \n\n"; my $max_rows = 500; my $rows = []; # cache for batches of rows while ( my $row = ( shift( @$rows ) || shift( @{$rows=$sth->fe +tchall_arrayref({},$max_rows) || []} ) ) ) { for ( $row ) { for $j ( 0 .. $#$table1 ) { if ( $row->{'name'} eq $table1->[$j]{'name'} ) { $stmt = qq/UPDATE table2 SET name = ? WHERE id = " +$row->{id}"/; print "Here.$j".Dumper ($stmt)."\n"; $sth2 = $dbh->prepare($stmt); $sth2->execute ( $table1->[$j]{'id'}); } } } }; close_db(); exit();

READMORE tags added by Arunbear

Replies are listed 'Best First'.
Re: DBI fetchall_arrayref using $max_size doesn't dereference normally.
by gmax (Abbot) on Jul 28, 2005 at 19:21 UTC

    If you call $sth->fetchall_arrayref(undef,$max_rows) like you do, then your result will be an array of arrays, not an array of hashes, like you would like.

    Moreover, your usage of $max_rows is not recommended. The docs state clearly that you should use a buffer to fetch records using that parameter.

    # example from DBI docs my $max_rows=10_000; my $rows = []; # cache for batches of rows while( my $row = ( shift(@$rows) || shift(@{$rows=$sth->fetchall_arrayref({},$max_rows) || []}) ) ) { ... # do something with $row (not $rows, which is only the cache) }

    Last, but not least, be aware that you are using the same statement handler for a outer and an inner loop. That is a mistake! The second "prepare" will reset $sth, and it will not continue fetching records. You must use two of them, one for fetching records, and one for updating.

    Please see:

    Update
    Before trusting your code with 1 million records, try it with some smaller data set, to ensure that it's doing what you need.

     _  _ _  _  
    (_|| | |(_|><
     _|   
    
      gmax,

      The code is currently at work, thank you. I'm sure there is an even better way than the way I modified the code per your suggestion since I don't understand all of what is happening or why.

      If you or someone else could unpack the while statement it could be greatly appreciated.
        The docs say that when there are no more rows, fetchall_arrayref returns a reference to an empty array. That would imply your while loop would never terminate (a reference is always true, even if it's a reference to an empty array). gmax's suggested code above fixes that.
Re: DBI fetchall_arrayref using $max_size doesn't dereference normally.
by runrig (Abbot) on Jul 28, 2005 at 19:07 UTC
    table2 is an reference to an array of arrays (you did not supply a hashref as a slice parameter to fetchall_arrayref). table1 is a reference to an array of hash refs because you did supply a hashref as the first argument to fetchall_arrayref.