http://qs321.pair.com?node_id=526823

The DBI docs say that fetchall_arrayref can get records in batches, as a compromise between fetching one-by-one and all-at-once.

The example from the manual goes like this.

my $rows = []; # cache for batches of rows
while( my $row = ( shift(@$rows) ||         # get row from cache, or reload cache:
                   shift(@{ $rows = $sth->fetchall_arrayref(undef, 10_000) || [] })
                 )
     ) {
    ...
}

This idiom makes sense, because it can fetch records at high speed, as shown in Speeding up the DBI.

A few days ago, brother bradcathey asked for my help with a case similar to the idiom in the manual, with the addition of a slice, so that fetchall_arrayref produces an AoH (array of hashes) instead of an AoA (array of arrays). This combination led to an error, as shown in the following code.

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use DBI;

my $driver = shift || 'mysql';
my $db     = 'test';
my $user   = undef;
my $pass   = undef;
my $dbh;

$driver = lc $driver;
if ($driver eq 'mysql') {
    $dbh = DBI->connect('DBI:mysql:' . $db
                . ";mysql_read_default_file=$ENV{HOME}/.my.cnf",
            $user, $pass, {RaiseError => 1})
        or die "Can't connect: $DBI::errstr\n";
    printf "DBI: %s - DBD::mysql: %s\n", $DBI::VERSION, $DBD::mysql::VERSION;
}
elsif ($driver eq 'sqlite') {
    $db = 'test.db';
    if (-f $db) {
        unlink $db;
    }
    $dbh = DBI->connect('DBI:SQLite:' . $db, $user, $pass, {RaiseError => 1})
        or die "Can't connect: $DBI::errstr\n";
    printf "DBI: %s - DBD::SQLite: %s\n", $DBI::VERSION, $DBD::SQLite::VERSION;
}
else {
    die "driver $driver not supported in this test\n";
}

my $max_fields  = 3;
my $field_size  = 9;
my $max_records = 10;
my $max_commit  = 5;
my $text        = 'abcdefghi';
my $inserted    = 0;

my $create_query = qq{CREATE TABLE testdbi ( }
    . (join q{,}, map { qq{id$_ char($field_size) not null} } (1 .. $max_fields))
    . qq{, primary key (id1))};

{
    local $dbh->{PrintError} = 0;
    eval { $dbh->do(qq{DROP TABLE testdbi}) };
}

$dbh->do(qq{begin});
$dbh->do($create_query);

my $sth = $dbh->prepare(qq{INSERT INTO testdbi VALUES (}
    . (join q{,}, map { q{?} } (1 .. $max_fields)) . q{)});

for (1 .. $max_records) {
    $inserted += $sth->execute(map { $text++ } (1 .. $max_fields));
    if (($inserted % $max_commit) == 0) {
        $dbh->do(qq{commit});
        print "$inserted\n";
        $dbh->do(qq{begin});
    }
}
$dbh->do(qq{commit});
print "inserted $inserted records\n";

my $max_rows = 3;   # must leave a remainder for the error
                    # (multiples of 2 and 5 work)

print "without slice\n";
my $rowcache = [];
my $select_query = qq{SELECT id1, id2 FROM testdbi ORDER BY id1};
$sth = $dbh->prepare($select_query);
$sth->execute;
my $count = 0;
while (my $aref = ( shift(@$rowcache)
        || shift(@{ $rowcache = $sth->fetchall_arrayref(undef, $max_rows) || [] }) ) )
{
    my $rec = Data::Dumper->Dump([$aref], ['rec']);
    $rec =~ s/\s+/ /g;
    $count++;
    print "$count $rec \n";
}

#---- HERE STARTS THE FAILING CODE ------
print "with slice\n";
$rowcache = [];
$sth = $dbh->prepare($select_query);
$sth->execute;
$count = 0;
while (my $aref = ( shift(@$rowcache)
        || shift(@{ $rowcache = $sth->fetchall_arrayref({}, $max_rows) || [] }) ) )
{
    my $rec = Data::Dumper->Dump([$aref], ['rec']);
    $rec =~ s/\s+/ /g;
    $count++;
    print "$count $rec \n";
}

__END__
$ perl test_dbi.pl mysql
DBI: 1.49 - DBD::mysql: 3.0002
5
10
inserted 10 records
without slice
1 $rec = [ 'abcdefghi', 'abcdefghj' ];
2 $rec = [ 'abcdefghl', 'abcdefghm' ];
3 $rec = [ 'abcdefgho', 'abcdefghp' ];
4 $rec = [ 'abcdefghr', 'abcdefghs' ];
5 $rec = [ 'abcdefghu', 'abcdefghv' ];
6 $rec = [ 'abcdefghx', 'abcdefghy' ];
7 $rec = [ 'abcdefgia', 'abcdefgib' ];
8 $rec = [ 'abcdefgid', 'abcdefgie' ];
9 $rec = [ 'abcdefgig', 'abcdefgih' ];
10 $rec = [ 'abcdefgij', 'abcdefgik' ];
with slice
1 $rec = { 'id1' => 'abcdefghi', 'id2' => 'abcdefghj' };
2 $rec = { 'id1' => 'abcdefghl', 'id2' => 'abcdefghm' };
3 $rec = { 'id1' => 'abcdefgho', 'id2' => 'abcdefghp' };
4 $rec = { 'id1' => 'abcdefghr', 'id2' => 'abcdefghs' };
5 $rec = { 'id1' => 'abcdefghu', 'id2' => 'abcdefghv' };
6 $rec = { 'id1' => 'abcdefghx', 'id2' => 'abcdefghy' };
7 $rec = { 'id1' => 'abcdefgia', 'id2' => 'abcdefgib' };
8 $rec = { 'id1' => 'abcdefgid', 'id2' => 'abcdefgie' };
9 $rec = { 'id1' => 'abcdefgig', 'id2' => 'abcdefgih' };
10 $rec = { 'id1' => 'abcdefgij', 'id2' => 'abcdefgik' };
DBD::mysql::st fetchall_arrayref failed: fetch() without execute() at test_dbi.pl line 106.
DBD::mysql::st fetchall_arrayref failed: fetch() without execute() at test_dbi.pl line 106.

$ perl test_dbi.pl sqlite
DBI: 1.49 - DBD::SQLite: 1.09
5
10
inserted 10 records
without slice
1 $rec = [ 'abcdefghi', 'abcdefghj' ];
2 $rec = [ 'abcdefghl', 'abcdefghm' ];
3 $rec = [ 'abcdefgho', 'abcdefghp' ];
4 $rec = [ 'abcdefghr', 'abcdefghs' ];
5 $rec = [ 'abcdefghu', 'abcdefghv' ];
6 $rec = [ 'abcdefghx', 'abcdefghy' ];
7 $rec = [ 'abcdefgia', 'abcdefgib' ];
8 $rec = [ 'abcdefgid', 'abcdefgie' ];
9 $rec = [ 'abcdefgig', 'abcdefgih' ];
10 $rec = [ 'abcdefgij', 'abcdefgik' ];
with slice
1 $rec = { 'id1' => 'abcdefghi', 'id2' => 'abcdefghj' };
2 $rec = { 'id1' => 'abcdefghl', 'id2' => 'abcdefghm' };
3 $rec = { 'id1' => 'abcdefgho', 'id2' => 'abcdefghp' };
4 $rec = { 'id1' => 'abcdefghr', 'id2' => 'abcdefghs' };
5 $rec = { 'id1' => 'abcdefghu', 'id2' => 'abcdefghv' };
6 $rec = { 'id1' => 'abcdefghx', 'id2' => 'abcdefghy' };
7 $rec = { 'id1' => 'abcdefgia', 'id2' => 'abcdefgib' };
8 $rec = { 'id1' => 'abcdefgid', 'id2' => 'abcdefgie' };
9 $rec = { 'id1' => 'abcdefgig', 'id2' => 'abcdefgih' };
10 $rec = { 'id1' => 'abcdefgij', 'id2' => 'abcdefgik' };

As you can see from the sample output, the example fails when using both a slice and a row count with DBD::mysql, but it does not fail with DBD::SQLite. However, when $max_records is raised to 1,000, the script fails with SQLite2 as well (segmentation fault).

While initially I thought there might be a bug in DBD::mysql, this latter fact made me suspect that using fetchall_arrayref with a slice and a row count together was never an intended behavior. Why? Because the row count device is intended for speed, while the slice (which yields a hashref per record) is intended for clarity, at a speed penalty that works against the purpose of the row count option.

Thus, I advised bradcathey to use the recipe described in DBI recipes under "binding a hash", i.e. using bind_col with a static hash. This way you get clarity without paying much of a speed penalty.
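For reference, the bound-hash recipe looks roughly like this. It is a minimal sketch, assuming the $dbh and testdbi table from the script above: the columns are bound, in order, to the value slots of a single %row hash, so each fetch refreshes the same hash in place instead of allocating a fresh hashref per record.

```perl
# Sketch of the "binding a hash" recipe (assumes $dbh and the testdbi
# table from the test script above).
my %row;
my $sth = $dbh->prepare(q{SELECT id1, id2 FROM testdbi ORDER BY id1});
$sth->execute;

# Bind each selected column, in order, to the matching slot of %row.
# NAME_lc gives the lowercased column names of the executed statement.
$sth->bind_columns( \( @row{ @{ $sth->{NAME_lc} } } ) );

while ( $sth->fetch ) {
    # %row is overwritten on every fetch: hash clarity at fetch() speed.
    print "$row{id1} - $row{id2}\n";
}
```

The `\( @row{...} )` construct distributes the reference operator over the hash slice, producing one scalar ref per column for bind_columns.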

So, what do you think? Is it a bug, or wrong usage?

 _  _ _  _  
(_|| | |(_|><
 _|   

Re: fetchall_arrayref with slice and row count
by blokhead (Monsignor) on Jan 31, 2006 at 19:04 UTC
    I found the same thing to be true in DBI::fetchall_arrayref() error. I guess you could say that the general consensus was that it was a bug. I think it would be nice to have the best of both worlds (slice + limit).

    blokhead

      Not a bug. So wrong usage.
      while ( $sth->{Active}
              and $rowcache = $sth->fetchall_arrayref({}, $max_rows) ) {
          ...
      }
      See the DBI documentation: the fetch that retrieves the last available row in a fetchall will call $sth->finish, so the handle is no longer Active, and one more call to fetchall_arrayref raises the "fetch() without execute()" error seen above.
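      Folding that guard back into the original cache idiom gives something like the following sketch (untested against every driver; the $sth->{Active} check simply skips the refill once the final fetch has finished the handle, which is what the failing example was missing):

```perl
# Batched fetch with a slice, guarded against refilling a finished handle.
# Assumes $sth has already been prepared and executed, and $max_rows is set.
my $rowcache = [];
while ( my $href =
        ( shift(@$rowcache)
          || ( $sth->{Active}   # handle still live: safe to fetch another batch
               && shift(@{ $rowcache = $sth->fetchall_arrayref({}, $max_rows) || [] }) ) ) )
{
    # $href is a hashref keyed by column name, e.g. { id1 => ..., id2 => ... }
}
```

      When the handle is inactive, the `&&` short-circuits to a false value and the loop ends cleanly instead of calling fetchall_arrayref again.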