http://qs321.pair.com?node_id=526823

The DBI docs say that fetchall_arrayref can get records in batches, as a compromise between fetching one-by-one and all-at-once.

The example from the manual goes like this.

my $rows = []; # cache for batches of rows
while( my $row = ( shift(@$rows) ||         # get row from cache, or reload cache:
                   shift(@{ $rows = $sth->fetchall_arrayref(undef, 10_000) || [] })
                 )
     ) {
    ...
}

This idiom makes sense, because it can fetch records at high speed, as shown in Speeding up the DBI.

A few days ago, brother bradcathey asked for my help with a case similar to the idiom in the manual, with the addition of a slice, so that fetchall_arrayref produces an AoH (array of hashes) instead of an AoA (array of arrays). This combination led to an error, as shown in the following code.

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use DBI;

my $driver = shift || 'mysql';
my $db     = 'test';
my $user   = undef;
my $pass   = undef;
my $dbh;

$driver = lc $driver;
if ($driver eq 'mysql') {
    $dbh = DBI->connect('DBI:mysql:' . $db
                . ";mysql_read_default_file=$ENV{HOME}/.my.cnf",
            $user, $pass, {RaiseError => 1})
        or die "Can't connect: $DBI::errstr\n";
    printf "DBI: %s - DBD::mysql: %s\n", $DBI::VERSION, $DBD::mysql::VERSION;
}
elsif ($driver eq 'sqlite') {
    $db = 'test.db';
    if (-f $db) {
        unlink $db;
    }
    $dbh = DBI->connect('DBI:SQLite:' . $db, $user, $pass, {RaiseError => 1})
        or die "Can't connect: $DBI::errstr\n";
    printf "DBI: %s - DBD::SQLite: %s\n", $DBI::VERSION, $DBD::SQLite::VERSION;
}
else {
    die "driver $driver not supported in this test\n";
}

my $max_fields  = 3;
my $field_size  = 9;
my $max_records = 10;
my $max_commit  = 5;
my $text        = 'abcdefghi';
my $inserted    = 0;

my $create_query = qq{CREATE TABLE testdbi ( }
    . (join q{,}, map { qq{id$_ char($field_size) not null} } (1 .. $max_fields))
    . qq{, primary key (id1))};

{
    local $dbh->{PrintError} = 0;
    eval { $dbh->do(qq{DROP TABLE testdbi}) };
}

$dbh->do(qq{begin});
$dbh->do($create_query);

my $sth = $dbh->prepare(qq{INSERT INTO testdbi VALUES (}
    . (join q{,}, map { q{?} } (1 .. $max_fields)) . q{)});

for (1 .. $max_records) {
    $inserted += $sth->execute(map { $text++ } (1 .. $max_fields));
    if (($inserted % $max_commit) == 0) {
        $dbh->do(qq{commit});
        print "$inserted\n";
        $dbh->do(qq{begin});
    }
}
$dbh->do(qq{commit});
print "inserted $inserted records\n";

my $max_rows = 3;   # must leave a remainder for the error
                    # (multiples of 2 and 5 work)

print "without slice\n";
my $rowcache = [];
my $select_query = qq{SELECT id1, id2 FROM testdbi ORDER BY id1};
$sth = $dbh->prepare($select_query);
$sth->execute;
my $count = 0;
while (my $aref = ( shift(@$rowcache)
        || shift(@{ $rowcache = $sth->fetchall_arrayref(undef, $max_rows) || [] }) ) )
{
    my $rec = Data::Dumper->Dump([$aref], ['rec']);
    $rec =~ s/\s+/ /g;
    $count++;
    print "$count $rec \n";
}

#---- HERE STARTS THE FAILING CODE ------
print "with slice\n";
$rowcache = [];
$sth = $dbh->prepare($select_query);
$sth->execute;
$count = 0;
while (my $aref = ( shift(@$rowcache)
        || shift(@{ $rowcache = $sth->fetchall_arrayref({}, $max_rows) || [] }) ) )
{
    my $rec = Data::Dumper->Dump([$aref], ['rec']);
    $rec =~ s/\s+/ /g;
    $count++;
    print "$count $rec \n";
}

__END__
$ perl test_dbi.pl mysql
DBI: 1.49 - DBD::mysql: 3.0002
5
10
inserted 10 records
without slice
1 $rec = [ 'abcdefghi', 'abcdefghj' ];
2 $rec = [ 'abcdefghl', 'abcdefghm' ];
3 $rec = [ 'abcdefgho', 'abcdefghp' ];
4 $rec = [ 'abcdefghr', 'abcdefghs' ];
5 $rec = [ 'abcdefghu', 'abcdefghv' ];
6 $rec = [ 'abcdefghx', 'abcdefghy' ];
7 $rec = [ 'abcdefgia', 'abcdefgib' ];
8 $rec = [ 'abcdefgid', 'abcdefgie' ];
9 $rec = [ 'abcdefgig', 'abcdefgih' ];
10 $rec = [ 'abcdefgij', 'abcdefgik' ];
with slice
1 $rec = { 'id1' => 'abcdefghi', 'id2' => 'abcdefghj' };
2 $rec = { 'id1' => 'abcdefghl', 'id2' => 'abcdefghm' };
3 $rec = { 'id1' => 'abcdefgho', 'id2' => 'abcdefghp' };
4 $rec = { 'id1' => 'abcdefghr', 'id2' => 'abcdefghs' };
5 $rec = { 'id1' => 'abcdefghu', 'id2' => 'abcdefghv' };
6 $rec = { 'id1' => 'abcdefghx', 'id2' => 'abcdefghy' };
7 $rec = { 'id1' => 'abcdefgia', 'id2' => 'abcdefgib' };
8 $rec = { 'id1' => 'abcdefgid', 'id2' => 'abcdefgie' };
9 $rec = { 'id1' => 'abcdefgig', 'id2' => 'abcdefgih' };
10 $rec = { 'id1' => 'abcdefgij', 'id2' => 'abcdefgik' };
DBD::mysql::st fetchall_arrayref failed: fetch() without execute() at test_dbi.pl line 106.
DBD::mysql::st fetchall_arrayref failed: fetch() without execute() at test_dbi.pl line 106.

$ perl test_dbi.pl sqlite
DBI: 1.49 - DBD::SQLite: 1.09
5
10
inserted 10 records
without slice
1 $rec = [ 'abcdefghi', 'abcdefghj' ];
2 $rec = [ 'abcdefghl', 'abcdefghm' ];
3 $rec = [ 'abcdefgho', 'abcdefghp' ];
4 $rec = [ 'abcdefghr', 'abcdefghs' ];
5 $rec = [ 'abcdefghu', 'abcdefghv' ];
6 $rec = [ 'abcdefghx', 'abcdefghy' ];
7 $rec = [ 'abcdefgia', 'abcdefgib' ];
8 $rec = [ 'abcdefgid', 'abcdefgie' ];
9 $rec = [ 'abcdefgig', 'abcdefgih' ];
10 $rec = [ 'abcdefgij', 'abcdefgik' ];
with slice
1 $rec = { 'id1' => 'abcdefghi', 'id2' => 'abcdefghj' };
2 $rec = { 'id1' => 'abcdefghl', 'id2' => 'abcdefghm' };
3 $rec = { 'id1' => 'abcdefgho', 'id2' => 'abcdefghp' };
4 $rec = { 'id1' => 'abcdefghr', 'id2' => 'abcdefghs' };
5 $rec = { 'id1' => 'abcdefghu', 'id2' => 'abcdefghv' };
6 $rec = { 'id1' => 'abcdefghx', 'id2' => 'abcdefghy' };
7 $rec = { 'id1' => 'abcdefgia', 'id2' => 'abcdefgib' };
8 $rec = { 'id1' => 'abcdefgid', 'id2' => 'abcdefgie' };
9 $rec = { 'id1' => 'abcdefgig', 'id2' => 'abcdefgih' };
10 $rec = { 'id1' => 'abcdefgij', 'id2' => 'abcdefgik' };

As you can see from the sample output, the example fails when using both a slice and a row count with DBD::mysql, but it does not fail with DBD::SQLite. However, when $max_records is raised to 1,000, the script fails with SQLite2 as well (segmentation fault).

While initially I thought there might be a bug in DBD::mysql, this latter fact made me suspect that using fetchall_arrayref with a slice and a row count together was never an intended behavior. Why? Because the row count device is intended for speed, while the slice (which yields a hashref per record) is intended for clarity, at a speed penalty that works against the purpose of the row count option.

Thus, I advised bradcathey to use the recipe described in DBI recipes under "binding a hash", i.e. using bind_col with a static hash. This way you get clarity without paying much of a speed penalty.
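For reference, the bound-hash recipe looks roughly like this. It is a minimal sketch, assuming the $dbh and testdbi table from the script above: the columns are bound, in order, to the value slots of a single %row hash, so each fetch refreshes the same hash in place instead of allocating a fresh hashref per record.

```perl
# Sketch of the "binding a hash" recipe (assumes $dbh and the testdbi
# table from the test script above).
my %row;
my $sth = $dbh->prepare(q{SELECT id1, id2 FROM testdbi ORDER BY id1});
$sth->execute;

# Bind each selected column, in order, to the matching slot of %row.
# NAME_lc gives the lowercased column names of the executed statement.
$sth->bind_columns( \( @row{ @{ $sth->{NAME_lc} } } ) );

while ( $sth->fetch ) {
    # %row is overwritten on every fetch: hash clarity at fetch() speed.
    print "$row{id1} - $row{id2}\n";
}
```

The `\( @row{...} )` construct distributes the reference operator over the hash slice, producing one scalar ref per column for bind_columns.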

So, what do you think? Is it a bug, or wrong usage?

 _  _ _  _  
(_|| | |(_|><
 _|   

Re: fetchall_arrayref with slice and row count
by blokhead (Monsignor) on Jan 31, 2006 at 19:04 UTC
    I found the same thing to be true in DBI::fetchall_arrayref() error. I guess you could say that the general consensus was that it was a bug. I think it would be nice to have the best of both worlds (slice + limit).

    blokhead

      Not a bug. So wrong usage.
      while ( $sth->{Active}
              and $rowcache = $sth->fetchall_arrayref({}, $max_rows) ) {
          ...
      }
      See the DBI documentation: the fetch that retrieves the last available row in a fetchall will call $sth->finish, so the handle is no longer Active, and one more call to fetchall_arrayref raises the "fetch() without execute()" error seen above.
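      Folding that guard back into the original cache idiom gives something like the following sketch (untested against every driver; the $sth->{Active} check simply skips the refill once the final fetch has finished the handle, which is what the failing example was missing):

```perl
# Batched fetch with a slice, guarded against refilling a finished handle.
# Assumes $sth has already been prepared and executed, and $max_rows is set.
my $rowcache = [];
while ( my $href =
        ( shift(@$rowcache)
          || ( $sth->{Active}   # handle still live: safe to fetch another batch
               && shift(@{ $rowcache = $sth->fetchall_arrayref({}, $max_rows) || [] }) ) ) )
{
    # $href is a hashref keyed by column name, e.g. { id1 => ..., id2 => ... }
}
```

      When the handle is inactive, the `&&` short-circuits to a false value and the loop ends cleanly instead of calling fetchall_arrayref again.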