SQLite vs CDB_File vs BerkeleyDB

Here's the results of some simple performance testing I did on CDB vs BDB (v3) vs SQLite.

The lines marked "SQLite 2" are with large-grained transactions (a commit at the end of the inserts), whereas the other SQLite is with a commit on every row inserted.

Benchmark: timing 30000 iterations of Insert Berkeley, Insert CDB, Ins
+ert SQLite, Insert SQLite 2...
Insert Berkeley: 10 wallclock secs ( 3.26 usr +  5.66 sys =  8.92 CPU)
Insert CDB:  0 wallclock secs ( 0.25 usr +  0.02 sys =  0.27 CPU)
            (warning: too few iterations for a reliable count)
Insert SQLite: 20 wallclock secs ( 9.61 usr +  7.96 sys = 17.57 CPU)
Insert SQLite 2:  6 wallclock secs ( 4.92 usr +  0.08 sys =  5.00 CPU)

Benchmark: timing 30000 iterations of Select Berkeley, Select CDB, Sel
+ect SQLite, Select SQLite 2...
Select Berkeley:  5 wallclock secs ( 2.21 usr +  2.66 sys =  4.87 CPU)
Select CDB:  2 wallclock secs ( 1.10 usr +  0.83 sys =  1.93 CPU)
Select SQLite:  5 wallclock secs ( 4.61 usr +  0.15 sys =  4.76 CPU)
Select SQLite 2:  8 wallclock secs ( 4.70 usr +  0.24 sys =  4.94 CPU)
[download]

Here's the benchmarking code (it's messy - code comments not welcome ;-)

use DBI;
use CDB_File;
use BerkeleyDB;
# use Benchmark   qw(timethese cmpthese);
use Benchmark   qw(timethese);

# do some setup
my $id_lit = 0;
my $id_lit2 = 0;
my $id_bdb = 0;
my $id_cdb = 0;
my $txt = 'a' x 100;

unlink("sqlite.test", "sqlite2.test", "bdb.test", "cdb.test");

my $dbh = DBI->connect('dbi:SQLite:dbname=sqlite.test','','',
               { AutoCommit => 1, RaiseError => 1 });
my $dbh2 = DBI->connect('dbi:SQLite:dbname=sqlite2.test','','',
               { AutoCommit => 0, RaiseError => 1 });
$dbh->do('CREATE TABLE test (id integer PRIMARY KEY, foo varchar(100))
+');
$dbh2->do('CREATE TABLE test (id integer PRIMARY KEY, foo varchar(100)
+)');
$dbh->do('PRAGMA default_synchronous = off');
my $sth_ins = $dbh->prepare('INSERT INTO test (id, foo) values (?, ?)'
+);
my $sth_sel = $dbh->prepare('SELECT foo FROM test where id=?');
my $sth2_ins = $dbh2->prepare('INSERT INTO test (id, foo) values (?, ?
+)');
my $sth2_sel = $dbh2->prepare('SELECT foo FROM test where id=?');

my %ch;
my $cdb = new CDB_File ('cdb.test', "cdb.test.$$") or die "Can't creat
+e";

my %bh;
my $bdb = tie %bh, 'BerkeleyDB::Hash',
                                    -Filename   => 'bdb.test',
                                    -Flags      => DB_CREATE,
                                    ;


sub insertLite {
    $id_lit++;
    $sth_ins->execute($id_lit, $txt);
}

sub insertLite2 {
    $id_lit2++;
    $sth2_ins->execute($id_lit2, $txt);
}

sub insertCDB {
    $id_cdb++;
    $cdb->insert($id_cdb, $txt);
}

sub insertBDB {
    $id_bdb++;
    $bh{$id_bdb} = $txt;
}

sub selectLite {
    $id_lit++;
    $sth_sel->execute($id_lit);
    1 while $sth_sel->fetch;
}

sub selectLite2 {
    $id_lit2++;
    $sth2_sel->execute($id_lit2);
    1 while $sth2_sel->fetch;
}

sub selectCDB {
    $id_cdb++;
    $ch{$id_cdb};
}

my $t1 = timethese(30_000,
                 {
                     'Insert SQLite'    => \&insertLite,
                     'Insert SQLite 2'  => \&insertLite2,
                     'Insert Berkeley'  => \&insertBDB,
                     'Insert CDB'       => \&insertCDB,
                 },
                 );

$dbh2->commit;
$cdb->finish;
$dbh->{AutoCommit} = 0;

$cdb = tie %ch, 'CDB_File', 'cdb.test' or die "tie failed: $!\n";

# cmpthese $t1;

$id_lit = 0;
$id_lit2 = 0;
$id_bdb = 0;
$id_cdb = 0;

my $t2 = timethese(30_000,
                 {
                     'Select SQLite'    => \&selectLite,
                     'Select SQLite 2'  => \&selectLite2,
                     'Select Berkeley'  => \&selectBDB,
                     'Select CDB'       => \&selectCDB,
                 },
                 );
# cmpthese $t2;
[download]

Comment on SQLite vs CDB_File vs BerkeleyDB Select or Download Code

Replies are listed 'Best First'.
New Benchmark: SQLite vs CDB_File vs BerkeleyDB by demerphq (Chancellor) on Mar 20, 2002 at 17:29 UTC
Hi. Well as I stated earlier I had serious reservations about the validity of your benchmark. When I messaged you about it you replied "The point was to use all the modules' simplest interfaces". This is a specious argument. There is no difference between the interface for a DB_HASH or a DB_BTREE. The difference is underneath at the implementation level. And the difference is huge. I reran your benchmark without CDB_File, it wont build cleanly on W32, and produced the following results which incidentally do not completely support my earlier coments ( the file size differences are not what I said they would be.) However they do show that your comparison of DB_HASH against DBD::SQLite is misleading (SQLite gets _creamed_ by a $DB_TREE), and they also show that if you are looking for high speed with a BerkeleyDB then you should definately go with a $DB_BTREE instead of a $DB_HASH. My results were as follows: Benchmark: timing 30000 iterations of I B+Tree, I B+Tree (A), I Hash, I Hash (A), I SQLite, I SQLite 2... I B+Tree: 1 wallclock secs ( 0.88 usr + 0.11 sys = 0.98 CPU) @ 30487.80/s (n=30000) I B+Tree (A): 1 wallclock secs ( 0.95 usr + 0.02 sys = 0.97 CPU) @ 30959.75/s (n=30000) I Hash: 8 wallclock secs ( 2.25 usr + 4.38 sys = 6.63 CPU) @ 4528.30/s (n=30000) I Hash (A): 7 wallclock secs ( 2.31 usr + 4.58 sys = 6.89 CPU) @ 4353.50/s (n=30000) I SQLite: 76 wallclock secs (14.84 usr + 49.66 sys = 64.50 CPU) @ 465.12/s (n=30000) I SQLite 2: 4 wallclock secs ( 3.30 usr + 0.09 sys = 3.39 CPU) @ 8849.56/s (n=30000) Rate I SQLite I Hash (A) I Hash I SQLite 2 I B+Tree I B+Tree (A) I SQLite 465/s -- -89% -90% -95% -98% -98% I Hash (A) 4354/s 836% -- -4% -51% -86% -86% I Hash 4528/s 874% 4% -- -49% -85% -85% I SQLite 2 8850/s 1803% 103% 95% -- -71% -71% I B+Tree 30488/s 6455% 600% 573% 245% -- -2% I B+Tree (A) 30960/s 6556% 611% 584% 250% 2% -- Benchmark: timing 30000 iterations of S B+Tree, S B+Tree(A), S Hash, S Hash(A), S SQLite, S SQLite 2... S B+Tree: 1 wallclock secs ( 0.77 usr + 0.09 sys = 0.86 CPU) @ 34965.03/s (n=30000) S B+Tree(A): 1 wallclock secs ( 0.95 usr + 0.05 sys = 1.00 CPU) @ 30000.00/s (n=30000) S Hash: 2 wallclock secs ( 1.03 usr + 1.63 sys = 2.66 CPU) @ 11295.18/s (n=30000) S Hash(A): 3 wallclock secs ( 1.47 usr + 1.36 sys = 2.83 CPU) @ 10608.20/s (n=30000) S SQLite: 11 wallclock secs ( 6.80 usr + 3.69 sys = 10.49 CPU) @ 2861.23/s (n=30000) S SQLite 2: 4 wallclock secs ( 3.89 usr + 0.09 sys = 3.98 CPU) @ 7530.12/s (n=30000) Rate S SQLite S SQLite 2 S Hash(A) S Hash S B+Tree(A) S B+Tree S SQLite 2861/s -- -62% -73% -75% -90% -92% S SQLite 2 7530/s 163% -- -29% -33% -75% -78% S Hash(A) 10608/s 271% 41% -- -6% -65% -70% S Hash 11295/s 295% 50% 6% -- -62% -68% S B+Tree(A) 30000/s 948% 298% 183% 166% -- -14% S B+Tree 34965/s 1122% 364% 230% 210% 17% -- Resulting File Sizes: bdb.hash.test :5226496 bdb.btree.test :6701056 bdb.hash.a.test :5095424 bdb.btree.a.test :3579904 sqlite.test :3845120 sqlite2.test :2043904 And the code I used to produce this benchmark is use DB_File; use DBI; use Benchmark qw(cmpthese); use strict; use warnings; # do some setup my $id_BDB_hash = 0; my $id_BDB_hash_a = 0; my $id_BDB_btree_a = 0; my $id_BDB_btree = 0; my $id_lit2 = 0; my $id_lit = 0; my $txt = 'a' x 100; unlink( "bdb.hash.test", "bdb.btree.test", "bdb.hash.a.test", "bdb.btree.a.test", "sqlite.test", "sqlite2.test" ); my $dbh = DBI->connect( 'dbi:SQLite:dbname=sqlite.test', '', '', { AutoCommi +t => 1, RaiseError => 1 } ); my $dbh2 = DBI->connect( 'dbi:SQLite:dbname=sqlite2.test', '', '', { AutoComm +it => 0, RaiseError => 1 } ); $dbh->do('CREATE TABLE test (id integer PRIMARY KEY, foo varchar(100)) +'); $dbh2->do('CREATE TABLE test (id integer PRIMARY KEY, foo varchar(100) +)'); $dbh->do('PRAGMA default_synchronous = off'); my $sth_ins = $dbh->prepare('INSERT INTO test (id, foo) values (?, ?) +'); my $sth_sel = $dbh->prepare('SELECT foo FROM test where id=?'); my $sth2_ins = $dbh2->prepare('INSERT INTO test (id, foo) values (?, ? +)'); my $sth2_sel = $dbh2->prepare('SELECT foo FROM test where id=?'); tie my %bdb_btree_a, "DB_File", "bdb.btree.a.test", O_RDWR \| O_CREAT, +0640, $DB_BTREE or die "Cannot open file 'bdb.btree.a.test': $!\n"; tied(%bdb_btree_a)->filter_fetch_key( sub { $_ = unpack( "N", $_ \|\| 0 +) } ); tied(%bdb_btree_a)->filter_store_key( sub { $_ = pack( "N", $_ \|\| 0 ) +} ); tie my %bdb_hash_a, "DB_File", "bdb.hash.a.test", O_RDWR \| O_CREAT, 06 +40, $DB_HASH or die "Cannot open file 'bdb.hash.a.test': $!\n"; tied(%bdb_hash_a)->filter_fetch_key( sub { $_ = unpack( "N", $_ \|\| 0 ) + } ); tied(%bdb_hash_a)->filter_store_key( sub { $_ = pack( "N", $_ \|\| 0 ) } + ); tie my %bdb_hash, "DB_File", "bdb.hash.test", O_RDWR \| O_CREAT, 0640, +$DB_HASH or die "Cannot open file 'bdb.hash.test': $!\n"; tie my %bdb_btree, "DB_File", "bdb.btree.test", O_RDWR \| O_CREAT, 0640 +, $DB_BTREE or die "Cannot open file 'bdb.btree.test': $!\n"; sub insertLite { $sth_ins->execute( $id_lit++, $txt ); } sub insertLite2 { $sth2_ins->execute( $id_lit2++, $txt ); } sub insertBDB_btree_a { $bdb_btree_a{ $id_BDB_btree_a++ } = $txt; } sub insertBDB_hash_a { $bdb_hash_a{ $id_BDB_hash_a++ } = $txt; } sub insertBDB_btree { $bdb_btree{ $id_BDB_btree++ } = $txt; } sub insertBDB_hash { $bdb_hash{ $id_BDB_hash++ } = $txt; } sub selectLite { $sth_sel->execute( $id_lit++ ); 1 while $sth_sel->fetch; } sub selectLite2 { $sth2_sel->execute( $id_lit2++ ); 1 while $sth2_sel->fetch; } sub selectBDB_btree_a { $bdb_btree_a{ $id_BDB_btree_a++ }; } sub selectBDB_hash_a { $bdb_hash_a{ $id_BDB_hash_a++ }; } sub selectBDB_btree { $bdb_btree{ $id_BDB_btree++ }; } sub selectBDB_hash { $bdb_hash{ $id_BDB_hash++ }; } cmpthese( 30_000, { 'I SQLite' => \&insertLite, 'I SQLite 2' => \&insertLite2, 'I Hash' => \&insertBDB_hash, 'I B+Tree' => \&insertBDB_btree, 'I Hash (A)' => \&insertBDB_hash_a, 'I B+Tree (A)' => \&insertBDB_btree_a, }, ); $id_BDB_hash = 0; $id_BDB_hash_a = 0; $id_BDB_btree_a = 0; $id_BDB_btree = 0; $id_lit2 = 0; $id_lit = 0; cmpthese( 30_000, { 'S SQLite' => \&selectLite, 'S SQLite 2' => \&selectLite2, 'S Hash' => \&selectBDB_hash, 'S B+Tree' => \&selectBDB_btree, 'S Hash(A)' => \&selectBDB_hash_a, 'S B+Tree(A)' => \&selectBDB_btree_a, }, ); foreach my $file ( "bdb.hash.test", "bdb.btree.test", "bdb.hash.a.test", "bdb.btree.a +.test", "sqlite.test", "sqlite2.test" ) { print "$file :" . ( -s $file ) . "\n"; } [download] As Matts said, please no comments about the benchmarking code, I know its could be a lot nicer. (Its only a benchmark... :-) Yves / DeMerphq -- Benchmarking things isn't as easy or as straightforward as many people seem to think....	[reply] [d/l]
Re: New Benchmark: SQLite vs CDB_File vs BerkeleyDB by Anonymous Monk on Jan 20, 2011 at 16:09 UTC
Hi, years later, SQLite is better, and there are some optimisations possible. Rate I SQLite I Hash (A) I Hash I SQLite 2 I B+Tree (A +) I B+Tree I SQLite 18927/s -- -50% -51% -78% -80 +% -80% I Hash (A) 37975/s 101% -- -3% -56% -60 +% -60% I Hash 38961/s 106% 3% -- -55% -59 +% -59% I SQLite 2 86957/s 359% 129% 123% -- -9 +% -9% I B+Tree (A) 95238/s 403% 151% 144% 10% - +- 0% I B+Tree 95238/s 403% 151% 144% 10% 0 +% -- Rate S SQLite S Hash(A) S Hash S B+Tree(A) S SQLite +2 S B+Tree S SQLite 32787/s -- -44% -52% -63% -64 +% -68% S Hash(A) 58252/s 78% -- -16% -35% -37 +% -44% S Hash 68966/s 110% 18% -- -23% -25 +% -33% S B+Tree(A) 89552/s 173% 54% 30% -- -3 +% -13% S SQLite 2 92308/s 182% 58% 34% 3% - +- -11% S B+Tree 103448/s 216% 78% 50% 16% 12 +% -- [download] Here your modified benchmark code: use DB_File; use DBI; use Benchmark qw(cmpthese); use strict; use warnings; # do some setup my $id_BDB_hash = 0; my $id_BDB_hash_a = 0; my $id_BDB_btree_a = 0; my $id_BDB_btree = 0; my $id_lit2 = 0; my $id_lit = 0; my $txt = 'a' x 100; unlink( "bdb.hash.test", "bdb.btree.test", "bdb.hash.a.test", "bdb.btree.a.test", "sqlite.test", "sqlite2.test" ); my $dbh = DBI->connect( 'dbi:SQLite:dbname=sqlite.test', '', '', { A +utoCommit => 1, RaiseError => 1 } ); my $dbh2 = DBI->connect( 'dbi:SQLite:dbname=sqlite2.test', '', '', { A +utoCommit => 0, RaiseError => 1 } ); $dbh->do('PRAGMA page_size = 4096'); $dbh->do('PRAGMA synchronous = off'); $dbh->do('PRAGMA cache_size = 6000'); $dbh->do('PRAGMA journal_mode = MEMORY'); $dbh->do('PRAGMA temp_store = MEMORY'); $dbh->do('CREATE TABLE test (id integer PRIMARY KEY, foo varchar(100)) +'); $dbh2->do('PRAGMA page_size = 4096'); $dbh2->do('PRAGMA cache_size = 6000'); $dbh2->do('PRAGMA journal_mode = MEMORY'); $dbh2->do('PRAGMA temp_store = MEMORY'); $dbh2->do('CREATE TABLE test (id integer PRIMARY KEY, foo varchar(100) +)'); my $sth_ins = $dbh->prepare('INSERT INTO test (id, foo) values (?, ?) +'); my $sth_sel = $dbh->prepare('SELECT foo FROM test where id=?'); my $sth2_ins = $dbh2->prepare('INSERT INTO test (id, foo) values (?, ? +)'); my $sth2_sel = $dbh2->prepare('SELECT foo FROM test where id=?'); tie my %bdb_btree_a, "DB_File", "bdb.btree.a.test", O_RDWR \| O_CREAT, +0640, $DB_BTREE or die "Cannot open file 'bdb.btree.a.test': $!\n"; tied(%bdb_btree_a)->filter_fetch_key( sub { $_ = unpack( "N", $_ \|\| 0 +) } ); tied(%bdb_btree_a)->filter_store_key( sub { $_ = pack( "N", $_ \|\| 0 ) +} ); tie my %bdb_hash_a, "DB_File", "bdb.hash.a.test", O_RDWR \| O_CREAT, 06 +40, $DB_HASH or die "Cannot open file 'bdb.hash.a.test': $!\n"; tied(%bdb_hash_a)->filter_fetch_key( sub { $_ = unpack( "N", $_ \|\| 0 ) + } ); tied(%bdb_hash_a)->filter_store_key( sub { $_ = pack( "N", $_ \|\| 0 ) } + ); tie my %bdb_hash, "DB_File", "bdb.hash.test", O_RDWR \| O_CREAT, 0640, +$DB_HASH or die "Cannot open file 'bdb.hash.test': $!\n"; tie my %bdb_btree, "DB_File", "bdb.btree.test", O_RDWR \| O_CREAT, 0640 +, $DB_BTREE or die "Cannot open file 'bdb.btree.test': $!\n"; sub insertLite { $sth_ins->execute( $id_lit++, $txt ); } sub insertLite2 { $sth2_ins->execute( $id_lit2++, $txt ); } sub insertBDB_btree_a { $bdb_btree_a{ $id_BDB_btree_a++ } = $txt; } sub insertBDB_hash_a { $bdb_hash_a{ $id_BDB_hash_a++ } = $txt; } sub insertBDB_btree { $bdb_btree{ $id_BDB_btree++ } = $txt; } sub insertBDB_hash { $bdb_hash{ $id_BDB_hash++ } = $txt; } sub selectLite { $dbh->selectall_arrayref($sth_sel, undef, $id_lit2++ ) } sub selectLite2 { $dbh2->selectall_arrayref($sth2_sel, undef, $id_lit2++ ) } sub selectBDB_btree_a { $bdb_btree_a{ $id_BDB_btree_a++ }; } sub selectBDB_hash_a { $bdb_hash_a{ $id_BDB_hash_a++ }; } sub selectBDB_btree { $bdb_btree{ $id_BDB_btree++ }; } sub selectBDB_hash { $bdb_hash{ $id_BDB_hash++ }; } cmpthese( 60_000, { 'I SQLite' => \&insertLite, 'I SQLite 2' => \&insertLite2, 'I Hash' => \&insertBDB_hash, 'I B+Tree' => \&insertBDB_btree, 'I Hash (A)' => \&insertBDB_hash_a, 'I B+Tree (A)' => \&insertBDB_btree_a, }, ); $dbh2->commit; $id_BDB_hash = 0; $id_BDB_hash_a = 0; $id_BDB_btree_a = 0; $id_BDB_btree = 0; $id_lit2 = 0; $id_lit = 0; cmpthese( 60_000, { 'S SQLite' => \&selectLite, 'S SQLite 2' => \&selectLite2, 'S Hash' => \&selectBDB_hash, 'S B+Tree' => \&selectBDB_btree, 'S Hash(A)' => \&selectBDB_hash_a, 'S B+Tree(A)' => \&selectBDB_btree_a, }, ); foreach my $file ( "bdb.hash.test", "bdb.btree.test", "bdb.hash.a.test", "bdb.btree.a +.test", "sqlite.test", "sqlite2.test" ) { print "$file :" . ( -s $file ) . "\n"; } [download]	[reply] [d/l] [select]
Re^2: New Benchmark: SQLite vs CDB_File vs BerkeleyDB by Anonymous Monk on Jan 20, 2011 at 18:10 UTC
against which version of libdb?	[reply]
Re^3: New Benchmark: SQLite vs CDB_File vs BerkeleyDB by Anonymous Monk on Oct 26, 2011 at 11:50 UTC
Re: SQLite vs CDB_File vs BerkeleyDB by perrin (Chancellor) on Mar 19, 2002 at 17:21 UTC
Interesting, thanks for the benchmark. Strong showing from CDB. I'm about to be doing some similar benchmarking on a broader range of stuff (Cache::Cache, Cache::Mmap, etc.) with a focus on multi-process use (with mod_perl), and I will definitely include SQLite after seeing this. I don't know about CDB, since it doesn't allow updates and that makes it basically useless for storing state data. I think you hurt BDB performance a bit by using the tied interface. I never use it, since it's slower than calling the methods directly.	[reply]
Re: Re: SQLite vs CDB_File vs BerkeleyDB by Matts (Deacon) on Mar 19, 2002 at 19:21 UTC
We did consider not using the tied interface, but wanted to make sure we tested what we would actually be using at work, and I couldn't see people liking the non-tied API. Maybe I'll add that, but it's fairly obvious that it'll only speed things up. The read-only problem of CDB is frustrating. Though the speed of regenerating the entire DB seems pretty fast, so perhaps it's not so bad for our uses (where we'd be doing rebuilds in a cron job).	[reply]
Re: Re: SQLite vs CDB_File vs BerkeleyDB by princepawn (Parson) on Mar 19, 2002 at 18:44 UTC
Will there be any mention of Alzabo's Sync and Store classes?	[reply]
Re: Re: Re: SQLite vs CDB_File vs BerkeleyDB by perrin (Chancellor) on Mar 19, 2002 at 18:57 UTC
Probably, if time permits. There are a lot of options to cover.	[reply]
Re: SQLite vs CDB_File vs BerkeleyDB by dws (Chancellor) on Mar 19, 2002 at 17:21 UTC
Excellent! One question. Why `$dbh->do('PRAGMA default_synchronous = off');` in the default (commit after each insert) case? Is this not something that could be hidden beneath the options on the connect?	[reply] [d/l]
Re: Re: SQLite vs CDB_File vs BerkeleyDB by Matts (Deacon) on Mar 19, 2002 at 19:45 UTC
Otherwise SQLite does a fflush on every row, making it much slower ;-) The option is only relevant if you're worried about data integrity during a filesystem crash.	[reply]
Re: Re: Re: SQLite vs CDB_File vs BerkeleyDB by dws (Chancellor) on Mar 19, 2002 at 19:50 UTC
Otherwise SQLite does a fflush on every row, making it much slower ;-) Uh... That sounds like you're defeating `AutoCommit => 1`, which would make this a misleading benchmark.	[reply] [d/l]
Re: Re: Re: Re: SQLite vs CDB_File vs BerkeleyDB by Matts (Deacon) on Mar 19, 2002 at 20:51 UTC
Re: Re: Re: Re: Re: SQLite vs CDB_File vs BerkeleyDB by dws (Chancellor) on Mar 19, 2002 at 21:12 UTC
Some notes below your chosen depth have not been shown here
Re: SQLite vs CDB_File vs BerkeleyDB by demerphq (Chancellor) on Mar 20, 2002 at 13:02 UTC
Interesting. Thanks. But I have a reservation: The BerkeleyDB example I feel may be misleading. I base this only on a limited experience with DB_File which is an alternate interface to BerkeleyDB. My problem is this: The DBI examples are using an integer primary key (which would almost certainly be stored internally as a B+Tree). The BDB examples are using a STRING key (despite appearances) and are running in $DB_HASH mode. This is not a fair test. First of all the $DB_HASH representation is signifigantly slower and produces much larger files than the equivelent $DB_BTREE (especially under the circumstance your benchmark presents, ie keys being added inorder). Second of all even if you convert your benchmark to use $DB_BTREE you will also need to either a) supply your own inorder function so that records are inserted in the correct numeric order and not the lexicographical order that they would be inserted in now or b) supply your own key accessors (my preferred method) that use a pack function. This means the keys are stored in a compact form but also provide numeric ordering EG: `tied(%db_btree)->filter_fetch_key ( sub { $_ = unpack("N", $_ \|\| 0 ) +} ) ; tied(%db_btree)->filter_store_key ( sub { $_ = pack ("N", $_ \|\| 0 ) } + ) ;` [download] I wont promise, but im pretty sure your results will change quite a bit with the BDB once you use B+Trees and not the hashing mechanism. Either way I would consider the above issues enough to invalidate the results of this benchmark. OTOH, the benchmark is indeed interesting and I look forward to you either posting updated results, or Ill do it when I get time (could be a while though) Incidentally this illustrates why SQL is so nice. It hides away representation issues behind an engine that usually will make the correct decision transparently. Whereas using something like BerkeleyDB requires an understanding of the basic data storage algorithms and their domain applicability. Something that we dont all know about. (Until it bites us.) Yves / DeMerphq -- This rent for space.	[reply] [d/l]

Back to Meditations