Re^2: Delay when write to large number of file

by thargas (Deacon)
on Jun 24, 2014 at 11:36 UTC ( [id://1091041] )


in reply to Re: Delay when write to large number of file
in thread best way to fast write to large number of files

Since open/close is too slow, perhaps you might try a database, such as SQLite?

I made a small test program, and its output indicates the database would be significantly faster:

C:\> perl trymany.pl
connecting to dbi:SQLite:db.sqlite3 ...
connected to dbi:SQLite:db.sqlite3
ready to begin
              Rate openclose  sqlite
openclose   2580/s        --    -98%
sqlite    121065/s     4593%      --

The code is:

#!/usr/bin/perl
# trymany - compare open/write/close with db access
#vim: syntax=perl
use v5.14;
use warnings;
use Benchmark qw( :all );
use File::Path qw( make_path );
use DBI;

sub make_file_name { #FUNCTION $top -> $name
    my ($top) = @_;
    my $f = $top . '/' . sprintf "%04d", rand(10000);
    return $f;
}

sub make_dir { #FUNCTION $dir -> $dir
    my ($dir) = @_;
    return $dir if (-d $dir);
    make_path($dir) or die "cannot mkdir $dir: $!\n";
    return $dir;
}

my $db_file      = "db.sqlite3";
my $dsn          = "dbi:SQLite:$db_file";
my $table        = "testtab";
my $column       = "data";
my $insert_sql   = "insert into $table ($column) values (?)";
my $create_sql   = "create table $table ($column varchar)";
my $commit_every = 1000;
my $uncommitted  = 0;
my $record       = 'x' x 80;
my $create_table = (-f $db_file) ? 0 : 1;
my $top          = "dirs";

make_dir($top) or die "cannot mkdir $top $!\n";

my $n    = (shift @ARGV) || 100000;
my $seed = (shift @ARGV) || 12523477;
srand($seed);

warn "connecting to $dsn ...\n";
my $dbh = DBI->connect($dsn, '', '', {
    AutoCommit => 0,   # commit in batches; see $commit_every below
    PrintError => 1,
    RaiseError => 1,
}) or die "cannot connect to $dsn: $!\n";
warn "connected to $dsn\n";

if ($create_table) {
    $dbh->do($create_sql) or die "cannot create table $DBI::errstr\n";
    warn "created table\n";
}

my $sth = $dbh->prepare($insert_sql) or die "cannot prepare: $DBI::errstr\n";

warn "ready to begin\n";
cmpthese( $n, {
    openclose => sub {
        state $dir = make_dir("$top/openclose");
        my $f = make_file_name($dir);
        open my $fh, '>>', $f or die "cannot open $f for append: $!\n";
        defined(print $fh $record) or die "cannot write $f: $!\n";
        close($fh) or die "cannot close $f: $!\n";
    },
    sqlite => sub {
        $sth->execute($record) or die "cannot insert: $DBI::errstr\n";
        ++$uncommitted;
        # Commit in batches: committing every insert would pay the
        # transaction overhead per row and lose most of the speed.
        if ($uncommitted >= $commit_every) {
            $dbh->commit or die "cannot commit $DBI::errstr\n";
            $uncommitted = 0;
        }
    },
});

$dbh->commit;      # flush any rows still pending from the last batch
$dbh->disconnect;
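For what it's worth, the script takes an optional iteration count and random seed on the command line (defaulting to 100000 and 12523477), and it only creates the table when db.sqlite3 doesn't already exist, so deleting that file gives a clean run:

C:\> perl trymany.pl 100000 12523477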

Replies are listed 'Best First'.
Re^3: Delay when write to large number of file
by Corion (Patriarch) on Jun 24, 2014 at 11:40 UTC

    It seems to me that the point of the OP's program is to create reports for different customers. I'm not sure how putting all the customer data into one database file will help them.

      It won't work if he insists on processing all the files each time the report is requested, but I doubt that anything will. I assumed, perhaps incorrectly, that collecting the data by customer was a background process.

      I figured that with all the data in the database, making the reports would be easy and fast, assuming that you indexed the table properly. It's possible this doesn't scale either, but the test program can easily be tweaked to tell whether it will.
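      For example, with the data keyed by customer, each report becomes a single indexed lookup. A minimal sketch (assuming a hypothetical "customer" column and index, neither of which the test program above creates):

      #!/usr/bin/perl
      # report_one - sketch of a per-customer report query
      use v5.14;
      use warnings;
      use DBI;

      my $dbh = DBI->connect("dbi:SQLite:db.sqlite3", '', '',
          { RaiseError => 1, AutoCommit => 1 });

      # One-time: index the lookup column so reports don't scan the table.
      # ASSUMPTION: testtab has been given a "customer" column.
      $dbh->do("create index if not exists idx_customer on testtab (customer)");

      my $customer = shift(@ARGV) // die "usage: $0 customer-id\n";
      my $sth = $dbh->prepare("select data from testtab where customer = ?");
      $sth->execute($customer);
      while (my ($data) = $sth->fetchrow_array) {
          say $data;    # format each row however the real report needs
      }

      With the index in place, a per-customer query like this stays fast as the table grows, which is the "easy and fast" part; whether it holds at the real data volume is exactly what tweaking the test program would tell you.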
