Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic
 
PerlMonks  

SQLite Slow insert

by mmittiga17 (Scribe)
on Feb 17, 2013 at 19:00 UTC ( [id://1019191]=perlquestion: print w/replies, xml ) Need Help??

mmittiga17 has asked for the wisdom of the Perl Monks concerning the following question:

Hi All, I am trying to have a script that uses DBI & sqlite. Task at hand is to take data files and check if record exist in db already, if not insert into db table. Problem is it is taking for ever and moving slow. This is my first attempt at the sqlite & Perl. Below is my code if someone can tell me what I am doing wrong or advise how to achieve better performance, I would be grateful.

my $dbh = DBI->connect( "dbi:SQLite:dbname=SSBHLD2.db", "", "", { RaiseError => 1 }, ) or die $DBI::errstr; $dbh->do("CREATE TABLE IF NOT EXISTS HLDdata(Data TEXT)"); while(<IN>){ $line = $_; my $sth = $dbh->prepare( "select count(*) from HLDdata where Data = + '$line'"); $sth->execute(); my ($data) = $sth->fetchrow(); print "$data \n"; if ($data ne 0){ print "record exist not adding\n"; &Logit("record exist not adding"); }else{ $dbh->do("INSERT INTO HLDdata(Data) VALUES ('$line')"); $line =~s/\|/\t/g; print "$line\n"; print OUT "$line"; } } close(IN); close(OUT);

Replies are listed 'Best First'.
Re: SQLite Slow insert
by afoken (Chancellor) on Feb 17, 2013 at 19:24 UTC

    Some ideas:

    • Move the prepare out of the while loop. No need to prepare the same statement more than once.
    • Use placeholders to get rid of SQL injection problems, and to improve performace by allowing to cache the prepared statement inside the database engine.
    • prepare the insert statement outside the while loop, and use placeholders, for the same reasons.
    • Disable auto-commit, and commit manually every few thousand rows, as recommended by the SQLite FAQ.

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re: SQLite Slow insert
by Corion (Patriarch) on Feb 17, 2013 at 19:21 UTC

    As you don't show us how your database is constructed, or how you open the DB connection, it's somewhat hard to guess.

    I suggest looking at the Performance section of the DBD::SQLite documentation. Consider whether loading the data in an atomic transaction is OK for you. Also see http://bobby-tables.com/, because your way of constructing the insert statement is liable to fail with embedded single quotes. Use placeholders there.

      [...] your way of constructing the insert statement is liable to fail with embedded single quotes. Use placeholders there.

      The select statement has the same problem.

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re: SQLite Slow insert
by Anonymous Monk on Feb 17, 2013 at 23:19 UTC

      Mostly right, but don't use DBI's quote() method. Use placeholders. The quote() method should be considered private to DBI and DBD::*, because it is not needed outside DBI and DBD::*. See Re^5: Variable interpolation in a file to be read in.

      Using placeholders allows DBI, DBD::whatever, and the database behind DBD::whatever to cache the SQL statement passed to prepare (and prepare_cached) and especially its parsed form. Interpreting SQL with placeholders is needed exactly once, no matter how often you execute the SQL command. Interpreting SQL polluted with actual values is required every time you execute that SQL command. So for one million inserts, you can avoid 999_999 runs of the SQL interpreter by using the results from the first interpreter run by using placeholders. Simply not running complex code in 99.9999% of all cases makes your program run faster.

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

        Thank you for your replay guys!!! I think the issue is trying to check if data exists first before inserting, so it seems I have to do one row of data at a time, unless there is a way to test if data row exists then how I was doing it. I am reading all of the suggested links everyone has passed along, This is my first crack on using a DB like this.

        my $sth = $dbh->prepare("INSERT INTO HLDdata (Data) VALUES (?)"); foreach $txt (@TXT) { print "$txt\n"; open(IN,"$txt") || warn("cant open $txt"); while(<IN>){ $line = $_; my $sth = $dbh->prepare( "select count(*) from HLDdata where Data += '$line'"); $sth->execute(); my ($data) = $sth->fetchrow(); print "$data \n"; if ($data ne 0){ print "record exist not adding\n"; &Logit("record exist not adding"); }else{ my ($Data) = $line; $sth->execute(); $line =~s/\|/\t/g; # print "$line\n"; print OUT "$line"; } }
Re: SQLite Slow insert
by Plankton (Vicar) on Feb 18, 2013 at 07:51 UTC
    Why not put an unique constraint on the HLDdata column?
Re: SQLite Slow insert
by digglife (Acolyte) on Aug 14, 2014 at 03:33 UTC
    Try turning off autocommit.
    my $dbh = DBI->connect( "dbi:SQLite:dbname=SSBHLD2.db", "", "", { RaiseError => 1, Autocommit => 0 }, ) or die $DBI::errstr;
    Then add $dbh->commit(); when you finish processing the data.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://1019191]
Approved by Perlbotics
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others goofing around in the Monastery: (7)
As of 2024-04-19 13:05 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found