Batch Loading of data through DBI

by curtisb (Monk)
on Jan 20, 2005 at 21:25 UTC ( [id://423797] )

curtisb has asked for the wisdom of the Perl Monks concerning the following question:

All,
I have recently made a couple of postings related to using DBI to insert and update data in a database. While working with my files I have noticed one thing: the inserts happen line by line.
So, my question to you is:
Is there any way to do an insert of more than one line (batch loading) of data into the database from a file?

Thanks in advance,
Bobby

Replies are listed 'Best First'.
Re: Batch Loading of data through DBI
by erix (Prior) on Jan 20, 2005 at 21:40 UTC

    Most databases have dedicated tools that do bulk loading of files. You use Sybase (I think), and for Sybase you would use bcp (search sybase.com or, even better, SyBooks).

    MySQL has mysqlimport that does the same thing.

    Oracle's bulk loader is sqlldr.

    All these tools are typically much faster than DBI.

    If you are trying to make your DBI data loading faster, make sure AutoCommit is off and that you are not committing too often (only every few hundred inserts).

    It will also help to remove indexes and recreate them after the load.
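    For illustration, a minimal sketch of the transaction approach. The DSN, credentials, table, and file layout are all made up, so adjust for your own setup:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use DBI;

        # AutoCommit off, so rows are grouped into larger transactions
        my $dbh = DBI->connect('dbi:Sybase:server=MYSERVER', 'user', 'secret',
                               { AutoCommit => 0, RaiseError => 1 });

        my $sth = $dbh->prepare('INSERT INTO mytable (id, name) VALUES (?, ?)');

        open my $fh, '<', 'data.txt' or die "can't open data.txt: $!";
        my $count = 0;
        while (my $line = <$fh>) {
            chomp $line;
            my ($id, $name) = split /\t/, $line;   # assuming tab-delimited input
            $sth->execute($id, $name);
            $dbh->commit if ++$count % 500 == 0;   # commit only every few hundred inserts
        }
        close $fh;
        $dbh->commit;      # commit whatever is left
        $dbh->disconnect;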

      Another way to make an import go faster is - if your database allows it - to temporarily turn off logging.
        Another (safer) way is to drop all but the clustered index. When you're done, re-create all the other indices. In most situations, though, it's not worth the extra overhead.

        thor

        Feel the white light, the light within
        Be your own disciple, fan the sparks of will
        For all of us waiting, your kingdom will come

        Turning off logging should only be done on the initial load of a database: it runs the risk of corruption if something goes wrong during the bulk copy of the data. IMHO, the slight performance improvement when loading is not worth the risk.

        Jason L. Froebe

        Team Sybase member

        No one has seen what you have seen, and until that happens, we're all going to think that you're nuts. - Jack O'Neil, Stargate SG-1

Re: Batch Loading of data through DBI
by moot (Chaplain) on Jan 20, 2005 at 21:41 UTC
    Do you mean processing more than one line per SQL statement, or are you talking about transactions?

    If the former, you could think about loading N lines into a buffer, then passing the buffer to $dbh->do(), but results will depend on the actual driver you are using (i.e. which DBMS).

    If the latter, transaction availability is again dependent on the DBMS you are using.

    You could also investigate the use of pre-prepared statement handles, depending on the format of data within the file - if you have the values themselves (as opposed to complete INSERT or UPDATE statements), pre-preparing your statement handles will save some time.
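    Roughly, the difference looks like this (table and data are made up):

        my @rows = ([1, 'Acura'], [2, 'Audi']);    # hypothetical data

        # without pre-preparing: the statement is parsed anew for every row
        $dbh->do('INSERT INTO cars (id, make) VALUES (?, ?)', undef, @$_) for @rows;

        # prepared once, executed many times
        my $sth = $dbh->prepare('INSERT INTO cars (id, make) VALUES (?, ?)');
        $sth->execute(@$_) for @rows;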

    Outside perl, some database engines (MySQL for one) will allow use of 'extended' SQL, where multiple entries can be added using one INSERT statement.
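    Combining that with the buffering idea above, a rough sketch for MySQL. The file and table names are made up; $dbh->quote() is used so odd characters in the data don't break the quoting:

        open my $fh, '<', 'cars.txt' or die "can't open cars.txt: $!";
        my @buffer;
        while (my $line = <$fh>) {
            chomp $line;
            # turn each tab-delimited line into one "(v1,v2,...)" tuple
            push @buffer, '(' . join(',', map { $dbh->quote($_) } split /\t/, $line) . ')';
            if (@buffer >= 100) {    # send 100 rows per INSERT statement
                $dbh->do('INSERT INTO cars (id, make) VALUES ' . join(',', @buffer));
                @buffer = ();
            }
        }
        $dbh->do('INSERT INTO cars (id, make) VALUES ' . join(',', @buffer)) if @buffer;
        close $fh;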

    Perhaps you could clarify your question.

Re: Batch Loading of data through DBI
by mkirank (Chaplain) on Jan 21, 2005 at 10:23 UTC
Re: Batch Loading of data through DBI
by CountZero (Bishop) on Jan 21, 2005 at 07:31 UTC
    In MySQL there is the LOAD DATA INFILE statement to do what you want, and it can be called through the usual DBI syntax.
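    Something along these lines should work (file path and table are made up; depending on your server and DBD::mysql settings you may need to enable LOCAL INFILE first):

        my $rows = $dbh->do(q{
            LOAD DATA LOCAL INFILE '/tmp/cars.txt'
            INTO TABLE cars
            FIELDS TERMINATED BY '\t'
            LINES TERMINATED BY '\n'
        });
        print "loaded $rows row(s)\n";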

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: Batch Loading of data through DBI
by samgold (Scribe) on Jan 21, 2005 at 20:26 UTC
    As I found out, SQL*Loader, Oracle's tool, is much, much faster than using DBI. You can check out Best method to load a csv file. Your best bet is to use the tools that your database provides. In my case I used sqlldr and a PL/SQL procedure to load my data. It takes less than a minute to load the data into a temp table and then run the procedure to load it into the production table.
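    For completeness, a rough sketch of driving sqlldr from Perl. The login and control file name are made up, and the control file itself has to exist already:

        # run SQL*Loader against a prepared control file
        my $rc = system('sqlldr', 'userid=scott/tiger',
                        'control=load.ctl', 'log=load.log');
        die 'sqlldr failed: exit ' . ($rc >> 8) . "\n" if $rc != 0;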

    HTH,
    Sam Gold
Re: Batch Loading of data through DBI
by Anonymous Monk on Jan 21, 2005 at 16:51 UTC
    As mentioned earlier, transactions usually save on file access time (especially in the uber-leet SQLite driver). Another thing to try is calling $dbh->prepare() on the insert query with param placeholders. Most drivers will then cache the compiled state of the query, cutting out a chunk of the time spent later on.
Re: Batch Loading of data through DBI
by zebedee (Pilgrim) on Jan 23, 2005 at 18:36 UTC
    mysqldump generates the following (the "..." is where I cut out a whole list of items):

        LOCK TABLES makes WRITE;
        INSERT INTO makes VALUES (1,'Acura'),(2,'Alfa Romeo'),(3,'Aston Martin'),(4,'Audi'),...,(83,'Lifan');
        UNLOCK TABLES;

    I do not know if the INSERT INTO syntax is MySQL-specific or not.
Re: Batch Loading of data through DBI
by naChoZ (Curate) on Jan 24, 2005 at 01:12 UTC
