http://qs321.pair.com?node_id=362125


in reply to DBI or mysqlimport?

Use DBI and insert the values as I parse each line

Inserting records one by one is going to be the slowest approach.

However, if you can use the extended insertion syntax that MySQL supports, this could be one of the best solutions.

An extended insertion query looks like this:

INSERT INTO tablename (col1, col2, col3) VALUES (value1, value2, value3), (value1, value2, value3), (value1, value2, value3), (value1, value2, value3);

The bad news is that you need to figure out - with some benchmarks - how many records you should pass to each query, considering the limits set by max_allowed_packet (1 MB by default).

When I had to do something similar, with a max_allowed_packet of 3 MB, I had very good results building queries up to 2 MB. As for the technique to build these queries, there is a detailed example in Speeding up the DBI, but the basic idea is like the following:

my @values = ( ["A", "B", "D", "0"], ["E", "F", "G", "1"], ["H", "I", "J", "2"], ["K", "L", "M", "3"], ["N", "O", "P", "4"], ["Q", "R", "S", "5"], ["T", "U", "V", "6"], ["W", "X", "Y", "7"], ); my $query = "INSERT INTO tablename VALUES \n" . join(",\n", ( map { "(". join ( ", ", map {$dbh->quote($_)} @$_). ")" } @values));
Parse the values into a delimited file and import this using mysqlimport ?

This is also feasible. mysqlimport is one of the most tested tools in MySQL suite, and it is very much efficient, especially if dealing with tab-delimited files. Personally, once I found the right balance with the extended insert, I almost always manage to outperform this method using multi-record queries. YMMV.

Other wisdom ?

Yes. If your table has some indexes, in addition to primary key or unique keys, then you can use this set of instructions before and after the insertion:

ALTER TABLE tablename DISABLE KEYS; # do the insertion ALTER TABLE tablename ENABLE KEYS;

This trick will speed up the insertion considerably. Instead of building the indexes row by row, it will do it all at once at the end of the insertion.

 _  _ _  _  
(_|| | |(_|><
 _|