fastest method to use DBI

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Re: fastest method to use DBI by BrowserUk (Patriarch) on Jul 07, 2009 at 07:03 UTC
The whole point of a `SELECT` statement is to select (only) the records you need! If you truly cannot do with anything less than the whole 100,000 rows, and if you need to reload them often enough that the load performance is a problem, then you would be better off storing (or caching) them somewhere that does not carry the overhead of the DB select and communications, e.g. a file. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. RIP PCW
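The file-cache idea can be sketched with the core `Storable` module. This is only an illustration under assumptions: `load_from_db`, `load_rows`, the cache path and the sample row are all hypothetical stand-ins, and a real program would also need some way to invalidate a stale cache.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempdir);

# Hypothetical cache location; a real program would pick a stable path.
my $cache_file = tempdir(CLEANUP => 1) . '/subscribers.cache';

# Stand-in for the expensive SELECT.
# Shape: number => [ [id, start_dat, end_dat], ... ]
sub load_from_db {
    return { 1001 => [ [ 1, '2009-01-01', '2009-12-31' ] ] };
}

sub load_rows {
    # Cache hit: read the structure back without touching the database.
    return retrieve($cache_file) if -e $cache_file;

    # Cache miss: do the slow load once and save it for next time.
    my $rows = load_from_db();
    store($rows, $cache_file);
    return $rows;
}

my $first  = load_rows();   # populates the cache file
my $second = load_rows();   # served from the cache file
```

Whether this beats the database depends on how often the table changes; the sketch never refreshes the file.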
Re: fastest method to use DBI by moritz (Cardinal) on Jul 07, 2009 at 07:26 UTC
I think that a `fetchall_arrayref` is very memory intensive and not a good idea. I guess that DBI or the underlying drivers either batch operations automatically, or have options to do so, or don't do it because it's not efficient. Either way I don't think that your "manual" caching will do very much good. Also you're using `->bind_columns(\$number, \$id, \$start_dat, \$end_dat);` which means that after each `->fetch()` the variables `$number`, `$id` etc. have the new values. This is efficient, but only if you use `->fetch()`, not `->fetchall_arrayref` (which duplicates the work).
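On the batching option: DBI's `fetchall_arrayref` accepts an optional `$max_rows` argument, which bounds how many rows are held in memory per call. A minimal sketch, assuming DBD::SQLite is installed; the in-memory table, its 25 made-up rows and the batch size of 10 are illustrative only, not the OP's setup.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available; the in-memory table stands in for the real one.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '', { RaiseError => 1 });
$dbh->do('CREATE TABLE subscriberslist (number INTEGER, id INTEGER, start_dat TEXT, end_dat TEXT)');
my $ins = $dbh->prepare('INSERT INTO subscriberslist VALUES (?, ?, ?, ?)');
$ins->execute($_ % 10, $_, '2009-01-01', '2009-12-31') for 1 .. 25;

my $sth = $dbh->prepare('SELECT number, id, start_dat, end_dat FROM subscriberslist');
$sth->execute;

my %hash;
my $batches = 0;

# The second argument is $max_rows: each call returns at most 10 rows.
while (my $batch = $sth->fetchall_arrayref(undef, 10)) {
    last unless @$batch;    # guard in case a driver returns an empty batch
    $batches++;
    for my $row (@$batch) {
        my ($number, $id, $start_dat, $end_dat) = @$row;
        push @{ $hash{$number} }, [ $id, $start_dat, $end_dat ];
    }
}
$dbh->disconnect;
```

Note that this still builds the full `%hash`; the batching only limits the size of the intermediate row list, so it helps most when rows are processed and discarded.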
Re: fastest method to use DBI by targetsmart (Curate) on Jul 07, 2009 at 05:01 UTC
I entered into perlmonks with the same question. :) See "What are effective methods for retriving many number of rows from postgreSQL tables using perl DBI?" Vivek -- 'I' am not the body, 'I' am the 'soul', which has no beginning or no end, no attachment or no aversion, nothing to attain or lose.
Re^2: fastest method to use DBI by Anonymous Monk on Jul 07, 2009 at 05:21 UTC
Hi, it would be a great help to get an expert comment on the code. Thanks
Re^3: fastest method to use DBI by targetsmart (Curate) on Jul 07, 2009 at 05:41 UTC
"is it the right way to get the fastest fetch?" I don't know; you would have to run it and gather some statistics on the time taken, the memory consumption of the Perl program, etc. "It would be great help if get an expert comment on the code" I am not a database expert, but IMHO using the database query's OFFSET and LIMIT is the best way to fetch a huge number of records. It will take only minimal memory (depending on the offset and limit values), and well-chosen limit values will let you fetch all the records in an optimum number of fetch cycles. Vivek -- 'I' am not the body, 'I' am the 'soul', which has no beginning or no end, no attachment or no aversion, nothing to attain or lose.
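The OFFSET/LIMIT approach might look roughly like the following. This is a sketch under assumptions: DBD::SQLite is installed, the in-memory table and its 25 rows are made up, and the page size of 10 is arbitrary. The `LIMIT ... OFFSET ...` syntax shown works in SQLite, PostgreSQL and MySQL but is not universal.

```perl
use strict;
use warnings;
use DBI qw(:sql_types);

# Assumption: DBD::SQLite is available; the table is a stand-in for the real one.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '', { RaiseError => 1 });
$dbh->do('CREATE TABLE subscriberslist (number INTEGER, id INTEGER)');
my $ins = $dbh->prepare('INSERT INTO subscriberslist VALUES (?, ?)');
$ins->execute($_ % 10, $_) for 1 .. 25;

my $page_size = 10;

# ORDER BY matters: without a stable order, pages may overlap or skip rows.
my $sth = $dbh->prepare(
    'SELECT number, id FROM subscriberslist ORDER BY id LIMIT ? OFFSET ?'
);

my ($offset, $total, $pages) = (0, 0, 0);
while (1) {
    # Bind as integers so the driver does not treat the values as text.
    $sth->bind_param(1, $page_size, SQL_INTEGER);
    $sth->bind_param(2, $offset,    SQL_INTEGER);
    $sth->execute;

    my $rows = $sth->fetchall_arrayref;   # at most $page_size rows
    last unless @$rows;

    $pages++;
    $total  += @$rows;                    # real per-row processing would go here
    $offset += $page_size;
}
$dbh->disconnect;
```

Each page is a separate query, so on many databases large offsets get progressively slower; measuring against a single row-at-a-time fetch loop is worthwhile before settling on this.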
Re^4: fastest method to use DBI by zwon (Abbot) on Jul 07, 2009 at 19:19 UTC
Re: fastest method to use DBI by JavaFan (Canon) on Jul 07, 2009 at 09:54 UTC
Without knowing which driver you are using, most certainly not. The way you are doing it duplicates data: you are using bind variables and you are using `fetchall_arrayref`. Furthermore, you first fetch every row in a big table, then you loop over it (in an, IMO, weird way) and slightly rearrange it. You're doing quite a lot of data duplication, even before you do anything with the data. First you have to answer the question: do I need all 100000 rows before I do any processing? Or do you want to process each row? In which case you're (probably) better off fetching a row at a time, doing the processing, then fetching the next. Note I say probably: if the processing takes a long time, you are holding resources (perhaps even locks) in the database, which may influence other processes accessing the data. AFAIK, bind variables and `fetchrow_arrayref` are the fastest ways to retrieve data, with bind variables probably the fastest (but I haven't benchmarked it myself, and it may vary between drivers). I never use bind variables, as I don't like their action at a distance, but if fetching was the bottleneck of a time-critical program, I'd certainly look into it.
Re^2: fastest method to use DBI by dsheroh (Monsignor) on Jul 07, 2009 at 14:28 UTC
Yes, `bind_columns`/`->fetchrow_arrayref` (or its alias, `->fetch`) is the fastest way to retrieve data, per the DBI docs:

[fetchrow_arrayref] Fetches the next row of data and returns a reference to an array holding the field values. Null fields are returned as undef values in the array. This is the fastest way to fetch data, particularly if used with `$sth->bind_columns`.

Perhaps the OP misread this as saying that the fastest option was `fetchall_arrayref` rather than `fetchrow_arrayref`? The recommended technique, then, would be:

    my $sth = $dbn->prepare("select number,id,start_dat,end_dat from SUBSCRIBERSLIST");
    $sth->execute();

    # Declare the variables that bind_columns will fill on each fetch
    my ($number, $id, $start_dat, $end_dat);
    $sth->bind_columns(\$number, \$id, \$start_dat, \$end_dat);

    my %hash = ();
    while ($sth->fetch) { # ->fetch populates the variables from ->bind_columns
        push @{$hash{$number}}, [$id, $start_dat, $end_dat];
    }
    $sth->finish();
    $dbn->disconnect;

But, as already noted, you should also `SELECT` only the rows you need and do your processing line-by-line instead of sucking in the whole table at once if possible. I've only addressed the mechanics of how the OP is pulling the `SELECT`ed rows.
Re^3: fastest method to use DBI by perrin (Chancellor) on Jul 07, 2009 at 15:34 UTC
I suspect what the OP saw was something about `fetchall_arrayref` being the fastest way to fetch all the data, which is true since it doesn't require looping in Perl. However, `fetchall_arrayref` doesn't work with `bind_columns`. Also, since the OP is only after a single row, there is no advantage to `fetchall_arrayref`.
Re^4: fastest method to use DBI by zwon (Abbot) on Jul 07, 2009 at 18:54 UTC
Re^5: fastest method to use DBI by perrin (Chancellor) on Jul 07, 2009 at 19:25 UTC

