http://qs321.pair.com?node_id=309654

sgifford has asked for the wisdom of the Perl Monks concerning the following question:

I'm doing a query that returns many rows, using DBI and DBD::mysql. I expect the $sth->execute to get the query started and return almost immediately, and then one of the fetchrow methods to actually get the results. Instead, execute seems to gather all of the results before returning, taking a long time to finish and using up over 1GB of memory.

Is there a way to work around this?

Here's my code. It gets to "Executing..." and then sits there for a very long time taking up memory; based on other tests it will eventually return, and after that the fetchrows will work.

#!/usr/bin/perl -w BEGIN { $ENV{MYSQL_UNIX_PORT}='/clair8/corpora/aledb/run/socket'; }; use strict; use DBI; my $sql = "SELECT url1.urlid AS from_id, url1.url AS from_url, link1.l +ink_text AS link1_text, link1.linkid AS link1_id, url2.urlid AS to_id +, url2.url AS to_url FROM DOTGOV_links AS link1, DOTGOV_urls AS url1, DOTGOV_urls AS url2 WHERE (link1.link_from = url1.urlid) AND (link1.link_to = url2.urlid)"; warn "Connecting...\n"; my $dbh = DBI->connect('DBI:mysql:database=clair', 'root','', {AutoCom +mit => 1}) or die "Error creating dbh.\n"; warn "Connected!\n"; warn "Preparing...\n"; my $sth = $dbh->prepare($sql) or die "Error preparing sth.\n"; warn "Prepared!\n"; warn "Executing...\n"; $sth->execute or die "Error executing sth.\n"; warn "Executed!\n"; warn "Finishing...\n"; $sth->finish or die "Error finishing sth.\n"; warn "Finished!\n";

Replies are listed 'Best First'.
Re: DBI/mysql gathering all rows on execute
by gmax (Abbot) on Nov 24, 2003 at 20:39 UTC

    There are two possible causes, perhaps both of them at the same time:

    • If there is no index in any of link1.link_from, url1.urlid, link1.link_to, and url2.urlid, then the join takes much longer than it should. If this is the case, add an appropriate index, using an ALTER TABLE statement.
    • Your query is returning an huge amount of rows, which are clogging both your server and client memory. The reason is that MySQL default mode of fetching is All-At-Once (mysql_store_result). You can affect this behavior by setting the appropriate mode identifier after preparing the statement:
      my $sth = $dbh->prepare($sql); $sth->{"mysql_use_result"} = 1;
      Be aware that this mode requires that you either fetch all the records or use a "finish" method before issuing another request to the server.
      See MySQL manual for an explanation of this behavior (it's related to the C API) and DBD::mysql for more details on this modifier in the DBD interface.
      With "mysql_use_result", as opposed to "mysql_store_result", the server won't send any record to the client until you use a "fetch" method. What happens with the default mode is that the server sends all the records at once, and the "fetch" method is just browsing a local copy of the data.
     _  _ _  _  
    (_|| | |(_|><
     _|   
    

      Thanks, it was $sth->{"mysql_use_result"} = 1; that I needed!

      The right indexes are all in place (and EXPLAIN SELECT agrees), there are just a huge number of rows in the result set.

      Now here's the other weird thing, which doesn't matter for my real project but is still unusual. When I make this change, the test program I posted above works better, but now it hangs for a long time during the finish. That surprises me; I would expect it to cancel the query and return immediately. Anybody know what's going on?

Re: DBI/mysql gathering all rows on execute
by talexb (Chancellor) on Nov 24, 2003 at 20:12 UTC

    If the execute is taking a long time, I would suggest putting something like a LIMIT ?, OFFSET ? at the end of your query, insert useful values into the appropriate placeholders (shudder) and use that approach to break the query up into bite-sized chunks. That should limit memory consumption.

    --t. alex
    Life is short: get busy!
      This method can be problematic. If there is any other activity on the table then the results will not be returned in the same order on succesive accesses and so you have a chance of missing some records completely or getting some twice. It is in fact possible, but not common, for consecutive executes to return the recirds in a different order even when no other activity has taken place on the table, particularly when using Perl 5.8.1 or later.

      jdtoronto

Re: DBI/mysql gathering all rows on execute
by VSarkiss (Monsignor) on Nov 24, 2003 at 20:44 UTC

    If execute is taking a long time, that's an indication that your query is problematic. Remember, the database has to identify the result set before it will start to return any rows. Try running your query interactively (using command line or MySQL/CC or phpMyAdmin or something similar). If it's slow, you'll need to fix that first.

    It's impossible to tell without knowing the data model and schema, but just from your variable names, I'm guessing you're building a cartesian product from a many-to-many relationship. That can produce a huge number of rows, so make sure you've restricted the set that you're selecting from (in other words, add more conditions to your where clause). If you can't restrict it, you may be SOL: you're asking for a ton of data, but you don't want to take any time to retrieve it, which is an impossible combination.

      (using command line or MySQL/CC or phpMyAdmin or something similar).

      like perlmyadmin

Re: DBI/mysql gathering all rows on execute
by princepawn (Parson) on Nov 24, 2003 at 20:08 UTC
    This is besides the point of your post, but consider turning on the RaiseError attribute and then you can get rid of all your or die code, which is actually lacking a print of $DBI::errstr anyway. From the DBI docs:
    RaiseError (boolean, inherited) The RaiseError attribute can be used to force errors to raise exceptions rather than simply return error codes in the normal way. It is "off" by default. When set "on", any method which results in an error will cause the DBI to effectively do a die("$class $method failed: $DBI::errstr"), where $class is the driver class and $method is the name of the method that failed.

    PApp::SQL and CGI::Application rock the house