Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses
 
PerlMonks  

MySQL Select Speed Optimisation

by tachyon (Chancellor)
on Mar 27, 2003 at 09:50 UTC ( [id://246196]=perlquestion: print w/replies, xml ) Need Help??

tachyon has asked for the wisdom of the Perl Monks concerning the following question:

I have a table in a MySQL database that I need to optimise the speed of searching on. It looks like this:

CREATE TABLE some_table ( key CHAR(50), val1 CHAR(10), val2 INT, PRIMARY KEY (key) )

The select that has to run against it is very simple and looks like

SELECT * FROM some_table WHERE key = ? OR key = ? OR key = ?

There are no like operators and the statement handles are all prepare cached and cached themselves with the bind values passed to the execute as needed. Implicit in the MySQL docs is the notion that a B-tree index is automatically created for the primary key. It is not however explicit. So the question is does a primary key need an index to optimise search speed?

Having just written this the obvious answer is just to test it I guess. Still I am lazy enough to seek the wisdom of the monks as a shortcut...

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Replies are listed 'Best First'.
Re: MySQL Select Speed Optimisation
by gmax (Abbot) on Mar 27, 2003 at 10:04 UTC

    The "OR" operator will slow down your query mercilessly. However, using the "IN" operator, MySQL will use the primary key (or any associated index) and run very fast.

    SELECT * FROM some_table WHERE key IN ( ?, ?, ?);

    An alternative to using placeholders is to prepare this statement using a join.

    my @searchkeys = qw(a b c d); my $query = qq{ SELECT * FROM some_table WHERE key IN ( } . join(",", map( {$dbh->quote($_)} @searchkeys)) . qq{)};

    update (1). See also Using OR in SELECT statments safely.

    update (2). Some benchmarking, to show that the IN operator is truly faster. I ran the query against a large table containing 500,000 records.

    #!/usr/bin/perl -w use DBI; use strict; use Benchmark; my $dbh = DBI->connect("DBI:mysql:chess_db;host=localhost" . ";mysql_read_default_file=$ENV{HOME}/.my.cnf", undef, undef, {RaiseError => 1}) or die "can't connect\n"; my $query = qq{SELECT count(*) from PGN_base_data where }; my @searchkeys = ((100_001 .. 100_800)); my $or_query = $query . join( " OR " , map( { "ID = $_"} @searchkeys)); my $in_query = $query . "ID IN (" . join( "," , @searchkeys) . ")"; my $sth_or = $dbh->prepare($or_query); my $sth_in = $dbh->prepare($in_query); sub get_it { my $sth = shift; $sth->execute; my $result = $sth->fetchrow_arrayref; $sth->finish; #print $result->[0], "\n";; } timethese (2000, { 'or' => sub {get_it $sth_or}, 'in' => sub {get_it $sth_in} }); $dbh->disconnect(); __END__ with 400 keys in the query Benchmark: timing 4000 iterations of in, or... in: 17 wallclock secs ( 0.37 usr + 0.09 sys = 0.46 CPU) or: 52 wallclock secs ( 0.70 usr + 0.08 sys = 0.78 CPU) with 800 keys in the query Benchmark: timing 2000 iterations of in, or... in: 17 wallclock secs ( 0.30 usr + 0.10 sys = 0.40 CPU) or: 95 wallclock secs ( 0.41 usr + 0.06 sys = 0.47 CPU)

    Note. There is no possible comparison with UNION, because it would return several rows, which should be added up in the client.
    My test query is only asking for a COUNT(*), thus returning only one row.

     _  _ _  _  
    (_|| | |(_|><
     _|   
    

      I am caching the $sth which are generated with a $dbh->prepare_cached( $sql ) and these get used over and over again (it is a long running daemon not a CGI) thus the desire to avoid the join syntax - as well as the quoting issues. I have read up on IN which is not (currently) part of my limited SQL syntax and it looks like just the ticket. I'll benchmark it tomorrow. Thanks.

      cheers

      tachyon

      s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

        Note that prepared queries are only supported by MySQL since 4.1.0. Moreover I'm not sure DBD::mysql supports new API for prepared queries. It is very likely that prepare_cached in your case doesn't give any perfomance benefits over raw prepare.

        --
        Ilya Martynov, ilya@iponweb.net
        CTO IPonWEB (UK) Ltd
        Quality Perl Programming and Unix Support UK managed @ offshore prices - http://www.iponweb.net
        Personal website - http://martynov.org

        I'm not sure about MySQL, but some optimizers will just interpret IN as a set of OR clauses. If the number of choices is fixed, another alternative is to do three queries, or, to also keep the number of round-trips to the database down, form a UNION of them:

        SELECT * FROM table WHERE key = ? UNION ALL SELECT * FROM table WHERE key = ? UNION ALL SELECT * FROM table WHERE key = ?
        YMMV greatly; measuring is the only way to tell whether this will help or hurt.

Re: MySQL Select Speed Optimisation
by simon.proctor (Vicar) on Mar 27, 2003 at 11:00 UTC
    I don't know if this is of any help but the query optimisation docs are here. Specifically you could use the explain syntax to determine if indexes are valid and perhaps use analyze table as well. The explain method would also allow you to compare and optimise different types of queries.

    There are other little snippets like running OPTIMIZE TABLE if you delete lots of rows (and you are using a MyISAM table).

    HTH

    SP
Re: MySQL Select Speed Optimisation
by dws (Chancellor) on Mar 27, 2003 at 17:31 UTC
    I have a table in a MySQL database that I need to optimise the speed of searching on.

    It looks like gmax nailed your specific question, so I'll toss in a bit of general advice. For playing in the SQL optimization sandbox, I've found Gulutzan and Pelzer's SQL Performance Tuning to be quite useful. It discusses the general theory of how RDBMSs work under the covers (including details on cost-based query optimization and the performance characteristics of various table implementations), then contrasts how 8 different RDMBSes (including MySQL, Oracle, and SQL Server) respond to various query adjustments and optimizations. It's great fodder for reasoning through performance problems.

Re: MySQL Select Speed Optimisation
by davis (Vicar) on Mar 27, 2003 at 09:55 UTC

    Update: This isn't what you asked: I misread your question.

    From the MySQL docs:

    mysql> SHOW INDEX FROM mytable FROM mydb; mysql> SHOW INDEX FROM mydb.mytable;
    Running this myself shows it does do an index automatically
    cheers

    davis
    It's not easy to juggle a pregnant wife and a troubled child, but somehow I managed to fit in eight hours of TV a day.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://246196]
Approved by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (1)
As of 2024-04-25 03:40 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found