How to improve MYSQL search performance of perl?

nan has asked for the wisdom of the Perl Monks concerning the following question:

Re: How to improve MYSQL search performance of perl? by radiantmatrix (Parson) on Aug 18, 2005 at 15:24 UTC
This is not a Perl question but a DB-optimization question (unless you want to write Perl code to intelligently combine queries and extract the data back out, but that would probably still be slow).

Make sure you have normalized your DB for the type of queries you wish to perform, and index the columns you are searching on. For example, if you have a table with the columns ID, Name, Age, Details, SSN, and Record_file, and you commonly search for records by Name and SSN, create indexes on those two columns. Intelligent use of indexes, along with proper normalization, yields huge speed gains in many circumstances.

<-radiant.matrix->
Larry Wall is Yoda: there is no `try{}` (ok, except in Perl6; way to ruin a joke, Larry! ;P)
The Code that can be seen is not the true Code
"In any sufficiently large group of people, most are idiots" - Kaa's Law
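As a rough sketch of the indexing advice above: the table name `people` and the index names are hypothetical (the post only lists the columns), and `$dbh` is assumed to be an already connected DBI handle. It simply issues one `CREATE INDEX` statement per commonly searched column.

```perl
use strict;
use warnings;

# Hypothetical table name; the post only lists the columns.
my $TABLE = 'people';

# Create one index per commonly searched column (Name and SSN).
# $dbh is assumed to be a connected DBI database handle.
sub add_search_indexes {
    my ($dbh) = @_;
    for my $column (qw(Name SSN)) {
        # Index names are made up for illustration, e.g. idx_people_name.
        my $index = lc "idx_${TABLE}_$column";
        $dbh->do("CREATE INDEX $index ON $TABLE ($column)");
    }
    return;
}

# Usage, with a handle obtained from DBI->connect(...):
#   add_search_indexes($dbh);
```

Indexes only need to be created once, so a helper like this would normally be run as a one-off setup step rather than on every search.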
Re: How to improve MYSQL search performance of perl? by davidrw (Prior) on Aug 18, 2005 at 15:27 UTC
My initial thought is that you need to examine your queries, your table structure, and your indices. 500 MB isn't all that much if the data is indexed properly. Can you provide some sample queries and the schema?
Re^2: How to improve MYSQL search performance of perl? by nan (Novice) on Aug 19, 2005 at 15:32 UTC
Hi, As I need to read the input line by line and search for each line in the database, I used a subroutine to handle all the database work. Below is my code (subroutine, 869 bytes; not included here). Thanks again, Nan
Re^3: How to improve MYSQL search performance of perl? by davidrw (Prior) on Aug 19, 2005 at 17:00 UTC
I would take trammell's suggestion a step further: don't recreate the statement handle every time either, and actually take advantage of the statement handle (and placeholders). I think this will give a decent performance improvement (the amount of gain is probably DB-dependent):

```perl
use strict;
use warnings;
use DBI;

# Connect once. The attribute hash must be the fourth argument to connect().
my $dbh = DBI->connect(
    'DBI:mysql:diet',
    undef, undef,    # username and password: supply your own
    { RaiseError => 1, AutoCommit => 0 },
) or die "Failed to connect: $DBI::errstr";

# Prepare once, with a placeholder for the search term.
my $sth = $dbh->prepare(q{SELECT topic FROM table1 WHERE uri LIKE ?});

search($sth, 'foo');
search($sth, 'bar');

$sth->finish();
$dbh->disconnect();    # disconnect from database

sub search {
    my $sth = shift;   # require statement handle (this could probably be a global var instead if desired)
    my $q   = shift;   # take search parameter from html <form/>

    $sth->execute($q);
    my $rows = $sth->fetchall_arrayref( {} );    # one hash ref per row
    printf "%d rows found for '%s'.\n", scalar(@$rows), $q;
    foreach my $row (@$rows) {
        # topic() is assumed to be defined elsewhere in the script (not shown in this thread)
        printf "  Topic: %s\n", topic($row->{topic});
    }
}
```
Re^4: How to improve MYSQL search performance of perl? by nan (Novice) on Aug 24, 2005 at 17:06 UTC
Re^5: How to improve MYSQL search performance of perl? by davidrw (Prior) on Aug 24, 2005 at 17:17 UTC
Re^3: How to improve MYSQL search performance of perl? by trammell (Priest) on Aug 19, 2005 at 15:55 UTC
One improvement you can make is to open your database handle only once, at the beginning of the script, and reuse that handle instead of recreating it for each query.
Re: How to improve MYSQL search performance of perl? by Anonymous Monk on Aug 18, 2005 at 15:57 UTC
500 MB is a meaningless measurement if you want to get a feel for whether your database has many records to search through. It's meaningful when determining disk space needs, or when doing backups, but not when you want to indicate that you have a lot to search through. A 500 MB database in a radiology lab probably means it has only one row, with a small image stored in it. The number of rows a query might consider, now, that's an important measurement. The size of a row, both in columns (used in the query) and in total bytes, is important as well, but much less so.

Having said that, 500 MB is tiny by modern standards. Most desktops, and even many laptops, will be able to keep almost the entire database in core memory. If you have a dedicated machine for your database (and you should), put in 1 GB of RAM and you'll be sure you have the entire database in memory.

But even if you have that, your approach can still be "slow". Whether or not that can be significantly improved depends almost entirely on your database structure (tables, indices) and the queries performed. If the queries could be almost anything, there will be many queries that cannot make use of the given indices, resulting in table scans. And even with the entire database in core memory, having to do many table scans will slow things down.

But as others said, this is mostly a database question. Consult your local database administrator/guru.
Re: How to improve MYSQL search performance of perl? by trammell (Priest) on Aug 18, 2005 at 16:52 UTC
There is great online MySQL documentation available at http://dev.mysql.com/doc/, specifically a chapter on optimizing queries at http://dev.mysql.com/doc/mysql/en/query-speed.html.
Re^2: How to improve MYSQL search performance of perl? by nan (Novice) on Aug 19, 2005 at 15:40 UTC
Hi, Many, many thanks for that great article, but a question popped up after reading it. It says that MySQL will build indexes for the whole table when calling "CREATE TABLE .....". I suppose that means I don't need to build an index myself (my table has only two columns). OK, even if I build an index myself, how can it be used? Thanks again, Nan
[OT] Re^3: How to improve MYSQL search performance of perl? by trammell (Priest) on Aug 19, 2005 at 16:07 UTC
MySQL will choose the appropriate index for the tables involved in your query; in my experience it chooses correctly most of the time. I see from another post in this thread that your query is: `select topic FROM table1 WHERE uri LIKE '$q'`. You can find out which indexes MySQL uses for this query by running the command `EXPLAIN SELECT topic FROM table1 WHERE uri LIKE 'something'`, where "something" is one of your parameters. You can see which indexes are defined on your table by running the command `SHOW CREATE TABLE table1;`
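The same check can be run from a script instead of the MySQL client. This is only a sketch: the helper name `explain_summary` is made up for illustration, and `$dbh` is assumed to be a connected DBI handle for a MySQL database. It prefixes the query with `EXPLAIN` and reports a few columns of the plan MySQL returns.

```perl
use strict;
use warnings;

# Run EXPLAIN on a SELECT statement and summarize each row of the plan.
# $dbh is assumed to be a connected DBI handle for a MySQL database.
sub explain_summary {
    my ($dbh, $select_sql) = @_;

    # Slice => {} makes selectall_arrayref return one hash ref per row.
    my $plan = $dbh->selectall_arrayref("EXPLAIN $select_sql", { Slice => {} });

    my @lines;
    for my $row (@$plan) {
        # SQL NULL arrives as undef; print it as the string 'NULL'.
        my @fields = map { defined $row->{$_} ? $row->{$_} : 'NULL' }
                     qw(table type possible_keys key);
        push @lines, sprintf 'table=%s type=%s possible_keys=%s key=%s', @fields;
    }
    return @lines;
}

# Usage, with a handle obtained from DBI->connect(...):
#   print "$_\n"
#       for explain_summary($dbh, "SELECT topic FROM table1 WHERE uri LIKE 'something'");
```

In the output, a `key` of NULL means no index was used for that table, and a `type` of ALL indicates a full table scan.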
Re: How to improve MYSQL search performance of perl? by CountZero (Bishop) on Aug 18, 2005 at 19:51 UTC
Are all these queries similar to each other? I mean, is it like:

```sql
SELECT * FROM table WHERE field = 1
SELECT * FROM table WHERE field = 10
SELECT * FROM table WHERE field = 75
SELECT * FROM table WHERE field = 3
SELECT * FROM table WHERE field = 8
...
```

If that is the case, you could probably benefit from using placeholders and using `$sth = $dbh->prepare($statement)` or `$sth = $dbh->prepare_cached($statement)`.

CountZero
"If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law
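A minimal sketch of the `prepare_cached` variant, using the query quoted elsewhere in this thread. The helper name `topics_for_uri` is hypothetical, and `$dbh` is assumed to be a connected DBI handle. Because `prepare_cached` hands back the same statement handle for identical SQL on the same `$dbh`, the subroutine can be called once per input line without preparing the statement again each time.

```perl
use strict;
use warnings;

# Look up the topics stored for one URI.
# $dbh is assumed to be a connected DBI database handle.
sub topics_for_uri {
    my ($dbh, $uri) = @_;

    # Prepared on the first call, fetched from DBI's cache afterwards.
    my $sth = $dbh->prepare_cached('SELECT topic FROM table1 WHERE uri LIKE ?');

    # The placeholder value is quoted by the driver, not by hand.
    $sth->execute($uri);

    # Fetch every row as a hash ref, then keep only the topic column.
    my $rows = $sth->fetchall_arrayref( {} );
    return map { $_->{topic} } @$rows;
}

# Usage, with a handle obtained from DBI->connect(...):
#   print "$_\n" for topics_for_uri($dbh, 'http://www.example.org/');
```

Fetching all rows before returning leaves the handle inactive, which matters because `prepare_cached` warns if it is asked to reuse a handle that is still active.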
Re^2: How to improve MYSQL search performance of perl? by nan (Novice) on Aug 19, 2005 at 15:27 UTC
Hi CountZero, They are all URLs, for example: http://www.permonks.org/. What I did before was to put the database work in a subroutine and call it every time a new line is read, as I don't know how to optimize the code (subroutine, 869 bytes; not included here). Thanks again, Nan
Re^3: How to improve MYSQL search performance of perl? by CountZero (Bishop) on Aug 19, 2005 at 18:32 UTC
I see why it is so slow: for every search you are effectively opening a connection, searching for one item, and then destroying the connection. All this connecting and disconnecting is very time-consuming. You should put your connection code in an initialization subroutine, then `prepare` your SQL statement once, using placeholders, as follows: `my $sth = $dbh->prepare('select topic FROM table1 WHERE uri LIKE ?');` (added benefit: you don't have to worry about quoting!). Then hand off the `$sth` variable and the search argument to your search subroutine, which calls the `execute` method with the search string as its parameter:

```perl
my ($statement_handle, $search_argument) = @_;
$statement_handle->execute($search_argument);
...
```

Do you get the idea?

CountZero
Re^4: How to improve MYSQL search performance of perl? by nan (Novice) on Aug 24, 2005 at 17:11 UTC
Re^5: How to improve MYSQL search performance of perl? by CountZero (Bishop) on Aug 24, 2005 at 18:29 UTC