http://qs321.pair.com?node_id=455124


in reply to DBI speed up needed on MySQL

This is a hard issue to address with so little information, but I can offer a few tips that may help:

* If you can, create a second table in your database that contains just your key column (I am assuming you have one) and the two fields your query filters on. You can even create this on the fly as a temporary table that is never written to disk, and then query that instead. Use the resulting array of key values to fetch the full 200-field records one at a time, looked up by the key, which should be indexed. A sketch of this approach follows below.
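A minimal sketch of that approach, assuming a DBI handle and invented table and column names (big_table, id, status; mw_export comes from the original post), since we haven't seen the schema:

    use strict;
    use warnings;
    use DBI;

    # Connection details are placeholders; substitute your own.
    my $dbh = DBI->connect('dbi:mysql:database=mydb', 'user', 'pass',
                           { RaiseError => 1 });

    # Build a narrow temporary table holding only the key and the two
    # columns the query filters on; ENGINE=MEMORY keeps it off disk.
    $dbh->do(q{
        CREATE TEMPORARY TABLE narrow ENGINE=MEMORY AS
        SELECT id, mw_export, status FROM big_table
    });

    # Collect the keys of the matching records from the narrow table...
    my $ids = $dbh->selectcol_arrayref(
        q{SELECT id FROM narrow WHERE mw_export = 'yes' AND status = 'ok'}
    );

    # ...then fetch the full 200-column rows one at a time by indexed key.
    my $sth = $dbh->prepare(q{SELECT * FROM big_table WHERE id = ?});
    for my $id (@$ids) {
        $sth->execute($id);
        my $row = $sth->fetchrow_hashref;
        # process $row here
    }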

* Don't bother creating an index on mw_export; it will probably hurt your table's performance, especially writes (every write/update has to maintain all affected indexes), and indexes rarely help on columns that contain only one or two possible values, like yes/no. You can verify this with EXPLAIN, as in the sketch below.
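If you want to see what the optimizer actually does, EXPLAIN shows the plan (reusing the $dbh and hypothetical names from the first sketch):

    # A 'type' of ALL means a full table scan; a low-cardinality index
    # will often be ignored by the optimizer even when it exists.
    my $plan = $dbh->selectall_arrayref(
        q{EXPLAIN SELECT * FROM big_table WHERE mw_export = 'yes'},
        { Slice => {} }
    );
    for my $r (@$plan) {
        my $key = defined $r->{key} ? $r->{key} : 'none';
        print "$r->{table}: $r->{type} (key: $key)\n";
    }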

* You didn't specify, but perhaps you're loading a big data file from another source, like a CSV, into a table before processing the records, and that's why the table design is so horrid? If so, don't create the indexes on the table until after you've loaded the data into it. That way each index is built once, not updated every time you insert a record; see the sketch below.
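If that's the situation, the load might look something like this (the file name, delimiters, and key column are all assumptions):

    # Load the raw CSV into the index-free table first...
    $dbh->do(q{
        LOAD DATA LOCAL INFILE '/tmp/export.csv'
        INTO TABLE big_table
        FIELDS TERMINATED BY ',' ENCLOSED BY '"'
        LINES TERMINATED BY '\n'
    });

    # ...and only then build the key's index, in a single pass.
    $dbh->do(q{ALTER TABLE big_table ADD PRIMARY KEY (id)});

With MyISAM tables you can get a similar effect around a batch of ordinary inserts with ALTER TABLE ... DISABLE KEYS / ENABLE KEYS.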

* When you *have* to build a table like this, it gets big, and at this size, with this many fields and records, very small optimizations make a large difference over time. Review all of your column types and make sure you are being as efficient as possible. Easy things to check: field sizes compared to the actual data sizes, and setting NOT NULL on columns that will never contain NULL, which saves you storage and processing each time that value is accessed. Using CHAR instead of VARCHAR avoids the length byte and requires less processing by the server to pull the value, but pads to a fixed width; which one wins depends a lot on your data and the engine you're using on that table. You can pick up a good comprehensive MySQL reference like "MySQL" by Paul DuBois, New Riders, ISBN 0-7357-0921-1 (available online if you're a member of safari.oreilly.com). Find as many ways to optimize your columns and data as possible; an example follows below.
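As an illustration only, tightening a few columns might look like this (the column names, types, and sizes are invented; measure against your actual data first):

    # Fixed-width NOT NULL columns, sized to the data, instead of
    # generous nullable VARCHARs.
    $dbh->do(q{
        ALTER TABLE big_table
            MODIFY mw_export CHAR(3)       NOT NULL DEFAULT 'no',
            MODIFY code      CHAR(10)      NOT NULL DEFAULT '',
            MODIFY amount    DECIMAL(10,2) NOT NULL DEFAULT 0
    });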

* As you process your records one at a time, be sure to retain as little data as possible from the rest of the records in memory. If you take each record, store its hash reference for later access, and move on, you're forcing your script to keep the entire record set in memory. You want to load one record at a time and, when you're done with it, let the garbage collector ditch the data instead of holding onto it by keeping a reference to its hash defined somewhere. See the sketch below.
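In DBI terms the difference is roughly this (process() is a placeholder for whatever per-record work you do):

    my $sth = $dbh->prepare(q{SELECT * FROM big_table});
    $sth->execute;

    # Good: one row in memory at a time. $row can be garbage-collected
    # on each pass because nothing outside the loop keeps a reference.
    while (my $row = $sth->fetchrow_hashref) {
        process($row);
    }

    # Bad: this pulls every row into RAM at once and keeps it there.
    # my $all = $sth->fetchall_arrayref({});

Note that DBD::mysql still buffers the whole result set on the client by default; for a truly huge set, setting mysql_use_result (e.g. $dbh->{mysql_use_result} = 1) streams rows from the server instead, at the cost of holding the server-side result open longer.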

good luck!