in reply to A profiling surprise ...

Yes, the bottleneck is database interaction.

As you continue to write Perl programs that interact with a database, this will cease to surprise you.

You're on the right track: keep thinking of ways to shift the burden away from the database.

Re^2: A profiling surprise ...
by mr_mischief (Monsignor) on May 23, 2008 at 21:00 UTC
    More specifically, shift the IO burden away from the connection between your program and the database.

    • Never ask for more data from the database than your code will actually use. Let the DB winnow it down and return just what is needed. That means using proper queries with WHERE clauses and, if applicable, LIMIT clauses.
    • Don't fetch unsorted rows and then sort them in your own code. Ask for them with ORDER BY, as databases are good at sorting quickly.
    • Use placeholders instead of building new queries from scratch by concatenation, as that helps the DB's execution engine minimize its work. More importantly, placeholders make SQL injection attacks much less likely.
    • If you're going to search on a particular column a lot, index it. If you're not searching on it much but you're going to sort on it fairly often, index it anyway.
    • Select a row by primary key if you know it and want just that row.
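
    The points above about WHERE, LIMIT, sorting, and placeholders can be sketched in one query. This is only an illustration under stated assumptions: it uses Python's sqlite3 module with an in-memory database so that it runs on its own, and the hits table, its columns, and the slowest_hits helper are all made up. With Perl's DBI the same shape applies: prepare the statement with ? placeholders and pass the values to execute.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (id INTEGER PRIMARY KEY, page TEXT, ms INTEGER)")
conn.executemany(
    "INSERT INTO hits (page, ms) VALUES (?, ?)",
    [("/home", 120), ("/search", 950), ("/home", 80), ("/search", 400), ("/about", 30)],
)


def slowest_hits(conn, page, limit):
    """Return at most `limit` (id, ms) rows for `page`, slowest first."""
    sql = (
        "SELECT id, ms FROM hits "  # ask only for the columns we use
        "WHERE page = ? "           # let the DB winnow the rows
        "ORDER BY ms DESC "         # let the DB do the sorting
        "LIMIT ?"                   # cap how much comes back
    )
    # Values travel separately from the SQL text, so they are never parsed as SQL.
    return conn.execute(sql, (page, limit)).fetchall()


rows = slowest_hits(conn, "/search", 1)
# A hostile-looking value is treated as an ordinary string and matches nothing.
hostile = slowest_hits(conn, "/home' OR '1'='1", 10)
print(rows, hostile)
```

    Because the query text never changes, the database can reuse its parsed form across calls, and only the handful of rows you actually need cross the connection.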

    I'm sure there are other monks with even better advice, but these will help quite a bit.

    UPDATE: fixed thinko s/WHEN/WHERE/ (thanks, kyle, for pointing that out in a msg).

      For the last point I'd rather say: select a row by the clustered index if you know the value, it's unique, and you want just that row. I'm not sure other databases use the same terminology as MSSQL, but in MSSQL the clustered index is the index that controls the order in which the data are actually stored, so it's the fastest one to use. And it doesn't have to be the primary key.
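
      If you want to see which access path your database actually picks, ask it for the query plan. Here is a rough sketch, not MSSQL: it assumes SQLite through Python's sqlite3 module so that it is self-contained, and the users table, the users_email index, and the plan helper are invented for the example. In SQLite an INTEGER PRIMARY KEY is the key the table itself is stored under, which is the closest analogue to a clustered index here. The exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

# Hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)


def plan(conn, sql, params):
    """Return the planner's description of how `sql` would be executed."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable detail text.
    return " | ".join(row[-1] for row in rows)


by_email = "SELECT * FROM users WHERE email = ?"

before = plan(conn, by_email, ("user42@example.com",))   # no index yet: full scan
conn.execute("CREATE INDEX users_email ON users (email)")
after = plan(conn, by_email, ("user42@example.com",))    # now a search via the index
by_key = plan(conn, "SELECT * FROM users WHERE id = ?", (42,))  # direct key lookup

print(before, after, by_key, sep="\n")
```

      MSSQL, MySQL, and PostgreSQL each have their own way of showing a plan, but the habit is the same: check the plan before and after adding an index rather than guessing.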

      I agree completely with the rest!