Where the advice to use DBI bind parameters can go wrong (long)by dws (Chancellor)
|on Jun 14, 2002 at 19:46 UTC||Need Help??|
Every few days somebody wanders in with a DBI problem involving hand-built queries. And just as often, someone will respond with the suggestion that they use bind parameters. This is almost always good advice, though it sets up a landmine for the unlucky soul who tries to bind a value for a LIKE clause.
Here's the problem: Parameters in SQL queries let you do two things. First, they let you avoid the details of quoting string values*, and avoid having to escape any characters within a string that the RDMBS wants to have escaped. This lets you avoid the problem of doing $sth->prepare("SELECT * FROM companies WHERE name = '$name'"); where $name happens to be "O'Reilly". Oops. Malformed query.
Rewriting this as $sth->prepare("SELECT * FROM companies WHERE name = ?"); lets DBI/DBD handle the quoting when a value for name is supplied at execute() time. You don't have to worry about the details, including portability details in the event that your code is set down atop a different RDBMS with different quoting rules.
The second benefit is performance, if you happen to be using a database that does its query parsing and execution at prepare() time. (Oracle does, MySQL doesn't (yet)). It takes time for a database to parse a query and figure out how to handle it (depending on the query, indexes can be used or not, and the order in which tables are scanned can matter). If you can take this hit once, you're almost always better off.
Here's the landmine.
Say you've coded up a search page that allows the user to specify the start of a name. You'll report on all matches. Instead of writing $sth->prepare("SELECT name, url FROM table WHERE name LIKE '$name%'); you've taking the standard advice, and have instead written
Knowing a bit about databases, you've built a separate index on table.name so that the search will go fast. This seems to work fine when you test, and so you ship. Eventually a customer gripes that your system is really slooooow. So slow that their browsers are timing out. They paid a bunch of money for Oracle, and (hopefully?) a bunch of money for your software, and are very, very unhappy.
What's going on?
Eventually a DBA looks at the query plan that Oracle is using for your query, and notices that it is peforming a linear scan of table, which, for this customer, is Very Big. You didn't notice this because your test data wasn't anywhere near as large (a problem worth its own rant). But why the linear scan? You have an index on the table.name. What's going on?
The problem is this: When the Oracle prepared the query and examined the LIKE clause, it could tell that there was an index on table.name. If Oracle had 'smith%' in hand at that point, it would have determined that the index was safe to use ('smith' could be used as a partial key). But without the argument in hand, the Oracle's query planner had to make a worst-case assumption. Worse case means a linear scan. Linears scans of large data tables suck.
If you'd built the query by hand, Oracle would have used your index, and the query would have been quick. But, since you took the standard advice and used bind parameters, performance sucked.
Now you know.
*crazyinsomniac reminds me that you can use DBI::quote() when hand building queries.