Improving on MS-SQL over DBI and DBD::ODBC

radiantmatrix has asked for the wisdom of the Perl Monks concerning the following question:

To my brothers and sisters in the Monestary, whom I love; may you have peace, and may your code be free of errors.

I seek your wisdom on the matter of MS-SQL access in Perl. I have been using DBI with DBD::ODBC with success; however, ODBC is somewhat slow, and I do not require the portability it offers as my applications will only ever connect to MS-SQL (it is a corporate standard).

I seek a faster way to work with MS-SQL server resources, preferrably still using a DBI interface. Or, perhaps, some wise Monk can point out a resource that will help me optimize DBD::ODBC for MS-SQL Servers, especially over a slow WAN link (think ISDN).

Lest I be lax in providing the details required, I am using:

ActivePerl 5.8.4.810
The latest DBI and DBD::ODBC from ASPN
MS SQL Server 2000 SP3

As an example of my current methodology, I do this:

use warnings;
use strict;
use Statements; # Internal Module

use DBI; require DBD::ODBC;

my $dbh = DBI->connect('DBI:ODBC:goober')
   or die "Unable to connect to Goober";  #Use NT Auth
$dbh->{PrintError} = 0;
$dbh->{AutoCommit} = 0;
$dbh->{RaiseError} = 1;
# We don't pass these to connect, because SQL Server seems to
# randomly ignore them unless we do as above.

my $sth;
eval { $sth = $dbh->prepare( $QUERY{auth} ) }; if ($@) {
   die "Could not prepare $QUERY{auth}: $DBI::errstr ($@)";
}

while (my $row = $sth->fetchrow_hashref) {
   #do things...
}
[download]

Most of my code requires that I open a connection and then perform several SELECT, INSERT and/or DELETE statements; so, any optimization must have gains for all of those. Update: I do already realize savings by opening only one connection, then using it for all my statements.:Update

I humbly thank the Monestary for its wisdom.

radiantmatrix
require General::Disclaimer;
Perl is

Comment on Improving on MS-SQL over DBI and DBD::ODBC Download Code

Replies are listed 'Best First'.
Re: Improving on MS-SQL over DBI and DBD::ODBC by mpeppler (Vicar) on Nov 24, 2004 at 15:28 UTC
Usually, database program slowness is due to the database (i.e. the SQL code executing on the server) and not on the access method - so the first thing I'd do is verify that your SQL statements perform correctly (correct query plans, good index usage, etc). The second thing, given a slow network connection, is to keep as much of the operations on the server. Avoid pulling data from the server for processing in perl and then reinserting the data on to the server - instead write SQL to do this so that the data stays on the server. Michael	[reply]
Re^2: Improving on MS-SQL over DBI and DBD::ODBC by bprew (Monk) on Nov 24, 2004 at 16:33 UTC
I would agree with mpeppler. In my experience, you can usually make the application perform more work on the database, and this can reduce the amount of data you're pushing around, especially over an ISDN line. Also, I don't know how quickly you can connect to the database, but I've seen new connections to an oracle database take up to 20 seconds. So if you have multiple connections in your program, it might be worthwhile to "cache" the database handle somewhere and then only connect once. Unfortunately, I don't know much about the application overall, so its hard to suggest architectural changes when one doesn't know the architecture to begin with :). Finally, most of the time I find myself attempting to optimize something that's not a problem. Do you have mathmatical evidence (ie hard numbers) that show that the ODBC driver is solely the cause of your performance issues? I'm not saying that you're not correct, but I just want to make sure you're not trying to solve the wrong problem. -- Ben	[reply]
Re^3: Improving on MS-SQL over DBI and DBD::ODBC by mpeppler (Vicar) on Nov 24, 2004 at 16:52 UTC
Also, I don't know how quickly you can connect to the database, but I've seen new connections to an oracle database take up to 20 seconds. MS-SQL (and Sybase) is pretty good in that respect - creating a new connection is a relatively fast operation, though that doesn't invalidate your point - reconnecting over and over is not a good thing (especially over a slow link) Michael	[reply]
Re^3: Improving on MS-SQL over DBI and DBD::ODBC by radiantmatrix (Parson) on Nov 24, 2004 at 19:00 UTC
Hm, thank you for the SQL construction suggestions. I have tried to push as much as possible onto the server, but I'm not an SQL expert; perhaps I should find one to take a look at my statements. Unfortunately, I'm not sure there's a whole lot of optimization I can do in this manner, as a large portion of my work is to get DB records and compare them with text files or other data sources. I don't ever open multiple connections, except when there are multiple instances of my application (which can't be helped), so that isn't the cause of my woes. One db handle is enough, thank you. ;-) I can't prove that it's the ODBC driver that's the problem. I do know that performing an identical query using a third-party SQL tool is much faster (even when I account for connection times). I also know that native access methods are often faster as opposed to ODBC, so I'm hoping to find a DBD driver that uses MS-SQL natively. I can at least benchmark, then. Finally, while my performance is "acceptable" to my users and my employer, I consider it borderline; I'm wanting to refactor before adding yet more DB things. I wish I could just post the code, but NDA and all that... so please ask me anything you need to help you formulate a starting point for me to research. radiantmatrix `require General::Disclaimer;` Perl is	[reply]
Re^4: Improving on MS-SQL over DBI and DBD::ODBC by mpeppler (Vicar) on Nov 24, 2004 at 20:01 UTC
Re^4: Improving on MS-SQL over DBI and DBD::ODBC by bprew (Monk) on Nov 25, 2004 at 22:24 UTC
Re: Improving on MS-SQL over DBI and DBD::ODBC by runrig (Abbot) on Nov 24, 2004 at 17:22 UTC
fetchrow_hashref is known to be the slowest way to fetch rows. I like this example from the DBI docs when there could be alot of rows: `$sth->execute; my %row; $sth->bind_columns( \( @row{ @{$sth->{NAME_lc} } } )); while ($sth->fetch) { print "$row{region}: $row{sales}\n"; }` [download]	[reply] [d/l]
Re^2: Improving on MS-SQL over DBI and DBD::ODBC by radiantmatrix (Parson) on Nov 24, 2004 at 19:04 UTC
Hm... I will have to analyze this. I often work with 60k+ row counts. Thank you! radiantmatrix `require General::Disclaimer;` Perl is	[reply]
Re^3: Improving on MS-SQL over DBI and DBD::ODBC by jZed (Prior) on Nov 24, 2004 at 19:11 UTC
The number of rows won't matter, this loops over the rows individually, just as fetchrow_hashref() does. The difference is that bind_cols() takes away some of the hash-creation overhead. The DBI docs unambiguously state that it is faster than fetchrow_hashref().	[reply]
Re: Improving on MS-SQL over DBI and DBD::ODBC by VSarkiss (Monsignor) on Nov 24, 2004 at 15:47 UTC
If you're on Win32, one alternative is DBD::ADO. There are some different limitations versus ODBC, but overall it works well. That said, I have to echo what others have also written. It's very unlikely that any slowness you're experiencing is from DBI or any of the DBD layers. The culprit is most often a data model that's unsuitable for what you're trying to do, or code that doesn't work well with a given model (doing row-by-row processing is often a problem, for instance). It's hard to answer your question in general terms. You may want to post some code and/or design ideas, and ask for specifics. Do not rebuke them with harsh words ... but rather lead them gently - with URLs - so that they may learn wisdom.	[reply]
Re: Improving on MS-SQL over DBI and DBD::ODBC by Grygonos (Chaplain) on Nov 24, 2004 at 15:25 UTC
well most of what I'm going to say is standard DBI fare. If you're going to be inserting multiple records into the same table a structure like this is preferred for speed and security. `my %items_to_insert = (item1=>1, item2=>2); $statement = "INSERT INTO myTbl (itemName,itemNumber) VALUES(?,?); my $sth = $dbh->prepare($statement); foreach my $key (keys(%items_to_insert)) { $sth->execute($key,$items_to_insert{$key}); }` [download] That prevents the statement from being repeatedly prepare by the database handle. I would say that one call to a `fetch_all` type statement in DBI rather than individual row fetching would be preferred as well. You minimize the total network traffic that way (correct me if wrong here) So in summary, my big points of advice for DBI in this case are use placeholders so you only have to prepare the INSERT or DELETE statements once. Use `fetchall` methods in lieu of `fetchrow` Grygonos	[reply] [d/l] [select]
Re^2: Improving on MS-SQL over DBI and DBD::ODBC by radiantmatrix (Parson) on Nov 24, 2004 at 18:53 UTC
Of course I already use placeholders and such, but I typically only run a given SELECT statement once in this application, so it doesn't help much. As to the `fetchall` recommendation, I would usually agree; but I occasionally recieve in excess of 60,000 records and do so with several instances. Resources considerations become important. radiantmatrix `require General::Disclaimer;` Perl is	[reply]
Re: Improving on MS-SQL over DBI and DBD::ODBC by MrCromeDome (Deacon) on Nov 24, 2004 at 15:48 UTC
I haven't worked much with it, but have you considered using DBD::ADO as an alternative? Reasoning being that ADO is a Microsoft-created data access protocol, and therefore may be somewhat more speedy than using ODBC. And by using DBD::ADO, you could keep all of your existing code, and all future code benefits from what DBI provides. Give it a shot, you have nothing to lose. Wish I had some benchmarks for you to confirm my theories ;) Good luck! MrCromeDome	[reply]


Pathologically Eclectic Rubbish Lister
	PerlMonks