Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister
 
PerlMonks  

Improving on MS-SQL over DBI and DBD::ODBC

by radiantmatrix (Parson)
on Nov 24, 2004 at 14:51 UTC ( [id://410139]=perlquestion: print w/replies, xml ) Need Help??

radiantmatrix has asked for the wisdom of the Perl Monks concerning the following question:

To my brothers and sisters in the Monestary, whom I love; may you have peace, and may your code be free of errors.

I seek your wisdom on the matter of MS-SQL access in Perl. I have been using DBI with DBD::ODBC with success; however, ODBC is somewhat slow, and I do not require the portability it offers as my applications will only ever connect to MS-SQL (it is a corporate standard).

I seek a faster way to work with MS-SQL server resources, preferrably still using a DBI interface. Or, perhaps, some wise Monk can point out a resource that will help me optimize DBD::ODBC for MS-SQL Servers, especially over a slow WAN link (think ISDN).

Lest I be lax in providing the details required, I am using:

  1. ActivePerl 5.8.4.810
  2. The latest DBI and DBD::ODBC from ASPN
  3. MS SQL Server 2000 SP3

As an example of my current methodology, I do this:

use warnings; use strict; use Statements; # Internal Module use DBI; require DBD::ODBC; my $dbh = DBI->connect('DBI:ODBC:goober') or die "Unable to connect to Goober"; #Use NT Auth $dbh->{PrintError} = 0; $dbh->{AutoCommit} = 0; $dbh->{RaiseError} = 1; # We don't pass these to connect, because SQL Server seems to # randomly ignore them unless we do as above. my $sth; eval { $sth = $dbh->prepare( $QUERY{auth} ) }; if ($@) { die "Could not prepare $QUERY{auth}: $DBI::errstr ($@)"; } while (my $row = $sth->fetchrow_hashref) { #do things... }

Most of my code requires that I open a connection and then perform several SELECT, INSERT and/or DELETE statements; so, any optimization must have gains for all of those. Update: I do already realize savings by opening only one connection, then using it for all my statements.:Update

I humbly thank the Monestary for its wisdom.


radiantmatrix
require General::Disclaimer;
Perl is

Replies are listed 'Best First'.
Re: Improving on MS-SQL over DBI and DBD::ODBC
by mpeppler (Vicar) on Nov 24, 2004 at 15:28 UTC
    Usually, database program slowness is due to the database (i.e. the SQL code executing on the server) and not on the access method - so the first thing I'd do is verify that your SQL statements perform correctly (correct query plans, good index usage, etc).

    The second thing, given a slow network connection, is to keep as much of the operations on the server. Avoid pulling data from the server for processing in perl and then reinserting the data on to the server - instead write SQL to do this so that the data stays on the server.

    Michael

      I would agree with mpeppler. In my experience, you can usually make the application perform more work on the database, and this can reduce the amount of data you're pushing around, especially over an ISDN line.

      Also, I don't know how quickly you can connect to the database, but I've seen new connections to an oracle database take up to 20 seconds.

      So if you have multiple connections in your program, it might be worthwhile to "cache" the database handle somewhere and then only connect once. Unfortunately, I don't know much about the application overall, so its hard to suggest architectural changes when one doesn't know the architecture to begin with :).

      Finally, most of the time I find myself attempting to optimize something that's not a problem. Do you have mathmatical evidence (ie hard numbers) that show that the ODBC driver is solely the cause of your performance issues? I'm not saying that you're not correct, but I just want to make sure you're not trying to solve the wrong problem.


      --
      Ben
        Also, I don't know how quickly you can connect to the database, but I've seen new connections to an oracle database take up to 20 seconds.
        MS-SQL (and Sybase) is pretty good in that respect - creating a new connection is a relatively fast operation, though that doesn't invalidate your point - reconnecting over and over is not a good thing (especially over a slow link)

        Michael

        Hm, thank you for the SQL construction suggestions. I have tried to push as much as possible onto the server, but I'm not an SQL expert; perhaps I should find one to take a look at my statements. Unfortunately, I'm not sure there's a whole lot of optimization I can do in this manner, as a large portion of my work is to get DB records and compare them with text files or other data sources.

        I don't ever open multiple connections, except when there are multiple instances of my application (which can't be helped), so that isn't the cause of my woes. One db handle is enough, thank you. ;-)

        I can't prove that it's the ODBC driver that's the problem. I do know that performing an identical query using a third-party SQL tool is much faster (even when I account for connection times). I also know that native access methods are often faster as opposed to ODBC, so I'm hoping to find a DBD driver that uses MS-SQL natively. I can at least benchmark, then.

        Finally, while my performance is "acceptable" to my users and my employer, I consider it borderline; I'm wanting to refactor before adding yet more DB things.

        I wish I could just post the code, but NDA and all that... so please ask me anything you need to help you formulate a starting point for me to research.


        radiantmatrix
        require General::Disclaimer;
        Perl is

Re: Improving on MS-SQL over DBI and DBD::ODBC
by runrig (Abbot) on Nov 24, 2004 at 17:22 UTC
    fetchrow_hashref is known to be the slowest way to fetch rows. I like this example from the DBI docs when there could be alot of rows:
    $sth->execute; my %row; $sth->bind_columns( \( @row{ @{$sth->{NAME_lc} } } )); while ($sth->fetch) { print "$row{region}: $row{sales}\n"; }

      Hm... I will have to analyze this. I often work with 60k+ row counts. Thank you!


      radiantmatrix
      require General::Disclaimer;
      Perl is

        The number of rows won't matter, this loops over the rows individually, just as fetchrow_hashref() does. The difference is that bind_cols() takes away some of the hash-creation overhead. The DBI docs unambiguously state that it is faster than fetchrow_hashref().
Re: Improving on MS-SQL over DBI and DBD::ODBC
by VSarkiss (Monsignor) on Nov 24, 2004 at 15:47 UTC

    If you're on Win32, one alternative is DBD::ADO. There are some different limitations versus ODBC, but overall it works well.

    That said, I have to echo what others have also written. It's very unlikely that any slowness you're experiencing is from DBI or any of the DBD layers. The culprit is most often a data model that's unsuitable for what you're trying to do, or code that doesn't work well with a given model (doing row-by-row processing is often a problem, for instance).

    It's hard to answer your question in general terms. You may want to post some code and/or design ideas, and ask for specifics.

Re: Improving on MS-SQL over DBI and DBD::ODBC
by Grygonos (Chaplain) on Nov 24, 2004 at 15:25 UTC

    well most of what I'm going to say is standard DBI fare. If you're going to be inserting multiple records into the same table a structure like this is preferred for speed and security.

    my %items_to_insert = (item1=>1, item2=>2); $statement = "INSERT INTO myTbl (itemName,itemNumber) VALUES(?,?); my $sth = $dbh->prepare($statement); foreach my $key (keys(%items_to_insert)) { $sth->execute($key,$items_to_insert{$key}); }
    That prevents the statement from being repeatedly prepare by the database handle.

    I would say that one call to a fetch_all type statement in DBI rather than individual row fetching would be preferred as well. You minimize the total network traffic that way (correct me if wrong here)

    So in summary, my big points of advice for DBI in this case are

    • use placeholders so you only have to prepare the INSERT or DELETE statements once.
    • Use fetchall methods in lieu of fetchrow

      Of course I already use placeholders and such, but I typically only run a given SELECT statement once in this application, so it doesn't help much.

      As to the fetchall recommendation, I would usually agree; but I occasionally recieve in excess of 60,000 records and do so with several instances. Resources considerations become important.


      radiantmatrix
      require General::Disclaimer;
      Perl is

Re: Improving on MS-SQL over DBI and DBD::ODBC
by MrCromeDome (Deacon) on Nov 24, 2004 at 15:48 UTC
    I haven't worked much with it, but have you considered using DBD::ADO as an alternative? Reasoning being that ADO is a Microsoft-created data access protocol, and therefore may be somewhat more speedy than using ODBC. And by using DBD::ADO, you could keep all of your existing code, and all future code benefits from what DBI provides.

    Give it a shot, you have nothing to lose. Wish I had some benchmarks for you to confirm my theories ;)

    Good luck!
    MrCromeDome

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://410139]
Approved by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others about the Monastery: (8)
As of 2024-04-19 08:18 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found