Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much
 
PerlMonks  

DBI do vs prepare

by BrianC (Acolyte)
on Aug 13, 2006 at 22:02 UTC ( [id://567119]=perlquestion: print w/replies, xml ) Need Help??

BrianC has asked for the wisdom of the Perl Monks concerning the following question:

Hello all, According to the documentation, it is more efficient to do (I'm paraphrasing):
$sth = $dbh->prepare($sql); foreach (@ValueList) { $sth->execute($_); }
than
foreach (@ValueList) { $dbh->do("Interpolated sql statement"); }
So I'm trying to figure out why, in the following example, Code1, which uses the prepare strategy, benchmarks slower than Code2, which uses repeated dos? BTW, in the actual program the @Result array is different for each element of the @Took array, hence the apparent circumlocution.

Thanks in advance, Brian

#!/usr/bin/perl -w use strict; use Benchmark qw/:all/; my $times = shift; sub UpdateDBIar { my ($arsql) = @_; use DBI; my $dbh = DBI->connect("dbi:mysql:database=SchoolToolsDev;host=loc +alhost;user=stadmin;password=stadmin") or die "Couldn't connect to da +tabase $DBI::errstr\n"; foreach my $sql (@$arsql) { $dbh->do($sql) or die "Couldn't execute sql: $sql $dbh->errstr +"; } } sub UpdateDBIar2 { my ($arargs) = @_; use DBI; my $dbh = DBI->connect("dbi:mysql:database=SchoolToolsDev;host=loc +alhost;user=stadmin;password=stadmin") or die "Couldn't connect to da +tabase $DBI::errstr\n"; foreach my $argset (@$arargs) { my $sql = $argset->[0]; my $artemp = $argset->[1]; my $sth = $dbh->prepare($sql); foreach my $arg (@$artemp) { $sth->execute(join ',', @$arg); } } } my $crCode1 = sub { my @Result = (10734..10845); my @Took = (qw/Baseline Q1 Q2 Q3 Q4/); my $arargs; foreach my $Took (@Took) { my $sql = "UPDATE Tests SET $Took = '1' WHERE ID = ?"; my $artemp; foreach (@Result) { push @$artemp, [$_]; } push @$arargs, [$sql, $artemp]; } UpdateDBIar2($arargs); }; my $crCode2 = sub { my @Result = (10734..10845); my @Took = (qw/Baseline Q1 Q2 Q3 Q4/); my @sqls; foreach my $Took (@Took) { foreach (@Result) { push @sqls, "UPDATE Tests SET $Took = '1' WHERE ID = '$_'" +; } } UpdateDBIar(\@sqls); }; timethese ($times, { Code1 => $crCode1, Code2 => $crCode2, } );

Replies are listed 'Best First'.
Re: DBI do vs prepare
by chromatic (Archbishop) on Aug 13, 2006 at 22:31 UTC
    So I'm trying to figure out why, in the following example, Code1, which uses the prepare strategy, benchmarks slower than Code2, which uses repeated dos?

    Your benchmark code has a severe flaw in that it measures quite a bit of argument packing and unpacking besides the DBI calls.

      Not a clue as to why, but I agree with chromatic about your benchmark. In general whether prepare is worth doing depends on what you are talking to. The logic is that some databases have a very heavy overhead for setting up a query, which is what prepare does. So, for repetitive queries you suffer the pain once only. However, this isn't true for all databases, and you may only benefit if you have a lot of repetitive queries to work on.

      To progress you need a much purer benchmark to test exactly what you plan to do in the real world. This should be on of the last things you do in the course of development. This avoids the curse of Premature optimization!

      !-- Node text goes above. Div tags should contain sig only -->
      -- Anthony Staines

        I'm still not sure how I'd test without supplying the arguments. Perhaps you could give me an example?

        Actually I am trying to optimize an already functioning program. Code2 and UpdateDBIar are from the current program logic, but I thought I'd try something with the prepare strategy to see if it would be faster.

        Brian

      I'm not sure how to test the function without packing parameters. And I certainly can't use it without packing parameters. I guess it doesn't really matter if the "prepare/execute" is faster if the cost of implementation is considerably higher.

        I didn't test this code, but I rewrote it to be a lot more efficient and to copy a lot less information. I expect it to run substantially more quickly.

        sub UpdateDBIar2 { my ($queries, $args) = @_; use DBI; my $dbh = DBI->connect( "dbi:mysql:database=SchoolToolsDev;host=localhost;user=stadmin;passwor +d=stadmin" ) or die "Couldn't connect to database $DBI::errstr\n"; for my $query ( @$queries ) { my $sth = $dbh->prepare($query); for my $arg (@$args) { $sth->execute( $arg ); } } } my $crCode1 = sub { my @result = ( 10734 .. 10845 ); my @queries = map { "UPDATE Tests SET $_ = '1' WHERE ID = ?" } qw( Baseline Q1 Q2 Q3 Q4 ); UpdateDBIar2( \@queries, \@result ); };
Re: DBI do vs prepare
by graff (Chancellor) on Aug 14, 2006 at 00:12 UTC
    I think the statement about "prepare(); loop { execute() }" being faster / more efficient than "loop { do() }" was meant to apply to this sort of comparison (untested):
    use strict; use DBI; use Benchmark qw/timethese/; my $times = shift; my @vals = ( 10734..10845 ); my $sql = "UPDATE Tests SET foo = 'bar' WHERE ID = ?"; my $dbh = DBI->connect("connection_string..."); my $sth = $dbh->prepare( $sql ); sub dbi_do { for my $val ( @vals ) { (my $dosql = $sql) =~ s/\?/'$val'/; $dbh->do( $dosql ); } } sub dbi_prep { for my $val ( @vals ) { $sth->execute( $val ); } } timethese( $times, { DbiDo => \&dbi_do, PrepExec => \&dbi_prep, });
    Of couse, that sort of demonstration also highlights a good reason, besides speed, for favoring the "prepare/execute" approach: using placeholders in the SQL statement tends to be better and easier than using quoted strings.

    (updated to fix the "use Benchmark" line)

Re: DBI do vs prepare
by jdtoronto (Prior) on Aug 14, 2006 at 14:38 UTC
    BrianC,

    There is no hard and fast rule about just how well the prepare, execute, execute ... and the do variations will work. Generally speaking the $dbh->do( ... ) will create and destroy a new statement handle for each use. Using the $dbh->prepare( .. ) will almost inevitably result in better performance, particularly on databases which do actually cache the statement. Sadly, I don't think mysql does. At least prior to MySQL 4.1.3 it did not cache prepared statements, more recently you can enable the caching of prepares by enabling the mysql_server_prepare option. This is a connection option and is detailed in the latest DBD::mysql documentation ( 3.0006 as I write this ). That being said, their is also the issue of placeholders you raised, DBD::mysql emaulates placeholders because I understand they are not directly supported by the MySQL API. In carefully crafted code there can be an advantage with MySQL, but not as great as with Oracle, for example.

    jdtoronto

      jdtoronto,

      I added the connection option, but there was no change in performance. I think the problem is with the lack of support for placeholders.

      Thanks,

      Brian

Re: DBI do vs prepare
by jeanluca (Deacon) on Aug 14, 2006 at 11:16 UTC
    Using a simple sql query like this I can imagine the performance profit of prepare might not be visible. So I would just suggest to use a very complex SQL statement. I'm sure you'll see the difference!

    Cheers
    LuCa
A reply falls below the community's threshold of quality. You may see it by logging in.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://567119]
Approved by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others browsing the Monastery: (7)
As of 2024-04-25 11:24 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found