Beefy Boxes and Bandwidth Generously Provided by pair Networks
laziness, impatience, and hubris
 
PerlMonks  

DBD::Oracle insert row performance

by pme (Monsignor)
on Jan 02, 2021 at 20:16 UTC ( [id://11126161]=perlquestion: print w/replies, xml ) Need Help??

pme has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

Module DBD::Oracle has serious performance degradation at inserting rows. Except execute_array (or bind_param_array), all other insert solutions are pretty slow (eg. bind by name). DBD::Oracle use otlv4 (http://otl.sourceforge.net/otl3_intro.htm) as wrapper over Oracle's OCI interface and the basic http://otl.sourceforge.net/otl3_ex10.htm is much (~10 times) faster than DBD::Oracle row based insertion. DBD::Oracle implements DATA_AT_EXEC mode with OCIBindDynamic for row based insertion (not array) as can be seen below and this block is repeated for each value.

... dbd_bind_ph(1): bind :id <== '1774684594' (type 0 (DEFAULT (varchar))) dbd_rebind_ph() (1): rebinding :id as '17746...' (not-utf8, ftype 1 (V +ARCHAR), csid 0, csform 0(0), inout 0) dbd_rebind_ph_char() (1): bind :id <== '17746...' (size 10/12/0, ptype + 3(VARCHAR), otype 1 ) Changing maxlen to 12 dbd_rebind_ph_char() (2): bind :id <== '17746...' (size 10/12, otype 1 +(VARCHAR), indp 0, at_exec 1) bind :id as ftype 1 (VARCHAR) OCIBindByName(163ad08,1675e58,1614d20,":id",placeh_len=3,value_p=1 +675a00,value_sz=12,dty=1,indp=1675e78,alenp=0,rcodep=1675e70,maxarr_l +en=0,curelep=0 (*=0),mode=DATA_AT_EXEC,2)=SUCCESS OCIBindDynamic(163a970,1614d20,1675e20,7fdf910dacc0,1675e20,7fdf91 +0daf30)=SUCCESS OCIAttrGet(163a970,OCI_HTYPE_BIND,1675e3c,0,31,1614d20)=SUCCESS dbd_rebind_ph(): bind :id <== '17746...' (in, not-utf8, csid 873->0->8 +73, ftype 1 (VARCHAR), csform 0(0)->0(0), maxlen 12, maxdata_size 0) OCIAttrSet(163a970,OCI_HTYPE_BIND, 7ffda0f7f960,0,Attr=31,1614d20) +=SUCCESS ...
Is this design intentional? OCI documentation said that OCIBindDynamic can be useful at working with big data items https://web.stanford.edu/dept/itss/docs/oracle/10gR2/appdev.102/b14250/oci05bnd.htm#i427753. C++ code using otlv4 can process blob/clob data and it does not use OCIBindDynamic at all. So, why DBD::Oracle implemented this way?

Thanks in advance

Update: details added Re^2: DBD::Oracle insert row performance

Replies are listed 'Best First'.
Re: DBD::Oracle insert row performance
by Marshall (Canon) on Jan 03, 2021 at 01:36 UTC
    When inserting to a DB, the very most important thing is the number of transactions per second. If you try to run a transaction per insert, this will just "auger into the ground" performance wise. The DB will be limited by transactions per second. You can easily insert one million rows in a single transaction.

    I have no experience with Oracle, but the basic principle I mentioned above is true for all SQL DB's.

    Update: A DB transaction leaves the HD in a known state - no matter what happens during the transaction. This requires multiple writes and reads with associated rotational hard disk delays. A million row insert may not take much longer than a single row insert. That is because of all of this "book keeping" is related to number of transactions, not the amount of data or the number of rows.

      Hi marshall,

      Thanks for your comment. This issue is definitely not transaction related. autocommit is switched off.

Re: DBD::Oracle insert row performance
by marto (Cardinal) on Jan 03, 2021 at 11:20 UTC

    Hey there, I only use Oracle when they pay me to. That said have you made any tweaks to ora_array_chunk_size or ora_check_sql from the defaults. Can you provide an example of exactly what you're doing?

      hi marto,

      Please find below my tiny SQL script, perl sample script and also C++/otlv4 sample code.

      In my environment the perl script for 10000 rows runs about 5secs for each insertion mode except the third one (testins3). testins3 calls execute_array method that uses Oracle OCI's array interface DBD::Oracle and runs in a split second. But this is a different business. The C++ code runs in a split second too. It also based on OCI calls. Of course, C++ code must be faster than its perl equivalent but not that extent. I also played with database handler parameters (see the perl code below) without any kind of success.

      Now, my consideration is that the reason of this poor performance of the perl script is unnecessary and/or inappropriate OCI calls in DBD::Oracle. Unfortunately I am not an Oracle OCI interface expert. BTW, I have never seen such an ugly database API like OCI. I try to understand how otlv4 (http://otl.sourceforge.net/otl3_intro.htm) is implemented and compare it to oci8.c and dbdimp.c in module DBD::Oracle with no success until now. otlv4 does not call OCIBindDynamic at all and use OCI_DEFAULT mode instead of DATA_AT_EXEC.

        There is a problem here:
        my $dbh = DBI->connect("dbi:Oracle:$dbname", $user, $passwd, { AutoCommit => 0, RaiseError => 1, RowCacheSize => 10000, ora_array_chunk_size => 100000, #ora_check_sql => 0, #ora_verbose => 0xff, #ora_oci_success_warn => 1, } );
        To my understanding, AutoCommit => 0; doesn't mean much. The default action will be to "commit" each insert operation no matter what this says, if you do not start a transaction explicitly.

        You need to start a transaction explicitly, insert huge amount of rows and then end the transaction explicitly. As I have suggested before, about 128K rows per commit will work fine.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://11126161]
Approved by haukex
Front-paged by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others chanting in the Monastery: (4)
As of 2024-04-24 01:39 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found