DBD::Oracle insert row performance

pme has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

Module DBD::Oracle has serious performance degradation at inserting rows. Except execute_array (or bind_param_array), all other insert solutions are pretty slow (eg. bind by name). DBD::Oracle use otlv4 (http://otl.sourceforge.net/otl3_intro.htm) as wrapper over Oracle's OCI interface and the basic http://otl.sourceforge.net/otl3_ex10.htm is much (~10 times) faster than DBD::Oracle row based insertion. DBD::Oracle implements DATA_AT_EXEC mode with OCIBindDynamic for row based insertion (not array) as can be seen below and this block is repeated for each value.

...
dbd_bind_ph(1): bind :id <== '1774684594' (type 0 (DEFAULT (varchar)))
dbd_rebind_ph() (1): rebinding :id as '17746...' (not-utf8, ftype 1 (V
+ARCHAR), csid 0, csform 0(0), inout 0)
dbd_rebind_ph_char() (1): bind :id <== '17746...' (size 10/12/0, ptype
+ 3(VARCHAR), otype 1 )
Changing maxlen to 12
dbd_rebind_ph_char() (2): bind :id <== '17746...' (size 10/12, otype 1
+(VARCHAR), indp 0, at_exec 1)
      bind :id as ftype 1 (VARCHAR)
    OCIBindByName(163ad08,1675e58,1614d20,":id",placeh_len=3,value_p=1
+675a00,value_sz=12,dty=1,indp=1675e78,alenp=0,rcodep=1675e70,maxarr_l
+en=0,curelep=0 (*=0),mode=DATA_AT_EXEC,2)=SUCCESS
    OCIBindDynamic(163a970,1614d20,1675e20,7fdf910dacc0,1675e20,7fdf91
+0daf30)=SUCCESS
    OCIAttrGet(163a970,OCI_HTYPE_BIND,1675e3c,0,31,1614d20)=SUCCESS
dbd_rebind_ph(): bind :id <== '17746...' (in, not-utf8, csid 873->0->8
+73, ftype 1 (VARCHAR), csform 0(0)->0(0), maxlen 12, maxdata_size 0)
    OCIAttrSet(163a970,OCI_HTYPE_BIND, 7ffda0f7f960,0,Attr=31,1614d20)
+=SUCCESS
...
[download]

Is this design intentional? OCI documentation said that OCIBindDynamic can be useful at working with big data items https://web.stanford.edu/dept/itss/docs/oracle/10gR2/appdev.102/b14250/oci05bnd.htm#i427753. C++ code using otlv4 can process blob/clob data and it does not use OCIBindDynamic at all. So, why DBD::Oracle implemented this way?

Thanks in advance

Update: details added Re^2: DBD::Oracle insert row performance

Comment on DBD::Oracle insert row performance Download Code

Replies are listed 'Best First'.
Re: DBD::Oracle insert row performance by Marshall (Canon) on Jan 03, 2021 at 01:36 UTC
When inserting to a DB, the very most important thing is the number of transactions per second. If you try to run a transaction per insert, this will just "auger into the ground" performance wise. The DB will be limited by transactions per second. You can easily insert one million rows in a single transaction. I have no experience with Oracle, but the basic principle I mentioned above is true for all SQL DB's. Update: A DB transaction leaves the HD in a known state - no matter what happens during the transaction. This requires multiple writes and reads with associated rotational hard disk delays. A million row insert may not take much longer than a single row insert. That is because of all of this "book keeping" is related to number of transactions, not the amount of data or the number of rows.	[reply]
Re^2: DBD::Oracle insert row performance by pme (Monsignor) on Jan 03, 2021 at 10:44 UTC
Hi marshall, Thanks for your comment. This issue is definitely not transaction related. `autocommit` is switched off.	[reply]
Re: DBD::Oracle insert row performance by marto (Cardinal) on Jan 03, 2021 at 11:20 UTC
Hey there, I only use Oracle when they pay me to. That said have you made any tweaks to `ora_array_chunk_size` or `ora_check_sql` from the defaults. Can you provide an example of exactly what you're doing?	[reply] [d/l] [select]
Re^2: DBD::Oracle insert row performance by pme (Monsignor) on Jan 03, 2021 at 23:38 UTC
hi marto, Please find below my tiny SQL script, perl sample script and also C++/otlv4 sample code. In my environment the perl script for 10000 rows runs about 5secs for each insertion mode except the third one (`testins3`). `testins3` calls `execute_array` method that uses Oracle OCI's array interface DBD::Oracle and runs in a split second. But this is a different business. The C++ code runs in a split second too. It also based on OCI calls. Of course, C++ code must be faster than its perl equivalent but not that extent. I also played with database handler parameters (see the perl code below) without any kind of success. Now, my consideration is that the reason of this poor performance of the perl script is unnecessary and/or inappropriate OCI calls in DBD::Oracle. Unfortunately I am not an Oracle OCI interface expert. BTW, I have never seen such an ugly database API like OCI. I try to understand how otlv4 (http://otl.sourceforge.net/otl3_intro.htm) is implemented and compare it to oci8.c and dbdimp.c in module DBD::Oracle with no success until now. otlv4 does not call OCIBindDynamic at all and use OCI_DEFAULT mode instead of DATA_AT_EXEC. Read more... (18 kB)	[reply] [d/l] [select]
Re^3: DBD::Oracle insert row performance by Marshall (Canon) on Jan 04, 2021 at 22:44 UTC
There is a problem here: `my $dbh = DBI->connect("dbi:Oracle:$dbname", $user, $passwd, { AutoCommit => 0, RaiseError => 1, RowCacheSize => 10000, ora_array_chunk_size => 100000, #ora_check_sql => 0, #ora_verbose => 0xff, #ora_oci_success_warn => 1, } );` [download] To my understanding, `AutoCommit => 0;` doesn't mean much. The default action will be to "commit" each insert operation no matter what this says, if you do not start a transaction explicitly. You need to start a transaction explicitly, insert huge amount of rows and then end the transaction explicitly. As I have suggested before, about 128K rows per commit will work fine.	[reply] [d/l] [select]


laziness, impatience, and hubris
	PerlMonks