http://qs321.pair.com?node_id=11112980

perldigious has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

So I'm still having major issues with inconsistent/intermittent data fetch failures from my company's Oracle DB. I've been digging, a lot, and I may have narrowed down the issue further based on some earlier suggestions from the Monastery... but I still don't have a solid solution. I think this is an Oracle DB datatype issue, where there is a mismatch between what the DB has and DBD::Oracle is expecting, but being a newbie I'm not sure that's correct. Here is the relevant portion of the script I'm currently trying and using tracing options on to try and debug further.

use strict; use warnings; use DBI; use DBD::Oracle qw(:ora_types); print "Establishing DWASAS connection...\n"; # establish database connection and enable tracing option with dump to + 'tracelog.txt'. my $dsn = 'dbi:Oracle:DWASAS'; my @connection = ($dsn, $user, $pass, {InactiveDestroy => 1, PrintErro +r => 0, RaiseError => 1}); my $dbh = DBI->connect(@connection) or die; open(my $tracelog_fh, '>', 'tracelog.txt') or die "Cannot open \"trace +log.txt\": $!."; $dbh->trace(4, $tracelog_fh); # pepare and execute 'prepack_candidates.sql' print "Preparing 'prepack_candidates.sql'...\n"; my $sth1 = $dbh->prepare($candidates_sql) or die; print "Executing 'prepack_candidates.sql'...\n"; $sth1->execute() or die; print "Fetching 'prepack_candidates.sql' returned data...\n"; print "\n----------\n"; my $print_format = '%7s '. '%-50s '. '%11s '. '%12s' ; printf("$print_format\n", 'ROW_NBR', 'REPORT_PART_NBR', 'SHIPPED_QYT', 'COUNT_AT_QTY' ); my $row_counter = 1; while(my $row = $sth1->fetchrow_hashref) { foreach (keys(%$row)) {$row->{$_} = '' if (!defined($row->{$_}))}; + # change all NULLs to empty strings printf("$print_format\n", $row_counter, $row->{'REPORT_PART_NBR'}, $row->{'SHIPPED_QTY'}, $row->{'COUNT_AT_QTY'} ); $row_counter++; } print "----------\n";

And here is the basic SQL query it's running. And yes, it is successfully running, I can run this same query in 3 other tools: SQL Developer, Alation, and SAS (SAS modified for proc sql))

SELECT * FROM ( SELECT pd.report_part_nbr, idf.shipped_qty, COUNT(pd.report_part_nbr) as count_at_qty FROM ida_main.invoice_detail_fact idf JOIN ida_main.part_dim pd ON pd.part_dim_id = idf.part_dim_id JOIN rdb_main.in_part_header_rec iphr ON iphr.part_id = pd.part_ +id WHERE idf.invoiced_date BETWEEN '01-JAN-2019' AND '31-DEC-2019' AND idf.data_governance_ind = 'T' AND idf.shipped_qty > 0 AND pd.pack_qty > idf.shipped_qty AND pd.pack_type_code IN ('BULK', 'TUBE') AND pd.active_ind = 'T' AND iphr.flag_non_stock = 'F' GROUP BY pd.report_part_nbr, idf.shipped_qty ORDER BY count_at_qty DESC ) WHERE ROWNUM <= 10000

The failure happens during the $sth1->fetchrow_hashref loop. If I try various CASTs in the SQL it will tend to get hung up on different rows of data, but at this point rarely finishes successfully. But here's where things get interesting. The trace file, when it fails the fetch, just looks like this.

1 <> FETCH= ( [ 'REPORT_PART_NBR' 'SHIPPED_QTY' 'COUNT_AT_QTY' ] ) [ +1 items] ('NAME' from cache) at prepack_data.pl line 123 1 -> fetch for DBD::Oracle::st (DBI::st=HASH(0x3709260)~INNER) thr#2 +7c6fe8 dbd_st_fetch 3 fields... dbd_st_fetched 3 fields with status of 0(SUCCESS) field #1 with rc=0(OK) field #2 with rc=0(OK) field #3 with rc=0(OK) -> fetchrow_hashref in DBD::_::st for DBD::Oracle::st (DBI::st=HAS +H(0x370e990)~0x3709260) thr#27c6fe8 1 <> FETCH= ( [ 'REPORT_PART_NBR' 'SHIPPED_QTY' 'COUNT_AT_QTY' ] ) [ +1 items] ('NAME' from cache) at prepack_data.pl line 123 1 -> fetch for DBD::Oracle::st (DBI::st=HASH(0x3709260)~INNER) thr#2 +7c6fe8 dbd_st_fetch 3 fields... dbd_st_fetched 3 fields with status of 0(SUCCESS) field #1 with rc=0(OK) field #2 with rc=0(OK) field #3 with rc=0(OK) -> fetchrow_hashref in DBD::_::st for DBD::Oracle::st (DBI::st=HAS +H(0x370e990)~0x3709260) thr#27c6fe8 1 <> FETCH= ( [ 'REPORT_PART_NBR' 'SHIPPED_QTY' 'COUNT_AT_QTY' ] ) [ +1 items] ('NAME' from cache) at prepack_data.pl line 123 1 -> fetch for DBD::Oracle::st (DBI::st=HASH(0x3709260)~INNER) thr#2 +7c6fe8 dbd_st_fetch 3 fields... ...Fetched 0 rows OCIErrorGet after OCIStmtFetch (er1:ok): -1, 3113: ORA-03113: end- +of-file on communication channel Process ID: 60709 Session ID: 2715 Serial number: 34376 -- HandleSetErr err=3113, errstr='ORA-03113: end-of-file on commun +ication channel Process ID: 60709 Session ID: 2715 Serial number: 34376 (DBD ERROR: OCIStmtFetch)', stat +e=undef, undef !! ERROR: 3113 'ORA-03113: end-of-file on communication channel Process ID: 60709 Session ID: 2715 Serial number: 34376 (DBD ERROR: OCIStmtFetch)' (err# +1) 1 <- fetch= ( undef ) [1 items] row777 at prepack_data.pl line 123 !! ERROR: 3113 'ORA-03113: end-of-file on communication channel Process ID: 60709 Session ID: 2715 Serial number: 34376 (DBD ERROR: OCIStmtFetch)' (err# +1) <- fetchrow_hashref= ( undef ) [1 items] row777 at prepack_data.pl + line 123 1 -> FETCH for DBD::Oracle::st (DBI::st=HASH(0x3709260)~INNER 'Param +Values') thr#27c6fe8 ERROR: 3113 'ORA-03113: end-of-file on communication channel Process ID: 60709 Session ID: 2715 Serial number: 34376 (DBD ERROR: OCIStmtFetch)' (err# +1) 1 <- FETCH= ( HASH(0x375d678)0keys ) [1 items] at prepack_data.pl li +ne 123 -> DESTROY for DBD::Oracle::st (DBI::st=HASH(0x3709260)~INNER) thr +#27c6fe8 ERROR: 3113 'ORA-03113: end-of-file on communication channel Process ID: 60709 Session ID: 2715 Serial number: 34376 (DBD ERROR: OCIStmtFetch)' (err# +1) <- DESTROY= ( undef ) [1 items] -> DESTROY for DBD::Oracle::db (DBI::db=HASH(0x3709188)~INNER) thr +#27c6fe8 DESTROY DBI::db=HASH(0x3709188) skipped due to InactiveDestro +y ERROR: 3113 'ORA-03113: end-of-file on communication channel Process ID: 60709 Session ID: 2715 Serial number: 34376 (DBD ERROR: OCIStmtFetch)' (err# +0) <- DESTROY= ( undef ) [1 items]

But, way back up at the top, when it's doing the prepare statement, the datatypes don't match what I would expect. When I look at them in the DB tables shown in Alation, Alation says one is a VARCHAR2(50) and the other is a NUMBER(10). I'm not sure what the COUNT I'm doing would become, but I was guessing NUMBER of some size as well.

DBI::db=HASH(0x3709188) trace level set to 0x0/4 (DBI @ 0x0/0) in DBI +1.643-ithread (pid 15588) -> prepare for DBD::Oracle::db (DBI::db=HASH(0x3709278)~0x3709188 +'SELECT * FROM ( SELECT pd.report_part_nbr, idf.shipped_qty, COUNT(pd.report_part_nbr) as count_at_qty FROM ida_main.invoice_detail_fact idf JOIN ida_main.part_dim pd ON pd.part_dim_id = idf.part_dim_id JOIN rdb_main.in_part_header_rec iphr ON iphr.part_id = pd.part_ +id WHERE idf.invoiced_date BETWEEN '01-JAN-2019' AND '31-DEC-2019' AND idf.data_governance_ind = 'T' AND idf.shipped_qty > 0 AND pd.pack_qty > idf.shipped_qty AND pd.pack_type_code IN ('BULK', 'TUBE') AND pd.active_ind = 'T' AND iphr.flag_non_stock = 'F' GROUP BY pd.report_part_nbr, idf.shipped_qty ORDER BY count_at_qty DESC ) WHERE ROWNUM <= 10000 ') thr#27c6fe8 dbd_st_prepare'd sql SELECT ( auto_lob1, check_sql1) dbd_describe SELECT (EXPLICIT, lb 80)... Describe col #1 type=1(VARCHAR) Described col 1: dbtype 1(VARCHAR), scale 0, prec 200, nullok 1, name + REPORT_PART_NBR : dbsize 200, char_used 1, char_size 50, csid 873, csform 1( +SQLCS_IMPLICIT), disize 200 fbh 1: 'REPORT_PART_NBR' NULLable, otype 1-> 5, dbsize 200/2 +01, p200.s0 Describe col #2 type=2(NVARCHAR2) Described col 2: dbtype 2(NVARCHAR2), scale 0, prec 10, nullok 0, nam +e SHIPPED_QTY : dbsize 22, char_used 0, char_size 0, csid 0, csform 0(0), +disize 171 fbh 2: 'SHIPPED_QTY' NO null , otype 2-> 5, dbsize 22/172, p +10.s0 Describe col #3 type=2(NVARCHAR2) Described col 3: dbtype 2(NVARCHAR2), scale -127, prec 0, nullok 1, n +ame COUNT_AT_QTY : dbsize 22, char_used 0, char_size 0, csid 0, csform 0(0), +disize 171 fbh 3: 'COUNT_AT_QTY' NULLable, otype 2-> 5, dbsize 22/172, +p0.s-127 cache settings DB Handle RowCacheSize=0,Statement Handle RowCacheS +ize=0, OCI_ATTR_PREFETCH_ROWS=111, OCI_ATTR_PREFETCH_MEMORY=0, Rows p +er Fetch=111, Multiple Row Fetch=On calling OCIAttrSet OCI_ATTR_CHARSET_FORM with csform=1 (SQLCS_IMPL +ICIT) dbd_describe'd 3 columns (row bytes: 244 max, 108 est avg, cache: +0) <- prepare= ( DBI::st=HASH(0x370e990) ) [1 items] at prepack_data. +pl line 105 -> execute for DBD::Oracle::st (DBI::st=HASH(0x370e990)~0x3709260) + thr#27c6fe8 dbd_st_execute SELECT (out0, lob0)... Statement Execute Mode is 0 (DEFAULT) rs_array_init:imp_sth->rs_array_size=111, rs_array_idx=0, prefetch +_rows=0, rs_array_status=SUCCESS dbd_st_execute SELECT returned (SUCCESS, rpc0, fn4, out0) <- execute= ( '0E0' ) [1 items] at prepack_data.pl line 107 -> fetchrow_hashref in DBD::_::st for DBD::Oracle::st (DBI::st=HAS +H(0x370e990)~0x3709260) thr#27c6fe8 1 -> FETCH for DBD::Oracle::st (DBI::st=HASH(0x3709260)~INNER 'NAME' +) thr#27c6fe8 1 <- FETCH= ( [ 'REPORT_PART_NBR' 'SHIPPED_QTY' 'COUNT_AT_QTY' ] ) [ +1 items] at prepack_data.pl line 123 1 -> fetch for DBD::Oracle::st (DBI::st=HASH(0x3709260)~INNER) thr#2 +7c6fe8 dbd_st_fetch 3 fields... ...Fetched 111 rows dbd_st_fetched 3 fields with status of 0(SUCCESS) field #1 with rc=0(OK) field #2 with rc=0(OK) field #3 with rc=0(OK)

So, for those Monks that know DBI and especially DBD::Oracle well, or any Monk who has an opinion really... thoughts that may help me?

----------------------------------------------------------------------------------

UPDATE - Resolved:

Okay, so after wasting a huge amount of time over the past few weeks trying to debug all my issues from a Perl client and/or Oracle DB side, I have this resolved thanks to one of my DBAs and on of my Network Engineers at my employer. The issue had nothing to do with Perl, my script, or the Oracle DB.

The DBA, after doing some tracing on my client side and the DB side and speaking with Oracle support, was able to narrow down something in the trace log that led him to believe a firewall or something similar was killing the connection during the data fetching process. So he referred me to the Network Engineers. The Network Engineer had seen similar issues before with other things besides Perl scripts (our Software Engineers do a lot of Python scripting), so he had me try simply physically plugging in via an Ethernet cable instead of going through the wireless network. That immediately fixed ALL the issues I was seeing and all my scripts run perfectly and seamlessly over and over without fail... so far anyway (see, if nothing else I've learned to qualify). :-)

The Network Engineer explained to me that he isn't exactly sure what kills things like this since it's outside his area, but he knows there's something in my employer's wireless network chain where the connections are routed through that has extra layers of security that can be bypassed with a physically wired connection since it doesn't route the connection through the same switches (if I'm reiterating what he said accurately).

Thanks to all the Monks who tried to help me with this despite the actual problem being something completely unrelated (sorry, I don't know what I don't know). I'm going to twist this experience in my mind and try and think of it as me spending the better part of three weeks "learning new things" rather than just having dumped all that time unnecessarily in to a big black hole... it's working now in any case, so I'm happy. :-)

Just another Perl hooker - My clients appreciate that I keep my code clean but my comments dirty.