http://qs321.pair.com?node_id=93659

I've been working on a project involving Oracle and DBI. Today I ran into a problem.

Essentially, it comes down to this: should Perl cover for the mistakes of others?

Read the <BACKGROUND> section at the bottom if you want the nitty-gritty.

Today I found what the DBA's and I believe to be a bug with Oracle. I was able to work around it easily using merlyn's highlander solution, but I also installed a command-line switch so that the people in the future can turn it off. I don't think Perl should be relied on to account for an Oracle bug. What's crazy is, most of the team didn't seem interested in whether or not it was an Oracle bug. "Just fix it!!"

The irony is that, with me being a Perl guy, you would think I would rush to find a Perl solution, and that the Oracle guys would rush to find an Oracle solution. Here, it was the opposite. I was the lone voice in the room, urging that the matter be taken up with Oracle. And since the Perl solution was already in place and working, they couldn't accuse me of laziness. I just want the Right Tool to fix the Right Problem. Or maybe just the Same Tool to fix Its Own Problem.

<BACKGROUND>

There seems to be a problem with Oracle's (relatively) new EXCHANGE PARTITION feature. This feature allows a partitioned table to exchange one of its partitions with another table, like rerouting a pointer. This means that you can construct (in a temporary table on a separate machine) the data you want to use in the production database, and then swap it into the production database with nearly zero performance degradation. It's a cool feature.

So, we have a table partitioned into a bunch of sections, all of which we want to work on simultaneously. We use Perl to do all the manipulation.

There are multiple fork()ed processes that run a series of steps to manipulate the temporary tables, all eventually leading to the EXCHANGE command. The problem is this: when two tables are ready for the EXCHANGE at the same time, Oracle freezes more often than not.

Looking into the internals, it appears that the EXCHANGE locks the entire table for one exchange, at the same time as the other is trying to get one only on the partition. It's an exclusive, blocking lock, which requires the DBA person to kill the entire process if nothing happens. What I can't figure out is, why doesn't it do the exchange after it gets the lock? It just does nothing. We couldn't even execute a SELECT or DESCribe. One DBA believed it to be a bug with Oracle, and one that probably won't be fixed soon. It seems to me that any time this happens to a database, where no queries are possible, and no error is returned, it's safe to say it's something Oracle should look at.

</BACKGROUND>

Thoughts?

MM

I'm not much of a DBA, so if anyone has worked with this problem before and knows something about it, /msg me! Thanks merlyn for the highlander solution!

Replies are listed 'Best First'.
Re: Perl as a bug-shield?
by dgoddard (Initiate) on Jul 04, 2001 at 09:08 UTC
    Oracle Support will typically enter problems like this into the TAR system (Technical Assistance Requust) -- which feeds into the Oracle developers' bug database. The key is, you want to provide a minimal testcase using OCI.

    Fortunately, DBD::Oracle has a feature for exactly this purpose, that turns on OCI-level tracing. You can then trivially make a C program that sends the exact OCI calls to Oracle to demonstrate the problem. All you need to do, is put the following immediately after use DBD::Oracle; in your script:

    use DBI; use DBD::Oracle; DBI->trace(6,"trace_oci.txt");

    Try and make the smallest possible script that demos the problem, and make a SQL script that creates the minimal server-side schema required. Your DBA can then send these two files to Oracle Support.

    Think of it this way -- if someone sends you exactly what you need to be able to reproduce the problem in your own environment and on your own desktop, you're likely to fix that bug much more quickly than one where you need to install a specific version of some software which you may or may not be familiar with (imagine if you had to fix a bug in Mozilla's XPCOM.....)

    Good Luck...

Re: Perl as a bug-shield?
by cforde (Monk) on Jul 04, 2001 at 00:32 UTC
    I was a DB2 DBA in a former life and I agree with you. If the database gets itself into a twist the vendor needs to fix it. It should not be possible for an end user to cause a server outage. Plain and simple. Worst case the server should do a rollback with an error message.

    This sounds like the deadlock (two processes contending for the same resource) detection was not updated to handle this new situation. Oracle certainly needs to know about it.

    Have fun,
    Carl Forde

Re: Perl as a bug-shield?
by eejack (Hermit) on Jul 04, 2001 at 01:09 UTC
    Personally I think if you spent the dollars on a product and it doesn't do what it is supposed to do then you ought to let whomever you bought it from know.

    You should also make whatever *temporary workarounds* you need to get through the day, and perhaps you can send the vendor the temporary workaround (or a description of it) so they completely understand that you understand the problem.

    First rule is always get through the day though - so if perl can bandage a problem until Oracle fixes it properly then perl (once again) is the right tool in the right place at the right time.

    EEjack

Re: Perl as a bug-shield?
by runrig (Abbot) on Jul 04, 2001 at 04:14 UTC
    BTW, when you do call Oracle for support, DO NOT mention perl or else their answer will be "We do not support perl". I.e., reproduce the problem without using perl.

    What's crazy is, most of the team didn't seem interested in whether or not it was an Oracle bug. "Just fix it!!"

    This attitude bugs me. If it IS a bug, then they can maybe supply a patch, make the next version better, etc. Why pay money for support if you don't use it?

Re: Perl as a bug-shield?
by Abigail (Deacon) on Jul 04, 2001 at 03:31 UTC
    Your DBA believes it to be an Oracle bug? What did Oracle say when (s)he called them about the problem?

    -- Abigail

(jptxs)Re: Perl as a bug-shield?
by jptxs (Curate) on Jul 04, 2001 at 16:20 UTC
    just to second your DBA, do not expect swift action. I work for a company where I get to deal with Oracle all the time - both the vendor and the clients. most clients are not apt to adopt new features very quickly and so new features always tend to stay as buggy as day one for quite a while.

    my mentor here told me once that one should "never trust a version 1 from Oracle", and though you could say that about a lot of things, it is especially true of Oracle. Just remember that their real money is made with ERP customers who absolutely do not use the newest features...

    We speak the way we breathe. --Fugazi

Re: Perl as a bug-shield?
by Anonymous Monk on Jul 05, 2001 at 21:59 UTC
      Looking into the internals, it appears that the EXCHANGE locks the entire
        table for one exchange, at the same time as the other is trying to get
      one only on the partition. It's an exclusive, blocking lock, which requires
        the DBA person to kill the entire process if nothing happens. What I
      can't figure out is, why doesn't it do the exchange after it gets the lock?
        It just does nothing. We couldn't even execute a SELECT or DESCribe.
      One DBA believed it to be a bug with Oracle, and one that probably won't be
        fixed soon. It seems to me that any time this happens to a database,
      where no queries are possible, and no error is returned, it's safe to say
        it's something Oracle should look at. 
    

    The command name is EXCHANGE so it seems reasonable it wants to manipulate 2 Tables. Both Tables I expect to be guarded with locks. EXCLUSIVE locks! If a process has one lock, it's of little utility. It needs both. Therefore after it has one lock it tries to get the other. Only to find the other lock hold by a process, that wants to EXCHANGE too. That process in turn needs another lock to complete the EXCHANGE. You can guess which process holds that lock.