I've been working on a project involving Oracle and DBI. Today I ran into a problem.

Essentially, it comes down to this: should Perl cover for the mistakes of others?

Read the <BACKGROUND> section at the bottom if you want the nitty-gritty.

Today I found what the DBA's and I believe to be a bug with Oracle. I was able to work around it easily using merlyn's highlander solution, but I also installed a command-line switch so that the people in the future can turn it off. I don't think Perl should be relied on to account for an Oracle bug. What's crazy is, most of the team didn't seem interested in whether or not it was an Oracle bug. "Just fix it!!"

The irony is that, with me being a Perl guy, you would think I would rush to find a Perl solution, and that the Oracle guys would rush to find an Oracle solution. Here, it was the opposite. I was the lone voice in the room, urging that the matter be taken up with Oracle. And since the Perl solution was already in place and working, they couldn't accuse me of laziness. I just want the Right Tool to fix the Right Problem. Or maybe just the Same Tool to fix Its Own Problem.


There seems to be a problem with Oracle's (relatively) new EXCHANGE PARTITION feature. This feature allows a partitioned table to exchange one of its partitions with another table, like rerouting a pointer. This means that you can construct (in a temporary table on a separate machine) the data you want to use in the production database, and then swap it into the production database with nearly zero performance degradation. It's a cool feature.

So, we have a table partitioned into a bunch of sections, all of which we want to work on simultaneously. We use Perl to do all the manipulation.

There are multiple fork()ed processes that run a series of steps to manipulate the temporary tables, all eventually leading to the EXCHANGE command. The problem is this: when two tables are ready for the EXCHANGE at the same time, Oracle freezes more often than not.

Looking into the internals, it appears that the EXCHANGE locks the entire table for one exchange, at the same time as the other is trying to get one only on the partition. It's an exclusive, blocking lock, which requires the DBA person to kill the entire process if nothing happens. What I can't figure out is, why doesn't it do the exchange after it gets the lock? It just does nothing. We couldn't even execute a SELECT or DESCribe. One DBA believed it to be a bug with Oracle, and one that probably won't be fixed soon. It seems to me that any time this happens to a database, where no queries are possible, and no error is returned, it's safe to say it's something Oracle should look at.




I'm not much of a DBA, so if anyone has worked with this problem before and knows something about it, /msg me! Thanks merlyn for the highlander solution!