
Re^2: copying mysql table data to oracle table

by chacham (Prior)
on Aug 25, 2017 at 13:54 UTC


in reply to Re: copying mysql table data to oracle table
in thread copying mysql table data to oracle table

While the logic is sound, it is not a good suggestion, for more than one reason.

First: you say to start a transaction and run three queries. This is not a good idea, for a couple of reasons of its own. One, another user may insert a record between the check and the insert. Two, a user may delete the record between the check and the update. In both cases the transaction will fail, causing the record to be lost, unless you lock the table with LOCK TABLE or the records with SELECT ... FOR UPDATE. Both of those are rather inefficient methods.
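To make the race concrete, here is a minimal DBI sketch of that check-then-insert pattern. The connection string and the emp(empno, ename) table shape are illustrative assumptions, not from the OP:

    use strict;
    use warnings;
    use DBI;

    # Assumed connection and schema: an emp table keyed on empno.
    my $dbh = DBI->connect('dbi:Oracle:orcl', 'scott', 'tiger',
                           { RaiseError => 1, AutoCommit => 0 });

    my ($empno, $ename) = (7369, 'SMITH');    # example values

    my ($exists) = $dbh->selectrow_array(
        'SELECT COUNT(*) FROM emp WHERE empno = ?', undef, $empno);

    # <-- another session can insert or delete this empno right here

    if ($exists) {
        $dbh->do('UPDATE emp SET ename = ? WHERE empno = ?',
                 undef, $ename, $empno);
    }
    else {
        $dbh->do('INSERT INTO emp (empno, ename) VALUES (?, ?)',
                 undef, $empno, $ename);
    }
    $dbh->commit;

The only way to close that gap within this pattern is to hold a lock first, e.g. SELECT empno FROM emp WHERE empno = ? FOR UPDATE, which serializes every session touching that row.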

Second: checking whether a record exists, for the purpose of a separate insert or update, outside of a WHERE clause, is absolute silliness. Indeed, almost any insert or update that may cause a collision should be using a WHERE clause anyway. INSERT ... WHERE NOT EXISTS (and the equivalently guarded UPDATE) is an atomic statement, and both the WHERE clause and the insert or update itself can use the cache for record checking. If the check is done in a separate statement, the blocks are likely still cached, but they may not be, because that statement is now done and over with and its blocks may have been unloaded by the database.
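As a sketch, reusing $dbh and the example values from above (the SELECT ... FROM dual carrier is how Oracle attaches a NOT EXISTS guard to an INSERT):

    # INSERT only if the row is not already there -- one atomic statement.
    $dbh->do(q{
        INSERT INTO emp (empno, ename)
        SELECT ?, ? FROM dual
        WHERE NOT EXISTS (SELECT 1 FROM emp WHERE empno = ?)
    }, undef, $empno, $ename, $empno);

    # The UPDATE needs no separate check either:
    # its WHERE clause does the record checking.
    $dbh->do('UPDATE emp SET ename = ? WHERE empno = ?',
             undef, $ename, $empno);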

Third: we're talking Oracle here, and in Oracle there is the MERGE ("upsert") statement, which does exactly what the OP wants to do.
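A minimal MERGE against the same hypothetical table, again reusing $dbh and the example values:

    # One round trip: update the row if it exists, insert it if it doesn't.
    $dbh->do(q{
        MERGE INTO emp t
        USING (SELECT ? AS empno, ? AS ename FROM dual) s
        ON (t.empno = s.empno)
        WHEN MATCHED THEN
            UPDATE SET t.ename = s.ename
        WHEN NOT MATCHED THEN
            INSERT (empno, ename) VALUES (s.empno, s.ename)
    }, undef, $empno, $ename);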

Fourth: your suggestion includes using non-SQL checking, which means either Perl or PL/SQL, and that means invoking an engine (Perl or PL/SQL) outside of SQL, which is not the most efficient method. Since this job requires SQL anyway (well, unless a data-loading tool is used, but I do not think that is one of the options here) and can be done completely in SQL, using anything other than SQL (even PL/SQL!) is redundant and slower.

Fifth: committing the transaction after each record is not only redundant, it will slow down the entire operation, and removes the ability to roll back more than one record.
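For instance, keeping AutoCommit off (as in the connection above) and committing once at the end keeps the whole import atomic; load_all_rows() here is a hypothetical stand-in for whatever does the inserts:

    # Nothing is visible to other sessions until the single commit,
    # and the entire import can still be rolled back as one unit.
    eval {
        load_all_rows($dbh);    # hypothetical loader doing the inserts
        $dbh->commit;
        1;
    } or do {
        $dbh->rollback;
        die "import failed, nothing applied: $@";
    };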

Sixth: even if this approach were used, a more efficient method would be a bulk insert.
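With DBI that is execute_array, which sends the rows in batches; a sketch against the same hypothetical emp table, assuming @empnos and @enames were already fetched from the MySQL side. The ArrayTupleStatus attribute also answers the "which records failed?" question raised in the replies below:

    # Bulk insert: one prepared statement, all rows in one call.
    my $sth = $dbh->prepare('INSERT INTO emp (empno, ename) VALUES (?, ?)');

    my @tuple_status;
    my $inserted = eval {                  # eval: RaiseError is on
        $sth->execute_array(
            { ArrayTupleStatus => \@tuple_status },
            \@empnos,    # array ref of empno values
            \@enames,    # array ref of ename values, same length
        );
    };
    $dbh->commit if $inserted;

    # Each element of @tuple_status is 1 for a row that went in, or an
    # [err, errstr, state] array ref for a row that failed.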

Anyway, the logic in your post is how a programmer might think. But this is a database, where we work with sets. Think set-based logic, not record-based: going record by record is just plain inefficient.


Re^3: copying mysql table data to oracle table
by Anonymous Monk on Aug 25, 2017 at 14:49 UTC
    In both cases the transaction will fail, causing the record to be lost...
    Everybody knows that when a transaction fails, you restart it from the beginning. You don't just drop your change on the floor. That would be silly.
    ...your suggestion includes using non-SQL checking...
    So you're imagining that we can put the entire update in one big blob of SQL and run it without checking for any error messages? That seems... optimistic. But at least we can agree that using a perl-based cache "to speed things up" is a Bad Idea, right?
    committing the transaction after each record is not only redundant, it will slow down the entire operation, and removes the ability to roll back more than one record.
    So correct me if I'm wrong, but if you hold too many row locks on a table, doesn't Oracle automatically upgrade it to a table lock? And isn't holding a table lock for the duration of the import operation potentially disruptive to a busy database? Maybe we need more context than the OP has provided.
    in Oracle there is the MERGE ("upsert") statement, which does exactly what the OP wants to do.
    I thought Oracle might have added such a feature, but I didn't know what it was called. Thanks for pointing this out.

      Everybody knows that when a transaction fails, you restart it from the beginning. You don't just drop your change on the floor. That would be silly.

      The point is, there's nothing you can do about it, since you haven't locked the table or the records.

      So you're imagining that we can put the entire update in one big blob of SQL and run it without checking for any error messages? That seems... optimistic. But at least we can agree that using a perl-based cache "to speed things up" is a Bad Idea, right?

      If the statement is INSERT ... WHERE NOT EXISTS, there should be no errors, unless something unrelated (like running out of space) crops up, in which case you do not want to handle the error automatically anyway.

      The cache I referred to is the database cache, where table blocks are often kept for use by future statements.

      if you hold too many row locks on a table, doesn't Oracle automatically upgrade it to a table lock? And isn't holding a table lock for the duration of the import operation potentially disruptive to a busy database? Maybe we need more context than the OP has provided.

      If you lock a table with LOCK TABLE, or records with SELECT ... FOR UPDATE, IIRC, no one else can touch those records. During an ordinary transaction, however, other users read a consistent view built from undo. So, it's a bit different.

      We should not need more context, since this is a pretty standard import operation.

        The point is, there's nothing you can do about it, since you haven't locked the table or the records.
        I don't understand your statement at all. If your transaction fails, you go back to the beginning, meaning you start a new transaction, fetch the record again, and decide what to do about it. Maybe the intervening update obviated the need for your change, and maybe it didn't. But the point is, that's what you can do about failed transactions. You just retry them.
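        A retry loop is only a few lines; a sketch, assuming $dbh has RaiseError on and AutoCommit off, with apply_change() as a hypothetical stand-in for the re-fetch-and-decide step:

            my $tries = 3;
            while ($tries--) {
                my $ok = eval {
                    apply_change($dbh);   # hypothetical: re-fetch, decide, write
                    $dbh->commit;
                    1;
                };
                last if $ok;
                $dbh->rollback;           # back to a clean slate, then retry
                die "giving up: $@" unless $tries;
            }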
        The cache I referred to is the database cache, where table blocks are often kept for use by future statements.
        No, I'm talking about the OP's original idea of starting off with a "select * from emp" and throwing the whole thing in a big perl hash for later reference. Bad Idea, right? Right?
        If the statement is INSERT ... WHERE NOT EXISTS, there should be no errors...
        There could be any sort of column constraint violation, like an out-of-range value or a missing foreign key. You at least want to be able to tell the user, "here are the records that didn't get updated."
        We should not need more context...
        The context we need is: can we just lock the table and prevent anybody else from making any updates until we're done, or is that going to cheese too many people off?
