Beefy Boxes and Bandwidth Generously Provided by pair Networks
Your skill will accomplish
what the force of many cannot
 
PerlMonks  

Re^4: SELECT LAST_INSERT_ID does not work

by afoken (Chancellor)
on Jun 30, 2017 at 20:55 UTC ( [id://1193953]=note: print w/replies, xml ) Need Help??


in reply to Re^3: SELECT LAST_INSERT_ID does not work
in thread SELECT LAST_INSERT_ID does not work

My problem with LAST_INSERT_ID is that it is not well-defined. https://dev.mysql.com/doc/refman/5.7/en/getting-unique-id.html starts with a definiton and links to more information:

For LAST_INSERT_ID(), the most recently generated ID is maintained in the server on a per-connection basis. It is not changed by another client. It is not even changed if you update another AUTO_INCREMENT column with a nonmagic value (that is, a value that is not NULL and not 0). Using LAST_INSERT_ID() and AUTO_INCREMENT columns simultaneously from multiple clients is perfectly valid. Each client will receive the last inserted ID for the last statement that client executed.

So far, so good. We don't have to worry about parallel access from different clients, that piece of data is stored per connection. Inserting values different from NULL and 0 into an AUTO_INCREMENT column does not change the data. That may be a little bit surprising, but it should be easy to handle. And we get the last generated ID.

Linked from there, we can find https://dev.mysql.com/doc/refman/5.7/en/information-functions.html#function_last-insert-id. Unfortunately, it documents a significantly different behaviour:

With no argument, LAST_INSERT_ID() returns a BIGINT UNSIGNED (64-bit) value representing the first automatically generated value successfully inserted for an AUTO_INCREMENT column as a result of the most recently executed INSERT statement.

And this is even repeated in an "Important" block:

If you insert multiple rows using a single INSERT statement, LAST_INSERT_ID() returns the value generated for the first inserted row only. The reason for this is to make it possible to reproduce easily the same INSERT statement against some other server.

We get the first generated ID for an INSERT statement, not the last one. Wouldn't it be nice to have a documentation that does not contradict itself?

Also documented there:

The effect of a stored routine or trigger upon the value of LAST_INSERT_ID() that is seen by following statements depends on the kind of routine:

  • If a stored procedure executes statements that change the value of LAST_INSERT_ID(), the changed value is seen by statements that follow the procedure call.
  • For stored functions and triggers that change the value, the value is restored when the function or trigger ends, so following statements will not see a changed value.

So changes to the last ID value are persistent from stored procedures, but not from stored functions and triggers.

https://dev.mysql.com/doc/refman/5.7/en/mysql-insert-id.html, linked from the first document, documents once again that the first ID is stored, not the last one:

In the case of a multiple-row INSERT statement, mysql_insert_id() returns the first automatically generated AUTO_INCREMENT value that was successfully inserted.

But only a few lines later, the something very different is documented:

The return value varies depending on the statement used. When called after an INSERT statement:

If there is an AUTO_INCREMENT column in the table, and there were some explicit values for this column that were successfully inserted into the table, return the last of the explicit values.

When called after an INSERT ... ON DUPLICATE KEY UPDATE statement:

If there is an AUTO_INCREMENT column in the table and there were some explicit successfully inserted values or some updated values, return the last of the inserted or updated values.

So the last-insert-id value is overwritten by explicit values, according to that page.

I originally wanted to read the two other pages linked from the first document, but I'm getting sick of MySQL.


Just for fun, let's have a look at PostgreSQL:

PostgreSQL has the pseudo-types SERIAL and BIGSERIAL that really are INTEGER and BIGINT, combined with an implicit NOT NULL constraint and a DEFAULT clause referencing to an automatically created sequence: https://www.postgresql.org/docs/9.1/static/datatype-numeric.html#DATATYPE-SERIAL (This is a link to an older 9.x series documentation, but I know that mechanism is available since at least the 8.x series).

Getting the last inserted value is easy: Either explicitly query the sequence for the current value inside a transaction, or even shorter, use the extended INSERT syntax INSERT INTO table .... RETURNING idcolumn, as documented in https://www.postgresql.org/docs/9.1/static/sql-insert.html.

DBD::Pg implements last_insert_id() by querying the database for the sequence name and its current value.

The INSERT ... RETURNING way is unfortunately not portable, but it is the cleanest aproach to return an automatically generated ID that I know.


A quick look at Oracle. The documentation for last_insert_id in DBD::Oracle is very clear: Oracle does not have anything like SERIAL or AUTO_INCREMENT. You need to manually use a sequence. You can read the sequence value from within the INSERT statement, and, similar to PostgreSQL, you can extend the INSERT statement to return the inserted value (using RETURNING). An other way to implement an automatic IDs is to create a trigger on the table that overwrites the ID column with a sequence value. This was the way that I learned while learning to use Oracle. Later, I wrote (perl) code to first fetch the next value from a sequence, then use that value for an INSERT statement. That way, I knew the inserted ID even before actually inserting it. I'm quite sure that the INSERT ... RETURNING statement was not in the DBD::Oracle documentation 15 years ago, or else I would have used that instead.


MS SQL Server uses INTEGER IDENTITY to create automatic IDs, you have to read them using SELECT @@IDENTITY. Triggers may mess with the returned value. (https://docs.microsoft.com/en-us/sql/t-sql/functions/identity-transact-sql) Overall, quite similar to MySQL.


Update:

erix suggested to change the PostgreSQL documentation links from 9.1 to current, i.e. https://www.postgresql.org/docs/current/static/datatype-numeric.html#DATATYPE-SERIAL and https://www.postgresql.org/docs/current/static/sql-insert.html. Well, yes, it could be useful, but my intention was to document old behaviour, not current, even if both are currently equal (at 9.6). So I linked to 9.1, and the links will stay that way. Maybe I should have linked to 8.2 instead, so here you are: https://www.postgresql.org/docs/8.2/static/datatype-numeric.html#DATATYPE-SERIAL and https://www.postgresql.org/docs/8.2/static/sql-insert.html

Alexander

--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

Replies are listed 'Best First'.
Re^5: SELECT LAST_INSERT_ID does not work
by chacham (Prior) on Jul 01, 2017 at 19:03 UTC

    Wow, nice collection. Thank you, it was a good read. It's posts like these that i wish i could ++ twice.

    I have always, in by database bigotry, hated mysql. It does not even attempt to be correct. And now, as you have shown, documentationally consistent. Oracle's RETURNING clause is simply the best way to do it.

      Oracle's RETURNING clause is simply the best way to do it.

      I think PostgreSQL's RETURNING clause is far, far better.

      Oracle's way forces you to read the values into variables that you also have to define beforehand:

      (Oracle12 docs:) "For each expression in the RETURNING list, you must specify a corresponding type-compatible PL/SQL variable or host variable in the INTO list."

      Yuck, that's awful. That's really not handy.

      It seems the objections mentioned in this old SO thread are still valid.

      By the way, postgres has had INSERT ... RETURNING since v8.2 (2006). I rather suspect that Oracle's kludgy PL/SQL-based imitation was bolted on after that, and because of it. (Well, maybe 'kludgy' is too harsh. But the above disadvantages certainly make it imperfect compared with Postgres.)

      INSERT - Oracle

      INSERT - PostgreSQL

        By the way, postgres has had INSERT ... RETURNING since v8.2 (2006). I rather suspect that Oracle's kludgy PL/SQL-based imitation was bolted on after that, and because of it. (Well, maybe 'kludgy' is too harsh. But the above disadvantages certainly make it imperfect compared with Postgres.)

        Well, let's have a look.

        Oracle has documented everything and the kitchen sink for ages, and the answer to almost any Oracle-related question, including far more than you ever wanted to know about Larry Ellison's coffee mug, could simply be RTFM. After about 15 sec with google and some hops through Oracle's website, I found the 8.0.4 documentation CD (also available as a ZIP file), and from there, the INSERT statement documentation. The documentation has a "copyright 1997" statement, matching the information found here and here.

        Now guess what's in the INSERT statement documentation:

        The RETURNING Clause

        An INSERT statement with a RETURNING clause retrieves the rows inserted and stores them in PL/SQL variables or bind variables. Using a RETURNING clause in INSERT statements with a VALUES clause enables you to return column expressions, ROWIDs, and REFs and store them in output bind variables. You can also use INSERT with a RETURNING clause for views with single base tables.

        That was in 1997, and around that time, I had my first contact with Oracle.

        Now, according to Wikipeda, PostgreSQL was at that time somewhere between 6.0 and 6.2. I could not find PostgreSQL documentation from that time, but hopping across the versions starting at 7.1 shows no traces of RETURNING until 8.2. And 8.2 was released at the end of 2006, at least 9 years after Oracle had documented RETURNING.

        So no, sorry, Oracle did not copy that from PostgreSQL. It looks like PostgreSQL copied the idea from Oracle.

        Note: Not the implementation! PostgreSQL works like a simple SELECT, whereas Oracle requires explicit use of bind variables. (Update:) Both ways have their advantages: The PostgreSQL way of RETURNING mutates the INSERT into something that behaves like a SELECT with (big) side effects. This allows easy fetching of the return value, but breaks the assumtion found in many programs and libraries that only SELECT statements return data, and especially INSERT doesn't. Oracle's way keeps INSERT a strictly non-SELECT statement and uses only bind parameters to return data, in a way that would also work with PL/SQL code.

        And I wish I had read the fine manual a little bit more 15 years ago, that would have simplified my software quite a bit. On the other hand, my software had to support MS SQL, and that sucked (and still sucks) so much that it would not have made a big difference.


        Update:

        Oracle also has documentation for 7.3.4 for browsing and download. Browsing that shows that INSERT had no RETURNING in 7.3.4.

        Alexander

        Yet another update: Typos fixed, thanks to AnomalousMonk.

        --
        Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

        I think Oracle's version is "correct" because it completely separates what is returned into variables. Those variables are defined by the caling environment like any variables are. I do not see that as kludgy at all. Perhaps it is "pure sql" because it requires the use of a caller.

        On the other hand, i have never used PostreSQL, i just consider it as being similar to Oracle in (most) practice.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1193953]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others studying the Monastery: (7)
As of 2024-03-28 16:12 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found