Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much
 
PerlMonks  

Concurrency control in web applications

by cbraga (Pilgrim)
on Oct 24, 2003 at 16:19 UTC ( [id://301887]=perlmeditation: print w/replies, xml ) Need Help??

A problem which often arises when working with database applications is that of concurrency control, IOW, preventing race conditions such as the classic: you load a user's info page, someone else loads it, you change his name and save it, the other person changes his telephone and save it, losing the change you made.

"Traditional" two-tier applications which run the app logic and interface on the user's computer and use a database for storage had an easy way out: open a transaction when the user's record is opened and keep it locked for updates. However, in three and four tier web applications that is not feasible. A few alternatives arise:

— Keep track of records being edited with an "in use" field recording who's editing it. Problems arise in cases the editor goes away or waits too long to unlock.

— Add a version counter to each record, incremented in each update, and have your update fail (... AND vcounter=10 ...) if someone else changed it while you weren't looking. Simple and effective.

— Keep a change history in the form of a table containing records of all changes to all tables for a given time frame, and when a race condition is detected then reconcile the changes so that none's lost. Best approach, but requires a lot of work.

So, those are my ideas. Anyone got another approach?

P.S.: Saying that you should only update fields that actually changed, while a valid approach, doesn't count since it doesn't solve the problem. Two people could be updating a credit or stock total, for instance.

<sig>ESC[78;89;13p ESC[110;121;13p</sig>
  • Comment on Concurrency control in web applications

Replies are listed 'Best First'.
Re: Concurrency control in web applications
by perrin (Chancellor) on Oct 24, 2003 at 17:39 UTC
    Most web apps don't bother trying to solve this problem because it doesn't matter to them. If a user opens two windows to edit his own phone number and saves in both of them, he probably expects the last save to win, and that's what happens. In the rare case where this does matter, people generally use a couple of standard approaches. The first is optomistic locking, which your version counter is an example of. (You can also use the previous state as a WHERE clause in your update, which accomplishes the same thing without needing the counter field.) The second is pessimistic locking, which your "in use" field is an example of. I've never heard of any other approach that didn't sound like overkill.
Re: Concurrency control in web applications
by hossman (Prior) on Oct 24, 2003 at 17:51 UTC
    — Keep track of records being edited with an "in use" field recording who's editing it. Problems arise in cases the editor goes away or waits too long to unlock.

    This approach works well, but the problems you describe usually arise because you've oversimplified it.

    What i've seen used highly successfully, is to store the following fields for each editable object in your datamodel:

    • lockUser
    • lockDate
    • modUser
    • modDate

    By default, these should all be null. when a user "checks out" an item to edit it, you set the first two values. When a user wants to make an update, you verify that their userId matches "lockUser", and update both modUser and modDate. when they "check in" the object after making their changes, you delete both lockDate and lockUser

    The added power that this has over a simple "in use" field showing you who is currently editing, is that now you can add features in your app to allow people to "break locks" if they are over a certain age, or only if the modDate is more then N minutes ago. (so it doesn't matter if the person who locked it goes home for the night, or gets hit by a bus, other people can steal the lock)

    Of course, lock breaking should be limited by access control.

Re: Concurrency control in web applications
by pg (Canon) on Oct 24, 2003 at 18:47 UTC

    Another simple but useful way is to have an artificial column in the table, which you have locking concern. You may call it something like “mod_count” or whatever makes sense to you. At the beginning of a transaction, you read the mod_count, and remember its value. At the end of the transaction, when it comes to the point you update the database, you lock, and then check the mod_count, see whether it is still the number you got. If the mod_count is different now, fail the transaction (assume you don’t want accidental overwrite), otherwise, do the update, also increase mod_count by 1 in the same update statement.

    By doing this, you don't need to lock the table thru out the transaction, but only briefly at the end.

      Another simple but useful way is to have an artificial column in the table ... call it something like “mod_count”

      I think that's what the OP meant by "Add a version counter to each record," also known as "optomistic locking."

Re: Concurrency control in web applications
by etcshadow (Priest) on Oct 25, 2003 at 02:34 UTC
    First of all, I understand your question, but I'd still say that in most cases it is not worth the bother of addressing. However, it is sometimes worth it, or even necessary.

    What I've done, myself, to address this problem is actually to embed the old values in the form as hidden inputs, and then compare them to what is in the DB on submit. Then prompt the user with a sort of diff3-like opportunity to resolve conflicting edits / approve merged edits. This (mind, this is sort of psuedo-code-like, but I think it conveys the idea):

    • addresses the problem
    • doesn't require any kind of cross-form-submission row locking, which would be a real bad idea
    • avoids the problem of zero-diff changes interfering with meaningful changes
    • puts the onus of resolution on the slow-guy

    It's actually very similar, algorithmicaly, to how revision control software (like cvs) works... but it's not actually using revision control software, as that is actually targetted towards large text files, not database forms. It goes something like this:

    # in the form HTML generating code... my %row = # fetch row from table into a hash foreach my $col (keys %row) { # actually properly escape all of this stuff... I'm just writing it + out sort of short-hand print "<input type=hidden name=original_$col value=$row{$col}>"; } # in the form processing code... assuming that your form values are p +resent in a hash called %form my %row = # fetch row from table into a hash my (%yours,%theirs); # store colums which were updated by you, and col +umns which were updated by some intervening session foreach my $col (map {/^original_(.*)/ ? $1 : ()} keys %form) { if ($form{"original_$col"} ne $row{$col} { # oh no! this column was updated by some intervening session $theirs{$col} = $form{"original_$col"}; } if ($form{$col} ne $row{$col} { # this is a column that *you* edited $yours{$col} = $form{$col}; } } if (!%theirs) { # there were no intervening edits... process the form! DoThing(\%form); } else { # there were edits maid by another user between the time you downlo +aded the form and when you posted the form back print "CONFLICTS!"; # etc let the user know what the deal is }

    And then re-display the form, marking it up appropriately. For example, color the inputs in only %yours as blue, the ones in only %theirs as yellow, and for the ones in both %yours and %theirs... make it green if $yours{$col} eq $theirs{$col} (you made it the same thing... yellow and blue make green!), and make it red if $yours{$col} ne $theirs{$col} (you made conflicting changes... red = bad, warning!). If you want to be super cool... you can even pass the values contained in any textareas through to diff3 (or merge)!

    Seriously... I've done this... it's super cool. :-D


    ------------
    :Wq
    Not an editor command: Wq
Re: Concurrency control in web applications
by dragonchild (Archbishop) on Oct 24, 2003 at 23:51 UTC
    I'm not sure what the problem you're trying to solve is. Let's assume your situation. The mistake is that the page is updating every field. Why isn't it just updating the fields that have changed? It's very simple Javascript to keep track of that, and JS is just the mechanism I've seen used where this was an issue. So, you have the following possibilities:
    • A updates the name and B updates the date. No collision.
    • A and B both update the name. The later one wins.

    So, we've reduced collisions to those fields that are being changed together. The way I've seen this solved is to keep a history of all changes to all business-related fields. I'm not talking about an UPDATED_ON field for the table. I'm talking about a real history table, along the lines of:

    ID TABLE_CHANGED COLUMN_CHANGED ACTION OLD_VALUE NEW_VALUE WHEN BY_WHOM

    Then, you have a history page, where you can see what changed, when, and by whom. This history table solves this problem, and a host of other problems, plus provides capabilities like rollbacks and the like. Now, I wouldn't suggest this solution for all situations, but it definitely seems warranted for things like user information. Remember, not everything should have a technology solution, especially when it's a human problem.

    ------
    We are the carpenters and bricklayers of the Information Age.

    The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

    ... strings and arrays will suffice. As they are easily available as native data types in any sane language, ... - blokhead, speaking on evolutionary algorithms

    Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

Re: Concurrency control in web applications
by Juerd (Abbot) on Oct 24, 2003 at 16:45 UTC

    The easy way out is to make your program use CVS (or something alike) and let that handle this for you. You end up with a very specific, but fancy front-end and you can market it as "CVS integration" :)

    Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }

      CVS might work for plaintext files, but isn't a good idea for use with an DBMS.

      ----
      I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
      -- Schemer

      :(){ :|:&};:

      Note: All code is untested, unless otherwise stated

        CVS might work for plaintext files, but isn't a good idea for use with an DBMS.

        "or something alike" - this includes re-inventing the wheel.

        Note though that combining CVS with a DBMS is possible and not always a bad idea.

        Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }

Re: Concurrency control in web applications
by mpeppler (Vicar) on Oct 24, 2003 at 21:37 UTC
    Sybase and MS-SQL have a datatype called "timestamp" (which isn't time related at all) that can be used for this. It is an 8 byte binary column that gets updated each time the row is updated, so you can easily check if the data that the user has been editing was changed when applying the update. This is essentially the same as keeping track of changes with a counter, but happens automatically (and the timestamp column can't be modified manually).

    The alternative is as others have suggested to implement your own locking mechanism via a secondary table. Be sure to include a time when the lock was acquired so that you can force the release of stale locks.

    Michael

      MySQL also has such a field type, but only the first timestamp field in a record that is not updated or has the value NULL assigned, will actually be automatically set. And you can assign it also.

      It seems however that this feature is subject to change, so you might want to check out to make sure. From the MySQL documentation:

      The TIMESTAMP column type has varying properties and behaviour, depending on the MySQL version and the SQL mode the server is running in.

      TIMESTAMP behaviour when running in SAPDB mode
      When MySQL is running in SAPDB mode, TIMESTAMP behaves like DATETIME. No automatic updating of TIMESTAMP columns occurs, as described in the following paragraphs. MySQL can be run in SAPDB mode as of version 4.1.1. See section 4.1.1 mysqld Command-line Options.

      TIMESTAMP behaviour when not running in SAPDB mode
      The TIMESTAMP column type provides a type that you can use to automatically mark INSERT or UPDATE operations with the current date and time. If you have multiple TIMESTAMP columns, only the first one is updated automatically.

      Liz

      In addition to the points liz mentioned timestamp columns only have a one second resolution - so if you do a read/update in a smaller interval you're out of luck.

      Personally I'd just use a counter.

Re: Concurrency control in web applications
by Art_XIV (Hermit) on Oct 24, 2003 at 20:08 UTC

    Why do you consider connection-based transactions to be unfeasable for n-tier applications?

    I've used them for n-tier applications (not Perl-based ones), but this was before I found out that they might be unfeasable. ;)

    Is a real issue with Perl or the DBI modules? Or am I wayyyy out of sync, here?

      In a web application, you can't use transactions across multiple HTTP requests. You have no way of knowing if the user is even still there. And every edit involves at least two requests: one to get the form with the data being edited, and the second to submit the changes. You can only use a transaction later when processing the submitted form, at which point you may be clobbering someone else's edit.
        I once interviewed one developer and he told me about the project he was worked in the past where they handled exactly this problem. They used separate backend daemon process which would maintain persistent database connections each of them tied to individual user session. This way it was no brainer to use transactions and locks across multiple HTTP requests.

        --
        Ilya Martynov, ilya@iponweb.net
        CTO IPonWEB (UK) Ltd
        Quality Perl Programming and Unix Support UK managed @ offshore prices - http://www.iponweb.net
        Personal website - http://martynov.org

Re: Concurrency control in web applications
by hardburn (Abbot) on Oct 24, 2003 at 18:02 UTC

    Under MySQL, you can use InnoDB tables, which support row-level locking. See http://www.innodb.com/ibman.html#Locking_reads. I'm sure other databases support this, too.

    ----
    I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
    -- Schemer

    :(){ :|:&};:

    Note: All code is untested, unless otherwise stated

      In a web application this is no help. You can't just lock the rows whenever you send out an editing form, because when the user submits that form the request might go to a totally different server process, or the user might simply shut down their browser and leave the rows locked forever.
Re: Concurrency control in web applications
by thraxil (Prior) on Oct 29, 2003 at 02:25 UTC

    Red Hat's Enterprise CMS actually takes your third strategy to an almost absurd extreme. it's probably complete overkill for most of us, but kinda cool. :)

    i went to a talk that Rafael Schloming gave on "Versioning Structured Content" where he explained Red Hat's approach. the idea is to basically keep a log of every single event at a very fine level of granularity and to ensure that every low level operation is invertible (ie, for every add, there is a corresponding delete, etc.). then, to revert to an older version, or another branch, you just construct a stream of events that will get you there. this is basically the same as how RDBMSes handle transactions internally. the only difference is that when a transaction is rolled back in an RDBMS, it can dump the log info. in a CMS, you potentially have to store everything as branches.

    his slides are online.

      Gee, I never cease to be amazed by the extremes to which some people take their stuff. Thanks.

      ESC[78;89;13p ESC[110;121;13p

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlmeditation [id://301887]
Front-paged by diotalevi
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others imbibing at the Monastery: (4)
As of 2024-04-25 16:43 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found