PerlMonks  

Re^2: Multiple write locking for BerkeleyDB

by samtregar (Abbot)
on Apr 23, 2008 at 20:19 UTC (#682487)


in reply to Re: Multiple write locking for BerkeleyDB
in thread Multiple write locking for BerkeleyDB

Wow, that's a lot of bad advice for one message!

I'm amazed to see anyone offering SysV IPC as a solution these days - it was a bad choice years ago when I last used it. So many limitations, so many traps, so much pain, so DAMN SLOW!

I have no idea what mmap would gain you here aside from shared storage. This is clearly a database problem, so you might as well use one. Most likely BDB and MySQL are both using mmap for you.

And for god's sake, don't write your own DB in C!

-sam

Replies are listed 'Best First'.
Re^3: Multiple write locking for BerkeleyDB
by sgifford (Prior) on Apr 23, 2008 at 21:43 UTC
    Those are quite strong words for a post that includes no benchmarks. I put together three small test programs, one using MySQL, another using SysV semaphores and shared memory, and another using mmap and a gcc-specific atomic_add operation.

    On the machine where I ran the code MySQL could increment a counter in a MyISAM table about 2200 times/second, or in a memory table about 3500 times/second. Using SysV IPC, I could increment about 15,540 times/second, not quite 5 times faster than the memory table. Using mmap and system-specific atomic locking instructions from C++, I can increment about 9.6 million times/second, which is about 2700 times faster than a MySQL memory table.

    With 3 writers, MySQL and SysV take about 3 times as long for all 3 to finish, so they are serialized but not penalized too badly for the lock contention; the mmap+atomic_add version actually gets faster (12.2M times/second), because it can run on two processors at the same time. So there are definitely performance advantages to doing a bit of the work yourself.
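For readers who want a feel for the mmap+atomic_add approach, here is a minimal sketch. It is not the benchmark code from the post. It assumes Linux with GCC or Clang, uses an anonymous shared mapping inherited across fork() rather than a file-backed one, and uses the `__sync_fetch_and_add` builtin as a stand-in for whatever gcc-specific atomic add the post used. The name `run_atomic_counter` is hypothetical.

```cpp
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <vector>

// Hypothetical helper (not from the original post): forks `writers` child
// processes, each of which atomically adds 1 to a shared counter
// `per_writer` times. Returns the final counter value, or -1 on failure.
long run_atomic_counter(int writers, long per_writer) {
    assert(writers > 0 && per_writer >= 0);

    // Anonymous shared mapping: visible to every process forked after this.
    void *mem = mmap(nullptr, sizeof(long), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;
    long *counter = static_cast<long *>(mem);
    *counter = 0;

    std::vector<pid_t> children;
    bool fork_failed = false;
    for (int w = 0; w < writers; ++w) {
        pid_t pid = fork();
        if (pid < 0) { fork_failed = true; break; }
        if (pid == 0) {
            // Child: lock-free increment. Assumed GCC/Clang builtin.
            for (long i = 0; i < per_writer; ++i)
                __sync_fetch_and_add(counter, 1L);
            _exit(0);
        }
        children.push_back(pid);
    }

    // Parent: wait for every writer before reading the result.
    for (pid_t pid : children) waitpid(pid, nullptr, 0);

    long total = *counter;
    munmap(mem, sizeof(long));
    return fork_failed ? -1 : total;
}
```

Calling `run_atomic_counter(3, 100000)` should return 300000 if no increments are lost. No process ever blocks on a lock, which is consistent with the post's observation that this variant can get faster with more writers on a multiprocessor machine.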

    Now, if the OP's question hadn't been about performance, that probably wouldn't matter; you're right that SysV IPC is rarely used, and the MySQL code is much easier to understand and maintain. But his question was in fact about performance, and in a later post dino states specifically that the performance of MySQL was not fast enough, so advising him to use it is particularly unhelpful. Also, there are certainly more modern forms of IPC, but none that have built-in support in Perl.

    Here is the code I used. If you have anything faster, please post it along with benchmarks.

    Update: Fixed some errors in benchmarks (there was no row in MySQL, so the UPDATE statements weren't doing anything). Added another test with mmap+atomic_add. Fixed a typo.

      Holy cow, that's a lot of code for $counter++! I hate SysV IPC so, so much.

      Is it really so fast though? MySQL is storing that data on disk and it's only 4x slower! Is disk 4x slower than memory? No, it's much, much slower. SysV. IPC. Sucks.

      If you feel like running more benchmarks for me, change the MySQL one to use a MEMORY table. Then at least you won't have disk writes dragging MySQL down.

      -sam

        It really isn't any more complex, it's just that DBI and mysql wrap most of that up for you. Any synchronized solution will have to take some kind of mutex, read and write the counter, then release the mutex, which is all my code does. If you were to use strace or gdb to step through the code involved for both cases (including the mysql server), you'd find that SysV is much simpler; it's just that you have to do more of the coding yourself, because it's not as widely used.
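The take-the-mutex, read-and-write, release-the-mutex sequence described above can be sketched with a SysV semaphore and shared memory segment. This is an illustrative sketch, not the code from the benchmark. It assumes Linux, and the names `semun_arg`, `sem_adjust`, and `run_sysv_counter` are hypothetical.

```cpp
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <vector>

// Linux requires the caller to declare this union for semctl().
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

// Add `delta` to semaphore 0: -1 takes the mutex (blocking), +1 releases it.
static bool sem_adjust(int semid, short delta) {
    struct sembuf op;
    op.sem_num = 0;
    op.sem_op = delta;
    op.sem_flg = 0;
    return semop(semid, &op, 1) == 0;
}

// Hypothetical helper: `writers` forked processes each increment a shared
// counter `per_writer` times under the semaphore. Returns the final value,
// or -1 on failure.
long run_sysv_counter(int writers, long per_writer) {
    assert(writers > 0 && per_writer >= 0);

    int shmid = shmget(IPC_PRIVATE, sizeof(long), IPC_CREAT | 0600);
    if (shmid < 0) return -1;
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    void *mem = shmat(shmid, nullptr, 0);
    void *const shmat_failed = reinterpret_cast<void *>(-1);

    semun_arg arg;
    arg.val = 1;  // semaphore value 1 means "unlocked"
    bool ok = semid >= 0 && mem != shmat_failed &&
              semctl(semid, 0, SETVAL, arg) == 0;

    long *counter = static_cast<long *>(mem);
    std::vector<pid_t> children;
    if (ok) {
        *counter = 0;
        for (int w = 0; w < writers; ++w) {
            pid_t pid = fork();
            if (pid < 0) { ok = false; break; }
            if (pid == 0) {
                for (long i = 0; i < per_writer; ++i) {
                    if (!sem_adjust(semid, -1)) _exit(1);  // take mutex
                    long value = *counter;                 // read
                    *counter = value + 1;                  // write
                    if (!sem_adjust(semid, 1)) _exit(1);   // release mutex
                }
                _exit(0);
            }
            children.push_back(pid);
        }
    }

    // Wait for all writers; any abnormal exit marks the run as failed.
    for (pid_t pid : children) {
        int status = 0;
        waitpid(pid, &status, 0);
        if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) ok = false;
    }
    long total = ok ? *counter : -1;

    // SysV objects outlive the process unless removed explicitly.
    if (mem != shmat_failed) shmdt(mem);
    shmctl(shmid, IPC_RMID, nullptr);
    if (semid >= 0) semctl(semid, 0, IPC_RMID);
    return total;
}
```

Calling `run_sysv_counter(3, 2000)` should return 6000. Each increment costs two semop() system calls, which is one plausible reason this approach is so much slower than an atomic add on mapped memory.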

        As for memory tables, I had an error in my benchmark: I was running it with no rows in the table, so no updates were happening at all. I have corrected it in the original post, and also tried a MySQL memory table; SysV is about 4.7 times faster than a memory table. I also added a benchmark using mmap and an atomic add, which is much, much faster than any of the other solutions.

        And as to the amount of work MySQL and SysV are doing behind the scenes, I don't see how it matters much unless the OP needs every update written immediately to disk. Otherwise MySQL is just doing extra work that the OP doesn't need.
