
Solution, it seems.

by suaveant (Parson)
on Oct 01, 2008 at 19:14 UTC ( [id://714876]=note )

in reply to What the flock!? Concurrency issues in file writing.

I think I found the solution, but it makes me wonder whether I am doing something wacky or whether I have found a bug in the Perl or Linux (most likely Linux) IO system.

If I use both O_SYNC and O_APPEND in the sysopen, together with flock, my problems seem to go away. I need both flags, along with O_CREAT and O_WRONLY. That seems very wrong, but so far my tests look good.
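
For readers who want to see the flag combination spelled out, here is a minimal sketch of what such an open-lock-write sequence might look like. The sub name append_record, its arguments, and the 0644 mode are illustrative assumptions, not the code from the original script. It uses syswrite, so no PerlIO buffer is still holding data when the lock is released.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);   # O_* flags and LOCK_* constants

# Append one record to a shared file under an exclusive lock.
# append_record and its parameters are hypothetical names for illustration.
sub append_record {
    my ($path, $record) = @_;

    # The four flags mentioned in the post: write-only, append,
    # create if missing, and synchronous writes.
    sysopen(my $fh, $path, O_WRONLY | O_APPEND | O_CREAT | O_SYNC, 0644)
        or die "Cannot open $path: $!";

    flock($fh, LOCK_EX) or die "Cannot lock $path: $!";

    # syswrite bypasses PerlIO buffering, so the data is handed to the
    # kernel while the lock is still held.
    my $data    = "$record\n";
    my $written = syswrite($fh, $data);
    die "Write to $path failed: $!" unless defined $written;
    die "Short write to $path"      unless $written == length $data;

    # Closing the handle also releases the lock.
    close($fh) or die "Cannot close $path: $!";
    return $written;
}
```

Each writer process would call append_record($path, $line) once per record. Whether this removes the corruption in any particular setup still has to be confirmed by testing, as in the post above.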

                - Ant
                - Some of my best work - (1 2 3)

Re: Solution, it seems.
by dragonchild (Archbishop) on Oct 01, 2008 at 20:28 UTC
    DBM::Deep won't do for your needs?

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
      I am keeping away from database solutions for now for a few reasons: space (the current report is over 100 million records); speed (I've had bad luck with db stuff slowing me down because of added overhead that I don't really need); compatibility (eventually I want at least some of this to be in C, for increased performance, which makes DBM::Deep a bad choice); brevity (for the moment I am changing as little as I can to get this out quickly; we have a client waiting, and more changes mean more possible points of failure without extra testing).

      In the future I may look into some improvements; for now I want it to work. (Not to mention it's driving me nuts not knowing what is screwed up :)

                      - Ant
                      - Some of my best work - (1 2 3)

        Space, speed, and compatibility would be exactly the reasons why I would go with a database solution. OK, maybe not DBM::Deep, but using DB2 or Oracle or PostgreSQL, or even MySQL, would get you most of this. (DB2 even has compressed tables, which can do even more for both space and speed. I assume other DBMSs have something similar, but admit to a lack of experience there.) A db allows you to partition off the space without respect to your application (separately twiddlable configuration), is already written in C/C++ for speed (compressed tables should be even faster, since they move some work from the disk to the CPU, which is faster), and is compatible with Perl, C/C++, Ruby, PHP, .NET, Java, and almost anything else you'd care to think about (I'm presuming that "shell scripting" isn't one you'd care to think about).

        As for brevity: the first time I went to a db was really for this type of scenario (but smaller). I had a bunch of data (~20k records) that I needed to track, and could receive requests/updates from multiple users simultaneously (and when you're talking about a smaller data set, the chance of overlap is increased). So I started to investigate flock and all that was required to ensure no data corruption. After a bit of time playing with it (say, about an hour), I gave up and decided that I was wasting my time. I came to the realisation that I'm simply Not That Smart. Who was, I wondered? Well, I figured the guys who wrote RDBMSs had already figured this detail out. So I switched the entire thing over to DB2, and had something working in a matter of days. "But I don't have days!" you may say. Well, I say that this was my first database. I knew no SQL. By the end of it, I still couldn't have passed your average "Introduction to Relational Database Systems" university course, but I had an app that didn't lose data. And I could produce many simple reports. And others who knew more than I did could produce more extensive reports without necessarily knowing a lick of Perl. I'm assuming here that you know more SQL than I did, in which case it shouldn't take nearly as long.
