PerlMonks  

Append to a busy flat-file db without leaving customer in lurch

by davebaker (Pilgrim)
on Jan 22, 2006 at 21:33 UTC ( [id://524840]=perlquestion )

davebaker has asked for the wisdom of the Perl Monks concerning the following question:

Hello, monks! I have built much of my living around the help wanted ads on my web site. Hence it is important that an incoming help wanted ad actually get recorded into the flat-file database I'm using.

Most of the time life is good. Alas, sometimes the database seems to be too busy to allow the incoming data to be written, meaning the web server times out after the employer/customer has clicked the "submit" button.

Clicking the submit button basically runs this code in a Perl script:

open my $dbh, ">>", $database_file_w_path
    or die "Trouble opening $database_file_w_path, stopped: $!";
flock $dbh, LOCK_EX
    or die "Can't get LOCK_EX flock for $database_file_w_path, stopped: $!";
print {$dbh} "$database_row\n";
close $dbh
    or die "Trouble closing $database_file_w_path, stopped: $!";

I'm wondering if anybody can suggest improvements. When the flock fails and the script dies, currently the customer just sees a blank screen. I can modify the script to spit out some HTML to notify the customer that the site is too busy, but that's not too much improvement in terms of getting money on the table. Does anybody use a loop after such a flock fails initially, to try a second or third time and notify the customer that the order is still being processed?

Perhaps the only answer is to go to a relational DB?

I have tried to get the database "out of the way" on the read side, by having the other script (which reads and displays the content of particular ads) open the database, slurp all the lines into @lines, and then close the database "right away," rather than having a big loop in the fashion of "while (my $line = <$file_handle>) {"

Ya think that makes a difference in preventing flock failures on the write-side?

Many thanks!

Replies are listed 'Best First'.
Re: Append to a busy flat-file db without leaving customer in lurch
by Tanktalus (Canon) on Jan 22, 2006 at 22:03 UTC

    Honestly, I would highly suggest an ACID-compliant db for anything where things are critical. Flat-files probably can be made ACID-compliant. But I wouldn't want to write the code to handle it.

    Next best thing is MySQL (I realise that the default db isn't ACID-compliant, but I think there's an option to create a different type of db that is). Or PostgreSQL. Or ... (before anyone complains I'm anti-their-favourite-free-db, I'll say that the only one I have personal experience with is still IBM DB2).

    I started writing a website for work once where I was trying to figure out exactly the same problem you're having: worrying about concurrent updates to my data storage. My manager explicitly told me not to use a database, worrying about the overhead. I ignored him once I got to the concurrency issue. I switched everything over to DB2. It saved my sanity (or what's left of it). Not that it was DB2 per se, but that it was an ACID-compliant server which took care of these nitty gritties and that I pretty much didn't have to worry about them anymore.

    I can never recommend enough the use of the right tool for the job. And storing data is generally best left to database software. Those guys are way smarter than I am about these things, and, though I may be arrogant, I'm not that arrogant - not enough to think I can do in perl in a matter of hours or days what it took them years and years to do in C. (Perl is faster than C for development - but not that faster ;->)

Re: Append to a busy flat-file db without leaving customer in lurch
by ambrus (Abbot) on Jan 22, 2006 at 22:11 UTC

    You could try creating multiple flat files.

    When a request arrives, you only open and lock one of the flat database files, either in a random manner, or by some deterministic way (such as the first letter of the customer). Of course, you then have to modify the programs reading the files to read all of them.
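
A minimal sketch of the deterministic variant described above, assuming the shard is picked from the first character of a customer key. The helper names `shard_path` and `append_row`, the `ads<N>.dat` file naming, and the shard count are all hypothetical, not anything from the original script.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper: map the first character of $key onto one of
# $shards files, so writers are spread across several locks.
sub shard_path {
    my ($dir, $key, $shards) = @_;
    my $n = ord(lc substr($key, 0, 1)) % $shards;
    return "$dir/ads$n.dat";
}

# Append one row to the shard chosen for $key, under an exclusive lock.
# Returns the path that was written to.
sub append_row {
    my ($dir, $key, $row, $shards) = @_;
    my $path = shard_path($dir, $key, $shards);
    open my $fh, '>>', $path or die "Trouble opening $path: $!";
    flock $fh, LOCK_EX       or die "Can't lock $path: $!";
    print {$fh} "$row\n"     or die "Trouble writing $path: $!";
    close $fh                or die "Trouble closing $path: $!";
    return $path;
}
```

Two writers only wait on each other when their keys land on the same shard. The reading script would then need to loop over every `ads<N>.dat` file.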

Re: Append to a busy flat-file db without leaving customer in lurch
by TedPride (Priest) on Jan 22, 2006 at 22:57 UTC
    Do you need the ads to be displayed in real time? Like ambrus says, you could append ads to a randomly selected file in a certain range, like newads1.dat through newads10.dat, then set up a cron job to run a utility every so often that goes through the files, merges the records, appends them to the master file, and deletes the newads files (or renames and moves them, just in case). This would reduce the flock contention (just increase the range on the file numbers until you no longer are losing ads), and would also prevent you having to open each of the separate files every time you wanted to display ads. Even the cron job wouldn't halt your ads, since you wouldn't need all the files locked simultaneously, just the one you were currently loading.
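
The merge utility could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `merge_spools` is hypothetical, the spool files are truncated rather than deleted or renamed (so a writer that already has one open still appends to a live file), and the master file stays locked for the duration of the run.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use IO::Handle;

# Move every row from the spool files into the master file.
# Only one spool file is locked at a time, so writers can keep using the others.
# Returns the number of rows moved.
sub merge_spools {
    my ($master, @spools) = @_;
    my $moved = 0;
    open my $out, '>>', $master or die "Trouble opening $master: $!";
    flock $out, LOCK_EX         or die "Can't lock $master: $!";
    for my $spool (@spools) {
        next unless -s $spool;                 # skip missing or empty files
        open my $in, '+<', $spool or die "Trouble opening $spool: $!";
        flock $in, LOCK_EX        or die "Can't lock $spool: $!";
        my @rows = <$in>;
        if (@rows) {
            print {$out} @rows or die "Trouble writing $master: $!";
            $out->flush        or die "Trouble flushing $master: $!";
        }
        # Empty the spool only after its rows have been handed to the master.
        truncate $in, 0 or die "Trouble truncating $spool: $!";
        close $in       or die "Trouble closing $spool: $!";
        $moved += @rows;
    }
    close $out or die "Trouble closing $master: $!";
    return $moved;
}
```

A cron entry would call something like `merge_spools('master.dat', glob 'newads*.dat')`; those file names are taken from the example range above.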

    I hope the ads don't need to be editable, once submitted?

      Yikes, I had forgotten about the extra amount of locking that is generated by users who are editing their entries... I do in fact allow them to edit most of the text of the ad.
Re: Append to a busy flat-file db without leaving customer in lurch
by hubb0r (Pilgrim) on Jan 23, 2006 at 04:24 UTC

    I'll second or even third the recommendations put forth so far, and suggest going to an RDBMS. The extra effort involved in writing an SQL query is definitely worth it in reliability, ease of use, and most of all user experience.

    For simple things like what you are doing, I'm sure that some of the DB Abstraction modules will do everything you need and more.

    I used SQLite.pm for the first time the other day for a quick and dirty keyed table for fast inserts and lookups, and it is blazingly fast, fully ACID compliant (I think?) and best of all it is all in one self-contained module.

    It's definitely worth trying out, and easier for small projects than setting up a mysql/postgres instance to deal with one small transaction.

      i too recommend SQLite, a complete (SQL92 compliant) database stored in a single disk file. i use it in projects where a full RDBMS server would be overkill (i.e. most of my work :) and network access to the DB is not required.

      SQLite was the first RDBMS i used and my first introduction to SQL. i'd say it is relatively easy to learn. you definitely want to use version 3 with the excellent Perl module DBD::SQLite.

      your first concern might be the integrity of data. i didn't have any problems with that using SQLite, but you might want to read about how SQLite3 takes care of this problem.

      :)))))
        I started working with SQLite lately, and found that it very nicely fits its niche and is available for lots of platforms. However I find the naming of the perl module confusing.

        The name for the module that was compatible with version 2 of the database is DBD::SQLite2, while the name for version 3 is DBD::SQLite and not DBD::SQLite3 as one would suspect. It took me a while to figure that out so I thought I'd point it out.


        holli, /regexed monk/
Re: Append to a busy flat-file db without leaving customer in lurch
by Fletch (Bishop) on Jan 23, 2006 at 00:33 UTC

    Seconding the RDBMS recommendations, but if your heart's set on flat files and you can swing it you might look at a scheme akin to the way maildir does things.
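
For readers unfamiliar with maildir, the idea is that each record goes into its own uniquely named file under `tmp/` and is then renamed into `new/`, so no lock is taken at all. The sketch below is a loose adaptation to ads, not the maildir specification itself; the function name `deliver_ad`, the file-name layout, and the assumption that `tmp/` and `new/` already exist on the same filesystem are all illustrative.

```perl
use strict;
use warnings;
use File::Spec;
use Sys::Hostname qw(hostname);

my $counter = 0;   # separates several deliveries by one process within one second

# Write $row to its own file under $base/tmp, then rename it into $base/new.
# rename() within one filesystem is atomic, so a reader scanning new/ never
# sees a half-written ad. Returns the final path.
sub deliver_ad {
    my ($base, $row) = @_;
    my $name = join '.', time, 'P' . $$ . 'Q' . $counter++, hostname();
    my $tmp  = File::Spec->catfile($base, 'tmp', $name);
    my $new  = File::Spec->catfile($base, 'new', $name);
    open my $fh, '>', $tmp or die "Trouble opening $tmp: $!";
    print {$fh} "$row\n"   or die "Trouble writing $tmp: $!";
    close $fh              or die "Trouble closing $tmp: $!";
    rename $tmp, $new      or die "Trouble renaming $tmp to $new: $!";
    return $new;
}
```

Because every ad lives in its own file, a customer editing an ad touches only that file, which also sidesteps the editing concern raised earlier in the thread.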

Re: Append to a busy flat-file db without leaving customer in lurch
by chrism01 (Friar) on Jan 22, 2006 at 23:49 UTC
    If you really want to stick to flat files, how about using a tied hash and leaving the file permanently open?
    OTOH, if it's getting that busy (successful :-) ), go for a DB eg MySQL using the InnoDB engine; ensure ver is at least 4.1
    Cheers
    Chris
Re: Append to a busy flat-file db without leaving customer in lurch
by bluto (Curate) on Jan 23, 2006 at 16:37 UTC
    Some suggestions in case you want to use flat files...

    Find out why flock is failing. I'm pretty sure on most platforms that it blocks until the lock occurs (unless you use LOCK_NB). What is '$!' when it dies?

    If you don't need updates to show up immediately in the database, defer them. For example, write your output to a separate file and continue on. Then have a single process periodically come along and read these other files and append to the database.

    Depending on your platform and filesystem (check your docs), you *may* be able to get away with no lock at all while opening a file in append mode. Some OSs arbitrate appenders for you if you are writing a single line of text since it's a very common operation (esp for log files). FWIW, I personally wouldn't do this since then the code relies on a specific setup and it is hard to test this to make sure it's working, but YMMV. Update: After thinking about this, it probably also depends on you printing only small, single lines of text at a time. Another reason to avoid it.

    Since you care that the 'print' works, check the error code for it. If you run out of disk space, your code can silently fail.
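
Two of the points above (flock blocks unless LOCK_NB is given, and print should be checked) can be combined into a retry loop of the kind the original question asks about. This is a sketch under assumptions: the helper names `lock_with_retry` and `append_row`, the default of three tries, and the one-second wait are all made up for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Try a non-blocking exclusive lock up to $tries times, sleeping $wait
# seconds between attempts. Returns 1 once the lock is held, 0 otherwise.
sub lock_with_retry {
    my ($fh, $tries, $wait) = @_;
    for my $attempt (1 .. $tries) {
        return 1 if flock($fh, LOCK_EX | LOCK_NB);
        sleep $wait if $attempt < $tries;
    }
    return 0;
}

# Append one row. Returns 1 on success, 0 if the file stayed locked,
# and dies on open, write, or close errors (for example a full disk).
sub append_row {
    my ($path, $row, $tries, $wait) = @_;
    $tries //= 3;
    $wait  //= 1;
    open my $fh, '>>', $path or die "Trouble opening $path: $!";
    unless (lock_with_retry($fh, $tries, $wait)) {
        close $fh;
        return 0;              # caller can show a "still busy" page
    }
    print {$fh} "$row\n" or die "Trouble writing $path: $!";
    close $fh            or die "Trouble closing $path: $!";   # buffered write errors surface here
    return 1;
}
```

The CGI script could call `append_row($database_file_w_path, $database_row)` and, on a false return, print an HTML page asking the customer to resubmit instead of dying with a blank screen.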

Re: Append to a busy flat-file db without leaving customer in lurch
by glasswalk3r (Friar) on Jan 23, 2006 at 12:02 UTC

    You should take a look at the TDB project, but I'm afraid to tell you that the last time I tried the Perl module to access TDB databases it just didn't compile.

    Alceu Rodrigues de Freitas Junior
    ---------------------------------
    "You have enemies? Good. That means you've stood up for something, sometime in your life." - Sir Winston Churchill

Node Type: perlquestion [id://524840]
Approved by Corion
Front-paged by Anneq