Simplest CGI Persistence Solution?

by Tortue (Scribe)
on Apr 02, 2001 at 15:01 UTC ( [id://68972] )

Tortue has asked for the wisdom of the Perl Monks concerning the following question:

I'm looking for an ultra-simple way to write a persistent CGI script, which could be used, for example, to implement a small web forum, an address book, etc.

I'd like it to work "out of the box" with a recent version of Perl, on most any platform, with any web server, and without needing the installation of any additional Perl modules or any other software (e.g. an RDBMS). Please tell me if this is unWise, or if there's a clever way to do it.

I'm thinking of something very, very simple for very small amounts of data, a coarsely persistent CGI program that might run like so:

1. Load all data from a file using Data::Dumper
2. Manipulate data (consult, insert, update, delete)
3. Save all data to a (locked) file using Data::Dumper
The locking/unlocking mechanism might need to be platform-dependent, since flock doesn't work everywhere. (A sketch of the whole cycle follows below.)
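
For concreteness, a minimal sketch of that three-step cycle. The file name and data layout are invented, and the locking of step 3 is left out here, since that's the hard part the replies get into:

    use strict;
    use Data::Dumper;

    my $file = './data.pl';    # hypothetical data file

    # 1. Load: the file holds "$VAR1 = { ... };" as written by Data::Dumper,
    #    so "do" can both parse it and hand back the data structure
    my $data = -e $file ? do $file : {};

    # 2. Manipulate (consult, insert, update, delete)
    push @{ $data->{forum} }, { who => 'Tortue', when => time() };

    # 3. Save - this is the step that still needs locking, see below
    open my $fh, '>', $file or die "Can't write $file: $!\n";
    print $fh Data::Dumper->Dump( [ $data ], [ 'VAR1' ] );
    close $fh or die "Can't close $file: $!\n";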

Why do I want to do such an obviously non-scalable thing? Hopefully this would be in the spirit of quick-and-dirty, portable, concise, and useful Perl programs. I've done plenty of Perl DBI, Java/JDBC, etc., but here I'm looking for something simple and easy.

There didn't seem to be anything relevant in CGI Programming.

Replies are listed 'Best First'.
Re (tilly) 1: Simplest CGI Persistence Solution?
by tilly (Archbishop) on Apr 02, 2001 at 17:50 UTC
    As I commented to you privately, virtually everyone I have seen try to use file locking has messed it up very seriously. Normally this involves both technical mistakes (explicitly unlocking rather than relying on close to do it for you - a mistake whose severity is lessened in Perl because Perl does a flush for you) and conceptual ones (locking operations rather than tasks).

    Now if you really want to proceed, want it portable, and are willing to aggressively back up your data for the inevitable failures, then the right way to go is to rename the file to a private name, check that you got the file (if not then wait and try again in a bit - you can do fine-grained timeouts with select), manipulate it, and then rename it back.

    This relies on the fact that rename is an atomic operation on virtually every platform and filesystem (so long as you are not moving the file across filesystem boundaries). It opens you to the fact that Some Day your CGI script will die while the file is renamed and you will lose all of your data.
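
    A rough sketch of that rename dance (all names invented; select gives the fine-grained waits mentioned above):

        use strict;

        my $db   = 'forum.db';       # the shared data file
        my $mine = "forum.db.$$";    # a private name based on our PID

        # rename is atomic, so at most one process gets the file
        my $tries = 0;
        until ( rename $db, $mine ) {
            die "Can't grab $db: $!\n" if ++$tries > 40;   # give up eventually
            select( undef, undef, undef, 0.25 );           # sub-second wait
        }

        # ... read $mine, manipulate the data, write it back to $mine ...

        # put the file back so the next process can grab it
        rename $mine, $db or die "Can't restore $db: $!\n";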

    You did have backups?

    UPDATE
    Corion asked me whether rename is atomic on NFS. I thought it happened atomically even there, but I don't have a reference handy. Can anyone confirm or deny?

    Note that I do not mean that from issuance to action is atomic - that is impossible due to network latency. What I mean is that if 2 processes (possibly on different boxes) both try to rename at the same time, one will succeed, one will fail, and the file will arrive at a new name intact.

Re: Simplest CGI Persistence Solution?
by Trimbach (Curate) on Apr 02, 2001 at 15:49 UTC
    Well, your idea is pretty much what I put together recently. It was very short, very simple, and very basic, which seems to be your main goal. Essentially my solution (which isn't accessible from this computer or I'd post it) uses cookies to store a session id ($session_id = time() . $$), storing Data::Dumper files on the server under that name. It's a little rough, but it works.

    One suggestion: in a "Quick and Dirty" solution you can dispense with flock if your session ids are (virtually) guaranteed to be unique like in my solution. No one's going to ever be writing to the same file at the same time, so no need to lock the file. The only problem with this is that session files will collect on the server... to take care of this I set my CGI to wipe out all session files more than 2 days old every time it's run. You could do the same kind of thing with cron if you wanted.
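
    Something along these lines, give or take (paths and field names invented, and a real script would sanity-check the cookie value):

        use strict;
        use CGI qw(:standard);
        use Data::Dumper;

        my $dir        = '/tmp/sessions';                     # invented path
        my $session_id = cookie('session') || time() . $$;    # unique per user

        # load this user's saved state, if any
        my $state = -e "$dir/$session_id" ? do "$dir/$session_id" : {};

        # ... consult and modify $state ...

        # save it back - no flock needed, since the id is unique to this user
        open my $fh, '>', "$dir/$session_id" or die "Can't save session: $!\n";
        print $fh Data::Dumper->Dump( [ $state ], [ 'VAR1' ] );
        close $fh;

        # reap session files more than 2 days old
        unlink grep { -M $_ > 2 } glob "$dir/*";

        print header( -cookie => cookie( -name => 'session', -value => $session_id ) );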

    The whole thing is like (maybe) 15 lines of code to maintain state. Why not?

    Gary Blackburn
    Trained Killer

      Your solution seems great for persistence across a session, for each user. But I'm looking to have persistence shared by all users. Or perhaps I don't understand your solution (don't kill me!:)

      I want web users to share access to the same data, stored in one file. Ideally the CGI is very fast, so the problem of parallel conflicting updates is reduced.

      I see two dangers:

      1. Damaging the integrity of the file
      2. Over-writing someone else's changes
      I definitely need to avoid (1.), but I could live with (2.) if it's unlikely to happen much in practice.

      I could avoid (2.) entirely if, as was suggested to me, the script locks the file as soon as it reads it and doesn't release the lock until it's done writing. As long as things are fast and nothing goes wrong that prevents the script from releasing the lock, this should work.
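
      With flock, that read-hold-write pattern might look like this (file name invented, and only on platforms where flock works):

          use Fcntl qw(:flock);

          # '+<' opens for read AND write without clobbering the contents
          open my $fh, '+<', 'data.db' or die "Can't open data.db: $!\n";
          flock( $fh, LOCK_EX ) or die "Can't lock: $!\n";   # held until close

          my @records = <$fh>;    # read while holding the lock
          # ... modify @records ...

          seek $fh, 0, 0;         # rewind...
          truncate $fh, 0;        # ...wipe the old contents...
          print $fh @records;     # ...and write the new version
          close $fh;              # close flushes and releases the lock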

        Ah.... sorry about that. I thought you were asking how to maintain state for individual users, not for things like a bulletin board and such.

        Yeah, if you have a single file that lots of users can write to you'll definitely need some sort of file locking mechanism (either flock or something else). If you desperately need compatibility with systems that don't support flock you can use a flag system: your script checks for the existence of a "lock" flag (this would be a separate file on the server called something like "lock" with no data in it). If "lock" exists, the main file is locked and the CGI should not attempt to write to it. If "lock" does not exist, the CGI first creates a "lock" file (preventing other instances of the CGI from writing to the main file), then writes to the main file, then erases the lock file when everything's completed.
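
        One wrinkle worth noting: a separate "does it exist?" test followed by a create leaves a window where two CGIs can both decide the file is free. sysopen with O_EXCL does the test and the create in one atomic step, something like this sketch:

            use Fcntl qw(O_WRONLY O_CREAT O_EXCL);

            sub grab_lock {
                for my $try ( 1 .. 20 ) {
                    # O_EXCL: creation fails if "lock" already exists
                    if ( sysopen my $fh, 'lock', O_WRONLY | O_CREAT | O_EXCL ) {
                        close $fh;    # the file's existence IS the lock
                        return 1;
                    }
                    sleep 1;          # someone else holds it; wait and retry
                }
                die "Can't get lock: $!\n";
            }

            sub release_lock { unlink 'lock' or warn "Can't remove lock: $!\n" }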

        Again, it's rough and ready, and you'll probably still have people step on each other's entries once in a while, but hey, it's better than nothing.

        The best solution would be to use a DB with some sort of locking mechanism built-in (either table level or row level). That way you can write to the DB all day long and not worry about inadvertently erasing someone else's entry because the DB takes care of this whole issue.

        Gary Blackburn
        Trained Killer

Re: Simplest CGI Persistence Solution?
by mirod (Canon) on Apr 02, 2001 at 17:53 UTC

    Just make sure you do your locking properly; have a look at flock - lordy me.... for a lengthy discussion of the various traps that come with using flock.

    Usually I use a dedicated (empty) lock file and then I go about playing with the various data files I use:

        use Fcntl qw(:flock);    # supplies the LOCK_EX constant

        sub db_lock {
            # 'or', not '||': with '||' the die binds to the filename string
            # and a failed open would go unnoticed
            open LOCK, ">$LOCK_FILE"
                or die "$0: [error] Can't open lock file: $!\n";
            flock( LOCK, LOCK_EX )
                or die "$0: [error] during lock: $!\n";
        }
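
    The matching unlock can simply close the handle; as tilly notes above, relying on close (or on the script exiting) is less error-prone than unlocking explicitly:

        sub db_unlock {
            # closing the handle flushes and releases the lock in one step
            close( LOCK ) or die "$0: [error] during unlock: $!\n";
        }
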
Re: Simplest CGI Persistence Solution?
by gregor42 (Parson) on Apr 02, 2001 at 23:50 UTC

    First off, no I don't work for these people, so this is not a capitalist plug.

    I tackled this problem once upon a time & ended up learning how to code Java Servlets instead. However, there is a company that has created the same persistence & scaling mechanism that servlets use, only it works with Perl.

    The name of the company is Velocigen.



    Wait! This isn't a Parachute, this is a Backpack!
