http://qs321.pair.com?node_id=134152

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

So here's the problem: I have a CGI application for creating web pages which currently stores a representation of the links and data of a web site in a Perl hash-of-hashes structure that I write to disk with Data::Dumper (see http://chicodigital.com/webtool.html). This is fine for a dozen medium-sized web pages, but at some point, loading the data file for each CGI execution is going to make this app slow. Using mod_perl would solve this (right?), but my target users are either on shared servers (and thus don't have mod_perl) or don't want to deal with setting up a database. I also know I can use Storable.pm as a faster file storage solution, but that still writes to disk. So I started thinking about how it might be possible to retrieve persistent data through a Unix socket.

Here's the idea: Multiple CGI processes all talk to a persistent process that just hands them the Perl hash (or accepts an updated hash) through a Unix socket. The process would die after a timeout period. It would be started by the initial CGI request, which would check if the process existed and create one if it didn't. Another way to think of this idea: it's just a home-made database connection, but the data is cached in memory (it doesn't read from disk on every request, only the first one, or after a modify).

Is this possible? (I think it is.) Is this just a bad idea? Comments? I'm not avoiding MySQL, but I am catering to users who either don't have access to a database or don't want to set one up.

-alan

Replies are listed 'Best First'.
Re: CGI data from memory w/o mod_perl
by tachyon (Chancellor) on Dec 24, 2001 at 20:20 UTC

    Have you considered DBD::RAM? It effectively gives you a SQL/DBI database in memory, so it does what you want.

    Structurally I would have a CGI module and a separate Data Module. Define the interface between them and you are then free to modify the data handling within the Data Module over time. If you make it DBD::RAM SQL/DBI based, moving to a real database is a one-line change. Still charge $10K to do it though....just so your clients know they are getting value :-)
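To make the CGI/Data Module split concrete, here is a minimal sketch. The package name `SiteData`, its `get_page`/`set_page` methods, and the file name are all hypothetical, and Storable stands in for whatever backend ends up behind the interface (it is not the DBD::RAM approach itself). The point is only that the CGI side never touches the storage calls directly.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical data module: the CGI code only ever calls these methods,
# so the storage backend can change without touching the CGI layer.
package SiteData;

use Storable qw(lock_store lock_retrieve);

sub new {
    my ($class, %args) = @_;
    my $self = bless { file => $args{file}, data => {} }, $class;
    # Load existing data if the file is already there.
    $self->{data} = lock_retrieve($self->{file}) if -e $self->{file};
    return $self;
}

# Return the record for one page (a hash ref), or undef if unknown.
sub get_page {
    my ($self, $name) = @_;
    return $self->{data}{$name};
}

# Store or replace the record for one page and persist the whole structure.
sub set_page {
    my ($self, $name, $record) = @_;
    $self->{data}{$name} = $record;
    lock_store($self->{data}, $self->{file});
    return $record;
}

package main;

use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/site.sto";    # hypothetical data file

my $site = SiteData->new(file => $file);
$site->set_page(index => { title => 'Home', links => ['about'] });

# A later CGI request would construct a fresh object and see the same data.
my $again = SiteData->new(file => $file);
print $again->get_page('index')->{title}, "\n";
```

Swapping Storable for DBI, shared memory, or a socket client would then only change the bodies of `new`, `get_page`, and `set_page`.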

    mod_perl is the way to go if speed is the issue. The major overhead with Perl CGIs is starting up a new process rather than disk access from that process, although this will of course vary. With mod_perl you avoid this overhead, plus you get data persistence to boot. You should have a look at Apache::Session, which usually works with mod_perl but is also able to work with vanilla CGI (so say the docs).

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: CGI data from memory w/o mod_perl
by lhoward (Vicar) on Dec 24, 2001 at 20:18 UTC
    How about trying an IPC share?

    Use a module like IPC::Shareable to store the data in memory with appropriate "load from disk" code around it (i.e. if the shared memory is empty, then populate it from your disk file). That way it'll rebuild the IPC structure automatically if it's ever lost (e.g. after a reboot).

Re: CGI data from memory w/o mod_perl
by Hrunting (Pilgrim) on Dec 24, 2001 at 20:00 UTC
    It's not a bad idea, but it's certainly more complicated than it needs to be, and you have to deal with the added complexity of a socket.

    If what you're looking for is faster disk access (and it sounds like you are), set up an in-memory filesystem (for example, on a Solaris box, /tmp is usually just RAM, and files written to it are really being written straight to memory). Essentially, it's a poor man's memory cache, but it will make your file access a LOT faster, and you don't have to change anything except the path location. The OS will manage moving things out of memory and into swap as it needs to.
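A rough sketch of the "only the path changes" point, using Storable for the file I/O. It assumes `/dev/shm` is a RAM-backed directory, which is common on Linux but is not something the thread states, and falls back to the system temp directory otherwise. The file name is made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;
use Storable qw(store retrieve);

# Assumption: /dev/shm is RAM-backed (common on Linux). Fall back to the
# system temp directory, which on some systems is also memory-backed.
my ($ram_dir) = grep { -d $_ && -w _ } '/dev/shm', File::Spec->tmpdir;
die "no writable temp directory found" unless defined $ram_dir;

# Hypothetical file name; the PID keeps this demo from clobbering other runs.
my $file = File::Spec->catfile($ram_dir, "webtool-data.$$.sto");

my %site = (index => { title => 'Home' });

store(\%site, $file);            # same call as for an on-disk file
my $loaded = retrieve($file);    # same call as for an on-disk file

print "$loaded->{index}{title} (read from $ram_dir)\n";

unlink $file;                    # clean up the demo file
```

A real application would use a fixed file name, and would still need a copy on persistent storage, since a RAM-backed file does not survive a reboot.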

Re: CGI data from memory w/o mod_perl
by termix (Beadle) on Dec 24, 2001 at 19:48 UTC

    It certainly should be possible (we did something similar for short term account caching).

    If each CGI is loading the complete hash from the central program, then each CGI is going to have a copy of it and will have to deal with the entire hash of hashes. I would recommend optimizing your central program to only give the CGIs the individual pieces of the hash that concern them.
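As a minimal sketch of that suggestion, the example below has a server process hold the whole hash and answer each request with only the one entry asked for, over a Unix socket. The wire format (a key on one line, then a Storable-frozen reply read until the connection closes), the socket path, and the sample data are all assumptions made for illustration. The server here handles a single request and exits, where a real one would loop and apply the idle timeout described in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::UNIX;
use Socket qw(SOCK_STREAM);
use Storable qw(nfreeze thaw);
use File::Temp qw(tempdir);
use POSIX ();

# Illustrative data; the real application would load this from its data file.
my %site = (
    index => { title => 'Home',  links => ['about'] },
    about => { title => 'About', links => ['index'] },
);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/site.sock";    # hypothetical socket location

# Create the listening socket before forking so the client cannot race it.
my $listener = IO::Socket::UNIX->new(
    Type   => SOCK_STREAM,
    Local  => $path,
    Listen => 5,
) or die "cannot listen on $path: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: stands in for the persistent process.
    my $conn = $listener->accept or POSIX::_exit(1);
    chomp(my $key = <$conn>);
    # Send only the requested piece, not the whole hash of hashes.
    my $piece = exists $site{$key} ? $site{$key} : {};
    print {$conn} nfreeze($piece);
    close $conn;
    POSIX::_exit(0);    # skip END blocks so the parent owns temp-dir cleanup
}

# Parent: stands in for one CGI request.
my $client = IO::Socket::UNIX->new(
    Type => SOCK_STREAM,
    Peer => $path,
) or die "cannot connect to $path: $!";

print {$client} "about\n";

# Read the reply until the server closes the connection.
my $frozen = do { local $/; <$client> };
close $client;
waitpid($pid, 0);

my $page = thaw($frozen);
print "$page->{title}\n";
```

Reading until end-of-file keeps the protocol trivial, at the cost of one connection per request; a length-prefixed reply would allow several requests per connection.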

    Shared memory could be another option, although I do admit, I haven't ever tried that myself.

    -- termix

(dkubb) Re: (1) CGI data from memory w/o mod_perl
by dkubb (Deacon) on Dec 24, 2001 at 22:31 UTC

    There is a thread on the mod_perl mailing list where they benchmark the speed of the CPAN modules that allow storage and retrieval of persistent data.

Re: CGI data from memory w/o mod_perl
by alanraetz (Novice) on Dec 24, 2001 at 23:42 UTC
    Thanks! Exactly the info I needed. It looks like IPC::Shareable is what I basically was asking for, and maybe DBD::RAM is what I really need. I was having a hard time believing that this was not a common problem, but searching CPAN for keywords like 'persistent process memory' etc, led me nowhere. I'll have to look inside IPC::Shareable and see how it works... thanks to all who replied.