http://qs321.pair.com?node_id=188530

uzzikie has asked for the wisdom of the Perl Monks concerning the following question:

I read with interest the recent node on caching using mod_perl...

But are there any modules that can run a CGI script, write its output to an HTML file, and serve that HTML copy instead? Once the copy expires, the CGI script would be run again and its new output written to a fresh HTML file... this would drastically reduce the server's workload.
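Roughly the sort of thing being described, as a sketch; the file name, lifetime and the build_page() routine are placeholders, not from any particular module:

    #!/usr/bin/perl
    # Sketch only: serve a cached HTML copy if it is younger than
    # $max_age seconds, otherwise rebuild it and save the new copy.
    use strict;
    use warnings;

    my $cache_file = '/var/www/cache/report.html';
    my $max_age    = 600;    # regenerate after 10 minutes (arbitrary)

    if (-e $cache_file && (time - (stat $cache_file)[9]) < $max_age) {
        # Cached copy is still fresh: send it as-is.
        print "Content-type: text/html\n\n";
        open my $fh, '<', $cache_file or die "open $cache_file: $!";
        print while <$fh>;
        close $fh;
    }
    else {
        # Expired or missing: build the page, save it, then send it.
        my $html = build_page();
        open my $fh, '>', $cache_file or die "write $cache_file: $!";
        print {$fh} $html;
        close $fh;
        print "Content-type: text/html\n\n", $html;
    }

    sub build_page {
        # Placeholder for whatever the CGI currently does (DB queries etc.).
        return "<html><body>Generated at " . localtime() . "</body></html>\n";
    }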

Replies are listed 'Best First'.
Re: Caching Web Pages
by Abigail-II (Bishop) on Aug 08, 2002 at 09:44 UTC
    If you don't need dynamic pages, you should just make static pages. Don't use CGI - just use cron or some other scheduling mechanism to replace "expired" static pages with new ones.
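    A minimal sketch of that setup; the paths, schedule, and generator script are illustrative only:

    # crontab entry: rebuild the page every 15 minutes (schedule is arbitrary)
    */15 * * * *  /usr/local/bin/make_pages.pl

    #!/usr/bin/perl
    # /usr/local/bin/make_pages.pl -- writes to a temp file first, then
    # renames it, so visitors never see a half-written page.
    use strict;
    use warnings;

    my $target = '/var/www/htdocs/products.html';
    my $tmp    = "$target.tmp";

    open my $out, '>', $tmp or die "write $tmp: $!";
    print {$out} "<html><body>Product list built at ",
                 scalar localtime, "</body></html>\n";
    close $out;
    rename $tmp, $target or die "rename: $!";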

    Abigail

Re: Caching Web Pages
by tadman (Prior) on Aug 08, 2002 at 10:17 UTC
    One trick I've used is to hide the actual innards of the CGI system using something like mod_rewrite under Apache. This can blend parameters into the actual URL transparently, such as:
    http://www.monkstore.com/product/412340/featureXYZ
    This might actually be expanded using mod_rewrite into:
    http://www.monkstore.com/product.cgi?part=412340&mode=featureXYZ
    These re-mapped URLs don't look like CGI output, so they will be cached better. By "better" I merely mean that they look more like regular content and less like CGI output. In other words, the end user can't tell it's a CGI script from the URL alone.
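    A rough sketch of the rewrite rule such a mapping implies (.htaccess style; the pattern, script name and flags are illustrative and will need adjusting):

    # .htaccess sketch -- requires mod_rewrite
    RewriteEngine On
    # /product/412340/featureXYZ  ->  /product.cgi?part=412340&mode=featureXYZ
    RewriteRule ^product/([0-9]+)/([A-Za-z0-9]+)$ /product.cgi?part=$1&mode=$2 [L]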

    To fully effect this, you have to tweak some headers so that the page can be cached. I think this is done with the "Expires" header, such as:
    my $q = CGI->new();
    print $q->header(-expires => '+9 days');
    Or whatever you feel is an appropriate expiry date. This will probably require a bit of futzing to get right, especially in the URL department.

    The final step would be to layer in something like Squid Cache on top of your Web server to actually do the caching. There are plenty of examples of how to do that, though, so when you get that far, it should be pretty straightforward.
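    For what it's worth, a rough sketch of the Squid side in accelerator (reverse-proxy) mode; directive names differ between Squid versions, so treat this as a starting point only:

    # squid.conf sketch: Squid listens on port 80, Apache on 8080
    http_port 80 accel defaultsite=www.monkstore.com
    cache_peer 127.0.0.1 parent 8080 0 no-query originserver name=backend
    acl our_site dstdomain www.monkstore.com
    http_access allow our_site
    cache_peer_access backend allow our_site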

    The best part of this approach is that you get to decide what's cached, and for how long. Every page is generated through the same interface, as well.
Re: Caching Web Pages
by Dog and Pony (Priest) on Aug 08, 2002 at 09:06 UTC
    Well, of course it is possible to do, but you have to think about some things...
    • If you are using a CGI program that takes any kind of parameters, you will need to cache a copy for each combination of them (see the sketch after this list).
    • How are you going to determine when the cached copy should be refreshed? By an arbitrarily chosen timespan, or by detecting changes somewhere else?
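    On the first point, one way to cache a copy per parameter set is simply to derive the cache file name from the query string; the path below is made up:

    #!/usr/bin/perl
    # Sketch: key the cached copy on the exact parameter set, so that
    # product.cgi?part=1 and product.cgi?part=2 get separate files.
    use strict;
    use warnings;
    use CGI;
    use Digest::MD5 qw(md5_hex);

    my $q          = CGI->new;
    my $key        = md5_hex($q->query_string || 'no-params');
    my $cache_file = "/var/www/cache/page-$key.html";
    # From here on, the serve-if-fresh / rebuild-if-stale logic is the
    # same as in the expiry sketch near the top of the thread.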

    If you have pages that are generated from, say, a database or something similar, and that don't change all that frequently - only through manual or timed updates - I'd suggest that you instead produce a new set of HTML pages from your DB (or whatever the source is) upon every change. That is a much simpler approach to reducing load. This assumes that your CGIs don't take any parameters etc.; if they do, a caching approach is a so-so fit anyway.


    You have moved into a dark place.
    It is pitch black. You are likely to be eaten by a grue.
Re: Caching Web Pages
by perrin (Chancellor) on Aug 08, 2002 at 12:03 UTC