PerlMonks  

Flat Website From Database Website

by Cody Pendant (Prior)
on Nov 06, 2005 at 00:23 UTC ( [id://506043]=perlquestion )

Cody Pendant has asked for the wisdom of the Perl Monks concerning the following question:

I've been creating a database-backed website (MySQL/Perl/HTML::Template), but for various reasons its content is now pretty stable and doesn't need to be sourced from the DB 99% of the time. Presumably the site will run more quickly if it's just flat pages.

So I'm thinking, I could just grab the whole site, maybe with wget or something, then re-sync the local copy to the website.

Of course, that will be the point where I discover that 23 of the 10,000 pages are wrong. If I hadn't done the above, the solution would be to fix the data in the DB and ... nothing. The site would be correct again for the next person who browses it.

So, is there some kind of solution to this problem in Perl, Apache, etc?

The ideal state of the website is flat, until I notice a mistake, at which point I correct the mistake, "un-flatten" the problem files, render them and "re-flatten" them.



($_='kkvvttuu bbooppuuiiffss qqffssmm iibbddllffss')
=~y~b-v~a-z~s; print

Replies are listed 'Best First'.
Re: Flat Website From Database Website
by Aristotle (Chancellor) on Nov 06, 2005 at 00:59 UTC

    Reversing the rendering process is always messy and almost as often lossy. Don’t. Keep around the data in structured storage even after you’ve used your templates to output it in rendered form.

    If you set things up right on the webserver’s end, you can then switch horses at any time, even generate half the site dynamically, the other half statically. The templates can be identical – eg. the Template Toolkit tutorial shows the same template being used with tpage, then ttree, then from within a CGI script.

    Just don’t throw away your structured data.

    Makeshifts last the longest.

Re: Flat Website From Database Website
by gloryhack (Deacon) on Nov 06, 2005 at 01:11 UTC
Re: Flat Website From Database Website
by jZed (Prior) on Nov 06, 2005 at 01:51 UTC
    I agree with Aristotle: "don't throw away your structured data". One way I've handled this kind of thing is to have a table with a primary key of unique page names and foreign keys for a template name and a content identifier. Creating a static version of the website consists of traversing the page names, selecting the referenced template and content from their respective tables, and generating a static page. Even a moderately large site can be generated in a matter of seconds, so I would just regenerate the site whenever either the content or the template tables were modified. If you have a very large site and don't want to regenerate the whole thing for each modification, include a "last-modified" column. Another trick I'd use: have the script act as a CGI and generate a dynamic page when invoked via the web, and act as a server-side script and generate a static page when invoked from the command line.
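A minimal sketch of the page/template/content scheme and the dual-mode trick described above. Everything named here is an assumption for illustration: the three hashes stand in for the database tables (a real version would SELECT the rows via DBI), the `{{name}}` placeholder substitution is a toy replacement for HTML::Template, and checking `GATEWAY_INTERFACE` is just one common way to tell a CGI invocation from a command-line one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-ins for the three tables; a real version would
# SELECT these rows from the database instead.
my %pages = (
    'index.html' => { template => 'basic', content => 'home'  },
    'about.html' => { template => 'basic', content => 'about' },
);
my %templates = (
    basic => '<html><head><title>{{title}}</title></head>'
           . '<body>{{body}}</body></html>' . "\n",
);
my %content = (
    home  => { title => 'Home',  body => '<p>Welcome.</p>'  },
    about => { title => 'About', body => '<p>About us.</p>' },
);

# Fill a template's {{name}} placeholders from a content row.
# Returns nothing if the page name is not in the pages table.
sub render_page {
    my ($name) = @_;
    my $page = $pages{$name} or return;
    my $html = $templates{ $page->{template} };
    my $data = $content{ $page->{content} };
    $html =~ s/\{\{(\w+)\}\}/ defined $data->{$1} ? $data->{$1} : '' /ge;
    return $html;
}

# Write every page in the pages table to $out_dir; returns the count.
sub generate_site {
    my ($out_dir) = @_;
    my $count = 0;
    for my $name (sort keys %pages) {
        my $file = "$out_dir/$name";
        open my $fh, '>', $file or die "Cannot write $file: $!";
        print {$fh} render_page($name);
        close $fh or die "Cannot close $file: $!";
        $count++;
    }
    return $count;
}

if ($ENV{GATEWAY_INTERFACE}) {
    # Invoked by the web server as a CGI: serve one page dynamically.
    (my $name = $ENV{PATH_INFO} || '/index.html') =~ s{^/}{};
    my $html = render_page($name);
    if (defined $html) {
        print "Content-Type: text/html\n\n", $html;
    }
    else {
        print "Status: 404 Not Found\nContent-Type: text/plain\n\nNo such page.\n";
    }
}
elsif (@ARGV) {
    # Invoked from the command line: flatten the whole site.
    my $count = generate_site($ARGV[0]);
    print "Generated $count pages in $ARGV[0]\n";
}
```

Run from a shell with an output directory as the only argument, this writes the flat site; requested through the web server, the same script renders a single page on demand from the same data.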
Re: Flat Website From Database Website
by BrowserUk (Patriarch) on Nov 06, 2005 at 07:09 UTC

    If you want to remove what amounts to "static load" from your DB server, then consider inserting a caching proxy in the loop.

    If the pages are really static, then the load on your DB will drop to near zero very quickly, but you will retain everything to allow for the occasional rare hit, or future changes.

    It's probably an easier transition too.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Flat Website From Database Website
by tinita (Parson) on Nov 06, 2005 at 00:53 UTC
    i have never used squid, but it might be worth a try. if you correct a mistake, just fix it and invalidate the cache of this page.

    update: s/index/cache/

      i have never used squid, but it might be worth a try.

      I get the feeling that would be like swatting flies with a sledgehammer in this case. There's a very nice trick you can do with mod_rewrite which would probably be all the OP needs. It goes something like this (untested):

      RewriteCond %{REQUEST_FILENAME} \.html$
      RewriteCond %{REQUEST_FILENAME} !-f
      RewriteRule /?(.*)$ /cgi-bin/generate.pl/$1
      It needs more rules to make it really robust, but that's the general idea. Of course, generate.pl creates the static file and to "invalidate the cache", you just delete the static file and request the page.
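For illustration, here is one way a generate.pl behind those rewrite rules might look. This is a sketch under stated assumptions, not the script the poster had in mind: `render_page` is a hypothetical stand-in for the real database and template code, the path whitelist is an assumed safety check, and the script relies on the standard CGI variables `PATH_INFO` and `DOCUMENT_ROOT` being set by the server.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Path qw(make_path);

# Hypothetical renderer; a real generate.pl would query the database and
# fill an HTML::Template here. Returns nothing for unknown pages.
sub render_page {
    my ($path) = @_;
    return unless $path eq 'index.html' or $path eq 'docs/faq.html';
    return "<html><body><h1>$path</h1></body></html>\n";
}

# Render the page named by $path_info, save it under $doc_root so that the
# web server can serve the file directly next time, and return the HTML.
sub generate {
    my ($doc_root, $path_info) = @_;
    (my $path = defined $path_info ? $path_info : '') =~ s{^/+}{};

    # Accept only plain relative .html paths; this also rejects ".." segments.
    return unless $path =~ m{\A[\w-]+(?:/[\w-]+)*\.html\z};

    my $html = render_page($path);
    return unless defined $html;

    my $file = "$doc_root/$path";
    my $tmp  = $file . '.tmp.' . $$;
    make_path(dirname($file));

    # Write to a temporary name, then rename, so a concurrent request
    # never sees a half-written page.
    open my $fh, '>', $tmp or die "Cannot write $tmp: $!";
    print {$fh} $html;
    close $fh or die "Cannot close $tmp: $!";
    rename $tmp, $file or die "Cannot rename $tmp to $file: $!";
    return $html;
}

if ($ENV{GATEWAY_INTERFACE}) {
    # Running as a CGI: build the static file and answer this request too.
    my $html = generate($ENV{DOCUMENT_ROOT}, $ENV{PATH_INFO});
    if (defined $html) {
        print "Content-Type: text/html\n\n", $html;
    }
    else {
        print "Status: 404 Not Found\nContent-Type: text/plain\n\nNot found.\n";
    }
}
```

With something like this in place, deleting a static file is enough to force that one page to be rebuilt from the database on its next request.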

      Of course, this assumes Apache... but everybody runs Apache, right? :-)

      -sauoq
      "My two cents aren't worth a dime.";
      
Re: Flat Website From Database Website
by dragonchild (Archbishop) on Nov 06, 2005 at 06:13 UTC
    Have you noticed a performance issue with feeding directly from the DB? Do you have a reason for wanting to remove the DB from the picture? Have you looked at various DB options, such as query_cache and indices?

    In other words, this smells like premature optimization.


    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
Re: Flat Website From Database Website
by EvanCarroll (Chaplain) on Nov 06, 2005 at 02:25 UTC

    You might also want to check out the Apache cache filter or mod_cache. One thing you could do, for instance, is set the expiry for mod_cache to 1 day; then every day mod_cache will allow Apache to generate one copy of the page, and it will cache it for the duration of the day. This is extremely easy to implement, and will allow you to keep the parts of your site that require database access dynamic, such as a search feature.

    Mason has facilities to deal with this kind of thing too, btw. You might want to read about Mason's cache feature.



    Evan Carroll
    www.EvanCarroll.com
Re: Flat Website From Database Website
by bageler (Hermit) on Nov 06, 2005 at 18:23 UTC
    Funny, they usually pay me to come up with solutions to this. TMTOWTDI. It's generally better to generate the flat files as they are created rather than having a secondary process crawl your site and generate them.
Re: Flat Website From Database Website
by RiotTown (Scribe) on Nov 07, 2005 at 00:09 UTC
    You could write a site creation script that runs either by user actions (updating a table sets a flag, flag causes cron script to re-generate pages; you could even create a mapping table to contain all of the html files that would need to be revised each time a specific table is updated) or have the entire site re-generated during low traffic times from the db.
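The mapping-table idea can be sketched roughly as follows. The table names, page names, and the `pages_to_regenerate` helper are all hypothetical; in practice the mapping and the "updated" flags would live in the database and be read by the cron script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical mapping: database table => static pages built from it.
my %pages_for_table = (
    products => [ 'index.html', 'catalogue.html' ],
    staff    => [ 'about.html', 'contact.html' ],
    news     => [ 'index.html', 'news.html' ],
);

# Given the tables flagged as updated, return the sorted, de-duplicated
# list of pages the cron job needs to regenerate (call in list context).
sub pages_to_regenerate {
    my (@dirty_tables) = @_;
    my %seen;
    for my $table (@dirty_tables) {
        $seen{$_} = 1 for @{ $pages_for_table{$table} || [] };
    }
    return sort keys %seen;
}

# Example: "products" and "news" were flagged since the last cron run.
print "$_\n" for pages_to_regenerate('products', 'news');
```

A page fed by several tables, such as index.html here, is listed only once, so the cron run renders each affected file a single time.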
Re: Flat Website From Database Website
by hossman (Prior) on Nov 09, 2005 at 06:19 UTC

    squid rocks, it's easy to set up and it can scale as much as you may need it to -- but if you don't want to run a separate daemon (or if your hosting provider won't let you) mod_cache may meet your needs (it's still somewhat experimental however).

Node Type: perlquestion [id://506043]
Front-paged by diotalevi