blogical has asked for the wisdom of the Perl Monks concerning the following question:
An open query:
I'm looking for resources and tips concerning the wisdom and best practices of handling (and possibly preserving) data coming out of a DB. Specifically, on my site I'm using MySQL as the definitive source for info that I'm accessing through an object representing a site user. When I ask for info about the user (say, $user->get('zipcode')), I first check whether the data is already defined in the instance. If not, I check the session cache (currently Cache::FileCache), and finally, if it still hasn't been found, I query the DB. I store any retrieved info in the instance, and the DESTROY method caches any instance data for later use in the session.
I'm also pre-loading any other relevant data from any called table in anticipation of its being requested (i.e., if I ask for the 'zipcode' I might also ask for the 'street' and 'town', which are on the same record, so I get them all in the same DB call). My records are fairly small, so this shouldn't eat up much memory that wouldn't have been used anyway, and at best it avoids several DB calls.
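The lookup chain and row pre-load described above can be sketched in plain Perl. This is a hedged, self-contained illustration: plain hashes stand in for Cache::FileCache and the MySQL handle, and the table, field, and key names are hypothetical.

```perl
use strict;
use warnings;

# Stand-ins for the session cache and the database (illustrative data).
my %session_cache = ( 'user:42' => { zipcode => '05401' } );
my %db = ( 42 => { zipcode => '05401', street => 'Main St', town => 'Burlington' } );

sub get {
    my ($self, $field) = @_;

    # 1. Already loaded into this instance?
    return $self->{data}{$field} if exists $self->{data}{$field};

    # 2. Session cache?
    my $cached = $session_cache{"user:$self->{id}"};
    if ($cached && exists $cached->{$field}) {
        $self->{data}{$field} = $cached->{$field};
        return $cached->{$field};
    }

    # 3. Fall through to the "database", pre-loading the whole row
    #    so later requests for sibling fields cost nothing.
    my $row = $db{ $self->{id} } || {};
    $self->{data} = { %$row, %{ $self->{data} } };
    return $self->{data}{$field};
}

my $user = { id => 42, data => {} };
print get($user, 'zipcode'), "\n";   # served from the session cache
print get($user, 'street'),  "\n";   # falls through and pre-loads the row
```

With the real modules, step 2 would be a `$cache->get(...)` call and step 3 a `SELECT *` through DBI.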
Or so my line of thought goes... but this is all fairly new to me. I can test performance now for an accurate answer about the current setup, but I expect the environment I'm running this in to change several times between now and later production use. I thought I'd see if any fellow monks might offer up insight into this area (or share their favorite resource links) so I can anticipate what the best practice might look like. Thanks!
Re: To cache or not to cache
by perrin (Chancellor) on Mar 13, 2006 at 18:39 UTC
Cache::FileCache is often slower than MySQL. I suggest you switch to Cache::FastMmap instead. It's one of the only caching modules around that is faster than a MySQL query. | [reply] |
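The suggestion above can be sketched as follows. This assumes the CPAN module Cache::FastMmap is installed; the share-file path, expiry, and key are illustrative. By default the cache serializes values with Storable, so hashrefs round-trip.

```perl
use strict;
use warnings;
use Cache::FastMmap;   # CPAN module; must be installed

# An mmap'ed shared-memory cache: reads avoid both per-key disk I/O
# and a round trip to MySQL.
my $cache = Cache::FastMmap->new(
    share_file  => '/tmp/user_cache.fastmmap',
    expire_time => '10m',
);
$cache->set( 'user:42' => { zipcode => '05401' } );
my $user = $cache->get('user:42');
```

Because the file is shared, separate web-server processes see the same cache, which a per-process in-memory hash would not give you.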
I've written a fully functional persistence framework, which was finished just a few weeks ago. My experience was:
1. I tried Class::DBI and Tangram first, but they were too limited for the project we were creating.
2. I decided to create my own persistence class. It started simple and finished complex (3,000 lines of intensive programming).
3. I found the need to cache all schema definitions, permissions, user info, etc. I chose MLDBM with Storable and BerkeleyDB, simply because it is faster than anything else (Cache::*, MySQL, etc.). There is no cache of table data in the system, only system/schema definitions.
4. My persistence class loads all columns of the row into memory by executing a "SELECT *", with no need to access Postgres later. This data is stored in the object properties. When the object is destroyed, all data is destroyed and must be instantiated again with a new SQL statement on the next page load (it's a web app). But the schema information (column names, types, sizes, etc.) is cached.
5. The performance is quite nice.
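The schema-caching idea in point 3 can be shown in a simplified form. Diego ties MLDBM + Storable to BerkeleyDB; the sketch below uses core Storable alone (serialize the schema hash once, reread it cheaply on later requests), with an illustrative schema for a hypothetical users table.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempfile);

# Schema definitions are small, change rarely, and are needed on every
# request -- a good fit for caching even when row data is not cached.
my %schema = (
    users => {
        columns => [qw(id zipcode street town)],
        types   => { id => 'integer', zipcode => 'varchar(10)' },
    },
);

my ($fh, $file) = tempfile();
close $fh;
store(\%schema, $file);         # write once, e.g. when the schema is first read
my $cached = retrieve($file);   # later requests read the cached copy
print "@{ $cached->{users}{columns} }\n";
```

MLDBM adds a tied-hash interface over a DBM file so individual keys can be fetched without deserializing the whole structure.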
Diego de Lima
| [reply] |
I think you might have been happy with Rose::DB::Object. It's a more complete and faster system than the others.
| [reply] |
Re: To cache or not to cache
by Tanktalus (Canon) on Mar 13, 2006 at 18:36 UTC
From what I've gleaned from perlmonks, if faced with a similar question, I would:
- Pay close attention to Class::DBI. You can set it up to load the "bulk" of a user at once, but leave larger, less-often-used columns out until you need them.
- Tune my database to do the caching for me. Databases are good at this (and if they aren't, they should be; try a different database). At the very least, they're probably better at it than I am, especially once I start scaling. A small, single-machine server is one thing, but if the site grows so that a single machine is used solely for the db with multiple webservers in front of it, having the caching done by the db will probably save you time (both in development and at runtime). If you grow further and need multiple machines to act as your db server, I'd guess the database cache will be even more effective still. The database can also take care of concurrency issues, rather than you having to worry about stale data sitting in a local cache.
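The first suggestion above can be expressed as a Class::DBI declaration sketch (table, column, and connection details are illustrative, not runnable as-is). Columns in the Essential group are fetched by the initial retrieve(); columns in other groups are lazily loaded only when first accessed.

```perl
package My::User;
use base 'Class::DBI';

__PACKAGE__->connection('dbi:mysql:mysite', 'db_user', 'db_pass');
__PACKAGE__->table('users');
__PACKAGE__->columns(Primary   => 'id');
__PACKAGE__->columns(Essential => qw(zipcode street town));   # fetched up front
__PACKAGE__->columns(Bulky     => qw(bio avatar));            # fetched on demand
```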
But that's just me. I worry about the unlikely issues of scalability because I'm a dreamer ;-) (That, and if I got a job where it mattered, I'd already have a bit of experience thinking about it.)
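On the database-side caching point: MySQL of this era ships a built-in query cache that returns repeated identical SELECTs without re-executing them. A hedged my.cnf fragment (values are illustrative and should be tuned per workload):

```ini
[mysqld]
query_cache_type = 1      # cache eligible SELECT results
query_cache_size = 32M    # memory reserved for cached result sets
```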
| [reply] |
Re: To cache or not to cache
by philcrow (Priest) on Mar 13, 2006 at 16:36 UTC
We usually start without caching, then add it for pieces that seem slow.
Phil | [reply] |
That's something I insist upon when I'm da boss: Don't optimize until you know where the time is spent.
I just recently brought a long-running process down from 2+ hours per invocation to just over five minutes per, using aggressive caching (and lots of RAM!) but I would have guessed wrong had I not profiled first. It turned out that the chunk I thought would be the biggest CPU hog was actually number six on the list, and not worth fooling with.
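The measure-first advice above can be applied in-process with the core Benchmark module before reaching for a full profiler. The two candidate subs here are illustrative stand-ins for the "chunks" being compared.

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

# Run each candidate for at least 1 CPU second (-1) and compare rates;
# let the numbers, not intuition, pick the optimization target.
my $results = timethese( -1, {
    string_concat => sub { my $s = ''; $s .= 'x' for 1 .. 100 },
    join_list     => sub { my $s = join '', ('x') x 100 },
});
cmpthese($results);
```

For whole-program hot-spot hunting, a statement-level profiler is the right tool; Benchmark is for comparing specific alternatives once a suspect is found.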
| [reply] |
Re: To cache or not to cache
by kwaping (Priest) on Mar 13, 2006 at 17:41 UTC