Build a BDB cache quicklier

by bennymack (Pilgrim)
on Dec 28, 2006 at 13:35 UTC ( [id://592038]=perlquestion )

bennymack has asked for the wisdom of the Perl Monks concerning the following question:

Dear Esteemed Monks,

I am looking for a way to create a BerkeleyDB cache file more quickly.

For example, I have a ~70MB cache with ~500_000 keys. On a machine with quick disk access it builds in about a minute, but on a slow disk it can take quite a while.

FYI, I'm building the cache from a Storable file. I just want to turn the stored data structure into a BDB cache file as quickly as possible.

Is there possibly a way to build the entire cache in memory to avoid intermediate disk writes, then write the whole thing out to disk just once when it's done? Or any other solution would be fine.

Thanks in advance!

Replies are listed 'Best First'.
Re: Build a BDB cache quicklier
by jettero (Monsignor) on Dec 28, 2006 at 13:42 UTC

    I'm under the impression that DB_File handles all the IO in an efficient, buffered way. So, I believe the complete answer to your question is to tie a hash to a DB_File and it will automagically handle efficiency. I have had problems with long-term storage in Storable files and find that DB_File is also much more dependable (by comparison).

    -Paul
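A minimal sketch of the tie-a-hash approach described above, assuming the Storable file holds a flat hash of string keys and string values. The file names and the 1000-entry sample data are hypothetical stand-ins for the real ~500_000-key structure; nested values would have to be serialized before being stored.

```perl
use strict;
use warnings;
use Fcntl;                          # O_RDWR, O_CREAT
use DB_File;                        # exports $DB_HASH
use Storable qw(nstore retrieve);
use File::Temp qw(tempdir);

# Hypothetical paths, created in a scratch directory for illustration.
my $dir           = tempdir(CLEANUP => 1);
my $storable_file = "$dir/data.sto";
my $db_file       = "$dir/cache.db";

# Stand-in for the real Storable file (assumed to be a flat hash).
nstore({ map { ("key$_" => "value$_") } 1 .. 1000 }, $storable_file);

# Load the stored structure back into memory.
my $data = retrieve($storable_file);

# Tie a hash to a Berkeley DB hash file; DB_File does the buffering.
tie my %cache, 'DB_File', $db_file, O_RDWR | O_CREAT, 0644, $DB_HASH
    or die "Cannot tie $db_file: $!";

# Copy every key/value pair into the tied hash.
while (my ($key, $value) = each %$data) {
    $cache{$key} = $value;
}

my $count = scalar keys %cache;     # walks the whole database
untie %cache;                       # flushes and closes the file
```

Whether this is fast enough on a slow disk depends on the cache settings in effect, so it is a starting point rather than a guaranteed fix.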

Re: Build a BDB cache quicklier
by SheridanCat (Pilgrim) on Dec 28, 2006 at 13:42 UTC
    This does not answer the question asked, but you did ask for other solutions.

    If you're just building a cache and don't need to write it to disk, have you looked at memcached? It's simple to set up and has Perl bindings.

      It's also slower than BDB and would start out empty each time the program starts. Probably not the right solution here.
Re: Build a BDB cache quicklier
by bennymack (Pilgrim) on Dec 28, 2006 at 16:06 UTC

    Thanks for your replies.

    After a little brainstorming I decided to try building the cache in '/dev/shm' and then moving it to the hard disk once it was complete. This had the ever so pleasing effect of reducing the build time from 15 minutes to 15 seconds! It also reduced the size of the built cache for some reason: it went from ~70MB to ~38MB?! I can't even explain that, but I'll take it.

    Anyway, I thought I would post this here for posterity in case anyone else needs this optimization in the future or if anyone would like to further discuss this method.
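For anyone wanting to try the '/dev/shm' method, here is a rough sketch of one way it could look. It assumes '/dev/shm' is a writable RAM-backed filesystem (typical on Linux) and falls back to the system temp directory otherwise. The paths and sample data are hypothetical, and DB_File is used here only as one possible interface to Berkeley DB.

```perl
use strict;
use warnings;
use Fcntl;                          # O_RDWR, O_CREAT, O_RDONLY
use DB_File;
use File::Copy qw(move);
use File::Spec;
use File::Temp qw(tempdir);

# Assumption: /dev/shm is a tmpfs mount; otherwise use the normal temp dir.
my $ram_base = (-d '/dev/shm' && -w '/dev/shm') ? '/dev/shm' : File::Spec->tmpdir;

my $build_dir  = tempdir(DIR => $ram_base, CLEANUP => 1);
my $final_dir  = tempdir(CLEANUP => 1);   # stand-in for the real on-disk location
my $build_path = "$build_dir/cache.db";
my $final_path = "$final_dir/cache.db";

# Hypothetical sample data in place of the real stored structure.
my %source = map { ("key$_" => "value$_") } 1 .. 1000;

# Build the whole database on the RAM-backed filesystem.
tie my %cache, 'DB_File', $build_path, O_RDWR | O_CREAT, 0644, $DB_HASH
    or die "Cannot tie $build_path: $!";
$cache{$_} = $source{$_} for keys %source;
untie %cache;                       # make sure everything is flushed first

# One sequential write to the slow disk; move() copies and deletes the
# original when the two paths are on different filesystems.
move($build_path, $final_path) or die "Move failed: $!";

# Reopen the moved file read-only as a sanity check.
tie my %check, 'DB_File', $final_path, O_RDONLY, 0644, $DB_HASH
    or die "Cannot reopen $final_path: $!";
my $sample = $check{key42};
untie %check;
```

The important detail is to untie (or otherwise close) the database before moving it, so that no buffered pages are left behind.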

      You could also have tried modifying the memory/cache limits for BerkeleyDB.
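As a sketch of what raising the cache limit might look like: with DB_File, a DB_File::HASHINFO object accepts a cachesize field (in bytes) that is passed through to Berkeley DB. The 32MB figure and the file path are arbitrary assumptions for illustration, not tuned values. The BerkeleyDB module offers a comparable -Cachesize option, if I recall its documentation correctly.

```perl
use strict;
use warnings;
use Fcntl;                          # O_RDWR, O_CREAT
use DB_File;
use File::Temp qw(tempdir);

# Hypothetical scratch location for the example database.
my $dir     = tempdir(CLEANUP => 1);
my $db_path = "$dir/cache.db";

# Ask Berkeley DB for a larger in-memory cache so that fewer pages
# have to be written out while the file is being built.
my $hash_info = DB_File::HASHINFO->new;
$hash_info->{cachesize} = 32 * 1024 * 1024;   # assumed value: 32MB

tie my %cache, 'DB_File', $db_path, O_RDWR | O_CREAT, 0644, $hash_info
    or die "Cannot tie $db_path: $!";

# Hypothetical sample data.
$cache{"key$_"} = "value$_" for 1 .. 1000;

my $sample = $cache{key7};
untie %cache;
```

If the cache is at least as large as the finished file, most of the build should happen in memory, which is roughly what the '/dev/shm' trick achieves by other means.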
