http://qs321.pair.com?node_id=612731


in reply to Re: Perl solution for storage of large number of small files
in thread Perl solution for storage of large number of small files

Actually I am thinking about completely switching over to the filesystem-only approach and stopping toying around with this data-buckets idea. BTW: What is the maximum number of files on my ext3?
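
For checking that limit: ext3 can hold at most as many files as it has inodes, and that number is fixed at mkfs time (also, one directory can hold at most ~32000 subdirectories). df -i shows the inode budget; a quick sketch from Perl, with the mount point as a placeholder:

    use strict;
    use warnings;

    # sketch: ask df for the inode budget of the storage mount
    # ($mount is a placeholder - point it at the real filesystem)
    my $mount = shift || '/';
    open my $df, '-|', 'df', '-i', $mount or die "can't run df -i: $!";
    <$df>;                                   # skip the header line
    my ($fs, $inodes, $used, $free) = split ' ', scalar <$df>;
    print "$fs: $used of $inodes inodes used, $free free\n";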

I've got a client-server architecture of scripts here - no emails, no IMAP server... It's a research project. (But the challenges seem similar - thanks for the hint!)
A data-generation script gathers measurement data and produces the 40K-120K packets of output, while a second script takes this output and makes sense of it, enriching the meta-data index as it goes. Both scripts are controlled by a single handler which keeps the meta-data index and stores the data packets (enabling us to run everything in clusters). And that handler is where the bottleneck is. So I am thinking about taking the storage part out of the handler and letting the data gatherer write to disk directly via NFS - roughly as in the sketch below.
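
That direct write could look something like this (all names are made up). The temp-file-plus-rename dance is there because a rename within one directory is atomic, so the sense-maker never sees a half-written packet:

    use strict;
    use warnings;
    use File::Path qw(mkpath);
    use File::Temp qw(tempfile);

    my $nfs_root = '/mnt/storage';             # hypothetical NFS mount

    # store one packet under its id (assumes ids are >= 4 chars);
    # the sense-maker later reads "$dir/$id" back
    sub store_packet {
        my ($id, $data) = @_;
        my ($a, $b) = $id =~ /^(..)(..)/;      # two subdir levels keep dirs small
        my $dir = "$nfs_root/$a/$b";
        mkpath($dir) unless -d $dir;
        my ($fh, $tmp) = tempfile(DIR => $dir);
        binmode $fh;
        print {$fh} $data       or die "write: $!";
        close $fh               or die "close: $!";
        rename $tmp, "$dir/$id" or die "rename: $!";
    }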

NFS was also the explanation for the "larger files are quicker" paradox: my development machine tied the hash over NFS, and that is what produced the effect. Actually running the script on the server shows that the tie is always fast. The insert is fast most of the time (although every few cycles, when DB_File expands the file or so, it slows down). But the untie takes forever on growing files...
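
For anyone who wants to reproduce the effect, this is roughly how to time the three phases (a sketch - record count and size are invented; run it once against local disk and once against the NFS mount):

    use strict;
    use warnings;
    use DB_File;
    use Fcntl qw(O_CREAT O_RDWR);
    use Time::HiRes qw(gettimeofday tv_interval);

    my $dbfile = shift || 'test.db';
    my %h;

    my $t0 = [gettimeofday];
    tie %h, 'DB_File', $dbfile, O_CREAT | O_RDWR, 0644, $DB_HASH
        or die "tie: $!";
    printf "tie:    %.3fs\n", tv_interval($t0);

    $t0 = [gettimeofday];
    $h{$_} = 'x' x 50_000 for 1 .. 1_000;    # ~50K values, like the packets
    printf "insert: %.3fs\n", tv_interval($t0);

    $t0 = [gettimeofday];
    untie %h;                                # the part that crawls on big files
    printf "untie:  %.3fs\n", tv_interval($t0);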

The expected access pattern is mostly plain storage (gatherer), followed by one read of every stored packet (sense-maker). Then, every few days, an update/rewrite of all packets involving possible resizing (gatherer again).

The "new toy" idea is now to use a set of disks tied together via NFS(distributed) or LVM(locally), mounted on subdirs building a tree of storage space leaves (replacing my few-files approach).