http://qs321.pair.com?node_id=710657

leocharre has asked for the wisdom of the Perl Monks concerning the following question:

I have a system that checks whether a file exists; if not, the file is created.
Easy enough.
Now, the file count will be at least 100,000 and could reach 3 million within 12 months. Every *filename* is an MD5 hex digest, so it is 32 characters long, each character being one of 16 possible hex values.
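For context, that check-and-create step boils down to something like this in Perl (a sketch; the digest filename and payload are made-up placeholders):

```perl
use strict;
use warnings;

# Hypothetical digest-named file in the store.
my $path = '0cc175b9c0f1b6a831c399e269772661';

unless (-e $path) {
    open my $fh, '>', $path or die "can't create $path: $!";
    print {$fh} "small text payload\n";
    close $fh or die "close of $path failed: $!";
}
```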

Space is not an issue here. These are small text files. I'm on GNU/Linux using ext3 partitions.

I'm considering whether I should have a hack to keep the per-directory file counts to a minimum.
For example, PAUSE does this with http://backpan.perl.org/authors/id/L/LE/LEOCHARRE/; notice the L/LE/LEOCHARRE nesting (yes, those are directories, not files, but let's be flexible here).

So if the file in question (which I will read or create) is named 'opuscows', it would really reside in either o/op/opuscows, or, more interestingly, op/us/cows; in the latter scheme the top directory holds at most 256 entries, and every level below it likewise at most 256 (16^2 combinations for each two-hex-char xx/ component).
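A sketch of that second scheme in Perl, splitting the first two 2-character chunks off the name (the function name and the fixed two-level depth are my own choices, not from the post):

```perl
use strict;
use warnings;
use File::Spec;

# Map a digest-style name to a nested path, as in op/us/cows.
sub fanout_path {
    my ($root, $name) = @_;
    my @dirs = (substr($name, 0, 2), substr($name, 2, 2));
    my $leaf = substr $name, 4;
    return File::Spec->catfile($root, @dirs, $leaf);
}

print fanout_path('store', 'opuscows'), "\n";  # store/op/us/cows on Unix
```

Creating the intermediate directories on first write is then a single File::Path::make_path call on the dirname before opening the file.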

This would keep each directory's entry count well below, say, 3 million.

This hack will slow down lookups and writes by a little bit.

But maybe this is not needed. I will not be searching for files or doing directory listing operations; the file is either there or not.

Is there a limit to how many files I should have in such a directory? I read that "There is a limit of 31998 sub-directories per one directory,..." - but that quote makes no mention of regular files.

Please excuse my broken up discussion.

update

After the discussion in this thread, I am using MySQL to serve the data instead of a regular filesystem.

I had some text entries that were larger than 1 MB, and this caused a problem for me at first: MySQL's default maximum packet size is 1 MB. You must raise max_allowed_packet in your MySQL config file. Likely in /etc/my.cnf, you would add a 'max_allowed_packet=5M' line (for example) and restart the server.
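For reference, the change looks like this in my.cnf (5M is just an example value; size it to your largest row plus some overhead):

```ini
[mysqld]
# allow client/server packets, and thus single column values, up to 5 MB
max_allowed_packet=5M
```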