http://qs321.pair.com?node_id=710891


in reply to (OT) should i limit number of files in a directory

Hi Leo,

I would use a simple n-level scheme: a root directory with subdirectories a-z, each of which again contains directories a-z (repeated n times), plus an index (a plain file or a dbm). As many have noted, such a scheme turns the usual linear search through one large directory into something closer to a tree search, since each level narrows the lookup to one of 26 subdirectories.
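To make the layout concrete, here is a minimal sketch in Python that builds such a tree. The function name `make_tree` and its parameters are made up for illustration. Note that the number of leaf directories grows as 26^n, so n = 2 gives 676 leaves and n = 3 already gives 17576.

```python
import itertools
import os
import string


def make_tree(root, depth):
    """Create root/<a-z>/<a-z>/... with `depth` levels; return the leaf count.

    `root` and `depth` are illustrative parameters, not part of any
    existing tool.
    """
    count = 0
    # product() yields ('a', 'a'), ('a', 'b'), ..., ('z', 'z') for depth 2
    for parts in itertools.product(string.ascii_lowercase, repeat=depth):
        os.makedirs(os.path.join(root, *parts), exist_ok=True)
        count += 1
    return count
```

Creating the directories up front is only one option; they could just as well be created lazily the first time a file is written into them.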

An iterator would hand out the directory part (a/a/b, a/a/c, ..., a/a/z, a/b/a, ...) and start over when the list is exhausted. This makes it easier to keep the entries (almost) evenly distributed, especially if several processes are writing concurrently to your (virtual) filesystem.
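A rough sketch of such a round-robin iterator, again in Python. The name `dir_parts` is hypothetical, and this version starts at a/a/a rather than a/a/b. It assumes each writer holds its own iterator; processes that must share one position would need some shared counter, which is not shown here.

```python
import itertools
import string


def dir_parts(depth):
    """Yield 'a/a/a', 'a/a/b', ..., 'z/z/z' and then start over, forever."""
    leaves = [
        "/".join(parts)
        for parts in itertools.product(string.ascii_lowercase, repeat=depth)
    ]
    # cycle() restarts from the first leaf once the list is exhausted
    return itertools.cycle(leaves)
```

Each call to `next()` on the returned iterator gives the directory part for the next file to be stored.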

The index key would be the name of the file. Additional meta-info can be attached easily, for example:

  • key 7644ebf125065a6c220dcd35b5190e57 => a/a/b/7644ebf125065a6c220dcd35b5190e57, a/a/b/7644ebf125065a6c220dcd35b5190e57.1st_pass, etc
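The index entry above could be kept roughly like this. It is a sketch under assumptions: `register` and `lookup` are invented names, the list of paths is stored as a JSON string, and `index` is any mapping (a plain dict here).

```python
import json


def register(index, dir_part, name, suffixes=("",)):
    """Record every path that belongs to `name` under its key in the index."""
    paths = [f"{dir_part}/{name}{suffix}" for suffix in suffixes]
    # Stored as a JSON string so the value fits a string-only store
    index[name] = json.dumps(paths)
    return paths


def lookup(index, name):
    """Return the list of paths recorded for `name`."""
    return json.loads(index[name])


index = {}
register(index, "a/a/b", "7644ebf125065a6c220dcd35b5190e57", ("", ".1st_pass"))
```

An object returned by Python's `dbm.open("index", "c")` should be usable in place of the dict, since it also maps string keys to stored values; that substitution is an assumption of this sketch rather than something tested here.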

cheers --stephan