http://qs321.pair.com?node_id=612795


in reply to Perl solution for storage of large number of small files

I generate an MD5 hash from the filename and use that to create a file hierarchy. I split the hash into chunks, which I then use to create subdirectories. This keeps the number of entries per level within filesystem limits, and lookup stays relatively fast. The distribution properties of MD5 guarantee a good spread among buckets.

For example, a filename of "foo12345.txt" might map to "/4d9/a43/9b6/eb3/f21/600/448/a03/b78/5a3/17". Each directory then contains at most 4096 subdirectories (16^3 for three hex characters), which is well within the limits of ext3. By varying where you place the slashes, you control the breadth and depth of your directory tree.

I personally use a mapping of "/11/222/333/rest_of_hash" to archive (currently) over 10 million files. That scheme allows for 256*4096*4096 buckets, which is quite enough for my purposes. Speed and server load are excellent, even under highly concurrent usage.
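
For illustration, here is a minimal Perl sketch of that "/11/222/333/rest_of_hash" mapping; the bucket_path name, the /var/archive root, and the use of File::Path are illustrative choices rather than anything prescribed above:

#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5 qw( md5_hex );
use File::Path  qw( make_path );

# Split the hex md5 of the filename into 2-, 3- and 3-char chunks:
# 256 dirs at the first level, 4096 at each of the next two, with the
# remaining 24 hash characters used as the leaf name.
sub bucket_path {
    my ( $root, $filename ) = @_;
    my $hash = md5_hex( $filename );
    my ( $a, $b, $c, $rest ) = unpack 'A2 A3 A3 A*', $hash;
    return ( "$root/$a/$b/$c", $rest );
}

my ( $dir, $leaf ) = bucket_path( '/var/archive', 'foo12345.txt' );
make_path( $dir );                  # create the intermediate directories
print "store foo12345.txt as $dir/$leaf\n";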


Replies are listed 'Best First'.
Re^2: Perl solution for storage of large number of small files
by isync (Hermit) on Apr 30, 2007 at 16:24 UTC
    Seems like exactly what I do now (after switching back to filesystem)!

    And the buckets? Are you talking about DB_Files you store away in this subdir structure (effectively thousands of small DBs, into which you put foo12345.txt)? Or are you just spreading the files by the hash, storing them as /hash/path/to/file/foo12345.txt?

      I concur with rhesa's method. I've used this before with considerable success.

      I just untarred a test structure containing a million files distributed this way, using 3 levels of subdirectory to give an average of ~250 files per directory. I then ran a quick test of opening and reading 10,000 files at random and got an average time of ~12 ms to locate, open, read, and close each file.

      #! perl -slw
      use strict;
      use Math::Random::MT qw[ rand ];
      use Digest::MD5 qw[ md5_hex ];
      use Benchmark::Timer;

      our $SAMPLE ||= 1000;

      my $T = new Benchmark::Timer;

      for my $i ( 1 .. $SAMPLE ) {
          $T->start( 'encode/open/read/close' );
          my $md5 = md5_hex( int( rand 1e6 ) );
          my( $a, $b, $c ) = unpack 'AAA', $md5;

          $T->start( 'open' );
          open my $fh, '<', "fs/$a/$b/$c/$md5.dat" or warn "fs/$a/$b/$c/$md5 : $!";
          $T->stop( 'open' );

          $T->start( 'read' );
          my $data = do{ local $/; <$fh> };
          $T->stop( 'read' );

          $T->start( 'close' );
          close $fh;
          $T->stop( 'close' );

          $T->stop( 'encode/open/read/close' );
      }

      $T->report;

      __END__
      c:\test>612729-r -SAMPLE=10000
      10000 trials of encode/open/read/close (112.397s total), 11.240ms/trial
      10000 trials of open (110.562s total), 11.056ms/trial
      10000 trials of read (158.554ms total), 15us/trial
      10000 trials of close (365.520ms total), 36us/trial

      The files in this case are all 4 KB, but that doesn't affect your seek time. If you envisage needing to deal with much more than 1 million files, moving to 4 levels of hierarchy (16^4 = 65,536 leaf directories) would distribute the million files at just ~15 per directory.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
      Ah, sorry about that: my use of the word "bucket" was a bit sloppy. I just store the file (I deal with images, mostly).
Re^2: Perl solution for storage of large number of small files
by MidLifeXis (Monsignor) on May 02, 2007 at 18:41 UTC

    Be aware of where your filesystem slows down. There is often a limit on the number of files that you can store in a directory before access starts getting slow. Adding another level of directories can be faster than adding more directories to the current one.

    While old, this explains my ramblings from above.

    --MidLifeXis

Re^2: Perl solution for storage of large number of small files
by 2xlp (Sexton) on May 03, 2007 at 02:37 UTC
    I do *almost* the same exact thing. For a similarly large (and growing) archive, I do:
    /123/456/789/123456789...32 or /123/456/789/entire-hash
    Agreed on the notion that "The distribution properties of MD5 guarantee a good spread among buckets." You should never do this with plain numeric keys, but if you must, you can reverse the digits to get a decent distribution (i.e. 12345 -> 54321).

    Another difference in what I do: I name the file after its md5 and store the original file name in Postgres. That gives me a speedup on file serving. It's also relatively simple to create rewrite rules under Apache and nginx to serve these files directly.
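
    A rough sketch of that setup in Perl, assuming DBI/DBD::Pg and a hypothetical "files" table; the archive_file name, the DSN, and the column names are placeholders, and the digest here is taken over the file contents:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use DBI;
    use Digest::MD5;
    use File::Copy qw( copy );
    use File::Path qw( make_path );

    # Placeholder connection; the DSN and the "files" table are illustrative.
    my $dbh = DBI->connect( 'dbi:Pg:dbname=archive', '', '', { RaiseError => 1 } );

    # Hash the file contents, store the file under its md5 in a
    # /123/456/789/ style tree, and record the original name in Postgres.
    sub archive_file {
        my ( $root, $orig_name ) = @_;

        open my $in, '<', $orig_name or die "$orig_name: $!";
        binmode $in;
        my $md5 = Digest::MD5->new->addfile( $in )->hexdigest;

        my ( $a, $b, $c ) = unpack 'A3 A3 A3', $md5;
        my $dir = "$root/$a/$b/$c";
        make_path( $dir );
        copy( $orig_name, "$dir/$md5" ) or die "copy failed: $!";

        $dbh->do( 'INSERT INTO files (md5, original_name) VALUES (?, ?)',
                  undef, $md5, $orig_name );
        return "$dir/$md5";
    }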
Re^2: Perl solution for storage of large number of small files
by 2xlp (Sexton) on May 06, 2007 at 02:18 UTC

    I forgot to add...

    The 'best' way I discovered to store info is to use a base32 encoded digest of the file.

    Why? Well, most filesystems start to have issues at some point after 1000 items per directory. Using a base32 representation, you can hit the sweet spot.

    With a base32 formula and 2 chars per bin, you'll have 1024 bins per depth (32*32). Using the standard hex-based md5 representation, your options are either 256 buckets per depth (16*16, two chars) or 4096 (16*16*16, three chars).

    In my personal use, I haven't seen all that much of a difference between 1024 and 4096 buckets -- though I've seen a slight difference. It's not as drastic as the performance gap between either and 10k, though.

    Since I'm lazy and I don't have high performance constraints, I just go with 2 levels deep of 4096 hashing. But if I had more time, I'd definitely go with 3 levels of 1024 hashing.

    (The time/laziness is a factor because every language supports md5 as base16 or base64 by default -- no one has a base32 default, which is a PITA if you're managing an architecture accessed by multiple languages at once.)
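
    As a rough illustration of the 2-chars-per-bin idea, here is a sketch using MIME::Base32 from CPAN; the base32_path name, the /var/archive root, and the choice of three levels are illustrative:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Digest::MD5;
    use MIME::Base32 qw( encode_base32 );   # base32 isn't core, so pull it from CPAN

    # Base32-encode the md5 of the file contents and split off 2-char bins:
    # each level then holds at most 32*32 = 1024 subdirectories.
    sub base32_path {
        my ( $root, $file ) = @_;
        open my $fh, '<', $file or die "$file: $!";
        binmode $fh;
        my $digest = encode_base32( Digest::MD5->new->addfile( $fh )->digest );
        my ( $a, $b, $c ) = unpack 'A2 A2 A2', $digest;   # three levels of 1024 bins
        return "$root/$a/$b/$c/$digest";
    }

    print base32_path( '/var/archive', 'foo12345.txt' ), "\n";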