in reply to Perl solution for storage of large number of small files

I'm also curious about why you don't want to use a database. Is it because you want to be absolutely sure of when data is safely stored on the disk, or because it is faster this way? IIRC Oracle was originally designed to take advantage of the physical layout of the disk, though that's perhaps less important these days. Did you try dumping this into MySQL or PostgreSQL and dislike the result for some reason? Have you tried Sleepycat's Berkeley DB (BDB)?
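
If BDB is untried, a minimal sketch with the DB_File module would look something like this (the file name and key are my own inventions):

    use strict;
    use warnings;
    use DB_File;
    use Fcntl;   # for O_CREAT and O_RDWR

    # Tie a hash to a Berkeley DB B-tree file; reads and writes on
    # %store go through the BDB library instead of one file per record.
    tie my %store, 'DB_File', 'small_files.db', O_CREAT | O_RDWR, 0644, $DB_BTREE
        or die "Cannot tie small_files.db: $!";

    $store{'some/key'} = 'payload for one small file';   # store a record
    print $store{'some/key'}, "\n";                      # fetch it back

    untie %store;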

Incidentally, the InnoDB performance tuning tips note:

Wrap several modifications into one transaction. InnoDB must flush the log to disk at each transaction commit if that transaction made modifications to the database. The rotation speed of a disk is typically at most 167 revolutions/second, which constrains the number of commits to at most 167 per second if the disk does not fool the operating system.
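
With DBI that batching is just a matter of switching off AutoCommit and committing once per batch rather than once per row. A sketch, with an invented table and invented records:

    use strict;
    use warnings;
    use DBI;

    # AutoCommit off: every execute() joins one open transaction.
    my $dbh = DBI->connect('dbi:mysql:database=test', 'user', 'password',
                           { AutoCommit => 0, RaiseError => 1 });

    my @records = map { { name => "file$_", data => "..." } } 1 .. 1000;
    my $sth = $dbh->prepare('INSERT INTO blobs (name, data) VALUES (?, ?)');

    eval {
        $sth->execute($_->{name}, $_->{data}) for @records;
        $dbh->commit;   # one log flush for all 1000 rows, not 1000 flushes
    };
    if ($@) {
        $dbh->rollback;
        die $@;
    }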

So one disk rotation is a 6 ms minimum right there (1/167 of a second is about 6 ms). Are you spreading your tied files across several disks? Do you require every write to be physically on disk immediately, or can you wait a second or so?
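
If you can live with losing up to about the last second of commits on a crash, InnoDB lets you relax the per-commit flush. A sketch (the connection details are placeholders, and the SET needs the SUPER privilege):

    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:mysql:database=test', 'user', 'password',
                           { RaiseError => 1 });

    # 2 = write the log at each commit but fsync it only about once a
    # second, so a crash can lose roughly the last second of transactions.
    $dbh->do('SET GLOBAL innodb_flush_log_at_trx_commit = 2');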

Oh, the other thing: if you have disk to burn, you could increase your inode size (on XFS) or mirror your disks for speed. But regardless, moving to a database implementation now, rather than waiting for things to explode, seems like a good idea. I don't suppose your system could do locking to handle multiple writers, could it? Perhaps more info about what you are actually trying to do would be useful.
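
For the locking question, the usual Perl idiom is an exclusive flock on a sidecar lock file. A sketch, with an invented lock path:

    use strict;
    use warnings;
    use Fcntl qw(:flock);

    # Serialize writers on one lock file; the lock is released when the
    # handle is closed, or automatically if the process dies.
    open my $lock, '>', '/tmp/smallfiles.lock'
        or die "Cannot open lock file: $!";
    flock($lock, LOCK_EX) or die "Cannot get exclusive lock: $!";

    # ... update the tied file here ...

    flock($lock, LOCK_UN);
    close $lock;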

Also, I was thinking about a presentation, at YAPC::Asia I think it was, about how a large service was built on Perl; Livedoor or Mixi, anyway. They split their indices and tables across different servers (using the first characters of user names, IIRC) and built a system capable of easily repartitioning that layout as the user base grows.
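
That partitioning trick is easy to sketch; this is only my guess at the idea, with invented server names, not their actual scheme:

    use strict;
    use warnings;

    # Pick a shard from the first character of the user name. Repartitioning
    # as users grow means changing this mapping and migrating the rows.
    my @shards = ('db1.example.com', 'db2.example.com', 'db3.example.com');

    sub shard_for {
        my ($username) = @_;
        my $c = ord(lc substr($username, 0, 1));
        return $shards[$c % @shards];
    }

    print shard_for('merlyn'), "\n";   # prints db2.example.com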