http://qs321.pair.com?node_id=1000838


in reply to Re^2: Processing ~1 Trillion records
in thread Processing ~1 Trillion records

You seem to be accumulating lots of data in the hashes. Are you sure it all fits in memory? As soon as you force the computer to swap memory pages to disk, the processing time grows insanely!

It might help to tie the hashes to a DBM file (DB_File, MLDBM, ...) or use SQLite or some other database to hold the temporary data. Doing as much work as you can upfront in the Oracle database would most probably be even better, though. Sometimes a use DB_File; tie %data, 'DB_File', 'filename.db'; is all you need to change something from unacceptably slow to just fine.
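To make the one-liner concrete, here is a minimal sketch of a hash tied to a DB_File database. The file name, the sample keys, and the counting loop are only illustrative assumptions, not the original poster's code; the point is that the rest of the program keeps using %data like an ordinary hash while the contents live on disk.

```perl
use strict;
use warnings;
use Fcntl;      # provides O_RDWR and O_CREAT
use DB_File;    # exports $DB_HASH

my $file = 'filename.db';   # illustrative name, as in the one-liner above
unlink $file;               # start from an empty file for this demo

# Tie the hash to an on-disk Berkeley DB hash file.
my %data;
tie %data, 'DB_File', $file, O_RDWR|O_CREAT, 0666, $DB_HASH
    or die "Cannot tie $file: $!";

# From here on %data is used like a normal hash,
# but every read and write goes through the DBM file.
$data{$_}++ for qw(apple pear apple);   # hypothetical sample keys

print "$_ => $data{$_}\n" for sort keys %data;

untie %data;    # flush and close the file
```

Keep in mind that DB_File stores only flat strings as values. If the hashes hold nested structures (hashes of arrays and so on), that is where MLDBM or a real database comes in.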

Jenda
Enoch was right!
Enjoy the last years of Rome.

Re^4: Processing ~1 Trillion records
by aossama (Acolyte) on Oct 25, 2012 at 12:36 UTC
    Is this like using Redis to store/retrieve the key-value pairs?

      Yes. You can use Redis itself; it seems to have a Perl binding. The whole point is to make sure the process fits in memory and that the data which has to be moved to disk is accessed and updated efficiently.

      Jenda