http://qs321.pair.com?node_id=913683


in reply to renaming 1000's of FASTA files

If the total data will comfortably fit in memory all at once, then you can use a hash as you are doing now. Otherwise, sort both files on the same key and write code that walks through the two sorted files in parallel, comparing them. Or ... put the data into an SQLite database, which is a single ordinary file requiring no server at all, and use queries.
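A minimal sketch of the sorted-files idea, assuming one key per line (a sequence ID, say) and that both files were sorted in plain string order. The sub name common_keys and the sample IDs are made up for illustration. In-memory filehandles stand in for the real files so the snippet runs on its own.

```perl
use strict;
use warnings;

# Walk two filehandles whose lines are sorted in the same string order
# and return the keys that appear in both. Only one line from each file
# is held in memory at a time (plus the list of matches).
sub common_keys {
    my ($fh_a, $fh_b) = @_;
    my @common;
    my $line_a = <$fh_a>;
    my $line_b = <$fh_b>;
    while (defined $line_a && defined $line_b) {
        chomp(my $key_a = $line_a);
        chomp(my $key_b = $line_b);
        my $cmp = $key_a cmp $key_b;
        if ($cmp < 0) {
            $line_a = <$fh_a>;      # key only in the first file so far
        }
        elsif ($cmp > 0) {
            $line_b = <$fh_b>;      # key only in the second file so far
        }
        else {
            push @common, $key_a;   # key present in both files
            $line_a = <$fh_a>;
            $line_b = <$fh_b>;
        }
    }
    return \@common;
}

# Hypothetical sample data, sorted with Perl's default string sort.
my $ids_a = join '', map { "$_\n" } sort qw(seq3 seq1 seq7);
my $ids_b = join '', map { "$_\n" } sort qw(seq7 seq2 seq3);

open my $fh_a, '<', \$ids_a or die "cannot open in-memory file: $!";
open my $fh_b, '<', \$ids_b or die "cannot open in-memory file: $!";

my $common = common_keys($fh_a, $fh_b);
print "@$common\n";    # prints "seq3 seq7"
```

If the files are sorted with an external sort program, its ordering has to agree with Perl's cmp; running it as LC_ALL=C sort is the usual way to get plain byte order.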

Re^2: renaming 1000's of FASTA files
by garyboyd (Acolyte) on Jul 11, 2011 at 15:00 UTC

    Thanks, Anonymous Monk. Would using index instead of the regex speed things up noticeably? I decided to split the input file and run multiple processes.