http://qs321.pair.com?node_id=1204602


in reply to Loading only portions of a file at any given time

What TKCooper said, but ... you could also do something more complex: build an "index", store it in a separate file, and then use that index to decide which parts of the data file to load, and when. But the cost of that approach (the time and complexity of developing the code that reads via the index, plus maintaining it) may outweigh the benefit you gain. Though, if this is a fairly common thing to want, maybe everyone who uses FASTA could benefit from a module that does this indexing? I dunno.

Re^2: Loading only portions of a file at any given time
by Anonymous Monk on Nov 30, 2017 at 16:18 UTC
    Actually, a compromise suggestion might not be so bad ... Since the file is sorted, you would only need to read it sequentially once and note the byte offset (the position you would later pass to seek()) where each sequence begins. (Each sequence ends where the next one begins.) You might even be able to keep that data in memory. Or, write it to a separate index file that each script loads. Each script could then seek directly to the right spot, read the specified number of bytes, and do a couple of quick sanity checks to make sure that what it just read looks okay (i.e. that the index file is not out of date). Simple and easy to do, and it just might save a lot of time. (If the region is very large, consider also File::Map.)
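
    A rough sketch of that idea, written in Python rather than Perl purely for illustration. The names build_index, save_index, load_index and read_record are made up for this example, and it assumes plain FASTA where every record starts with a ">" header line and the first word of the header is a unique ID. Treat it as a starting point, not a finished module.

```python
import json

def build_index(fasta_path):
    """Scan the file once; map each sequence ID to [byte offset, byte length]."""
    index = {}
    current_id, start = None, 0
    # Binary mode, so tell() reports true byte offsets usable with seek().
    with open(fasta_path, "rb") as fh:
        while True:
            pos = fh.tell()
            line = fh.readline()
            if not line or line.startswith(b">"):
                # A record ends where the next one begins (or at end of file).
                if current_id is not None:
                    index[current_id] = [start, pos - start]
                if not line:
                    break
                words = line[1:].split()
                # Assumption: the first word of the header is the unique ID.
                current_id = words[0].decode("ascii") if words else ""
                start = pos
    return index

def save_index(index, index_path):
    """Write the index to a separate file that other scripts can load."""
    with open(index_path, "w") as fh:
        json.dump(index, fh)

def load_index(index_path):
    with open(index_path) as fh:
        return json.load(fh)

def read_record(fasta_path, index, seq_id):
    """Seek straight to one record and read only its bytes."""
    offset, length = index[seq_id]
    with open(fasta_path, "rb") as fh:
        fh.seek(offset)
        chunk = fh.read(length)
    # Quick sanity check: if the header is not the one we expect,
    # the index is probably out of date.
    header = chunk.split(b"\n", 1)[0]
    if not header.startswith(b">") or header[1:].split()[:1] != [seq_id.encode("ascii")]:
        raise ValueError("index looks stale for " + seq_id)
    return chunk.decode("ascii")
```

    Build the index once with build_index, store it with save_index, and have each script call load_index followed by read_record for just the sequences it needs. The index has to be rebuilt whenever the data file changes, which is what the header check is meant to catch.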