Re^2: 15 billion row text file and row deletes - Best Practice?

by tubaandy (Deacon)
on Dec 01, 2006 at 15:42 UTC ( [id://587215] )


in reply to Re: 15 billion row text file and row deletes - Best Practice?
in thread 15 billion row text file and row deletes - Best Practice?

Alex has a good point: you could put together a script to grab chunks of the big original file (say, 1-million-line chunks) and write each chunk to a temp file. Then follow the method where you read the deletes into a hash, parse through the temp file, and append the good lines to the final file. This way, you'll have four files at any one time: the original file, the delete file, the chunk temp file, and the final file. Then again, you'll be butting up against your disk space limit...
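For what it's worth, here's a minimal sketch of that flow in Perl. It assumes the delete file holds one complete line per record to drop, matched verbatim against lines of the big file, and all the filenames are just placeholders:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Hypothetical filenames -- adjust to your setup.
    my ( $orig, $dels, $temp, $final ) =
        ( 'original.txt', 'deletes.txt', 'chunk.tmp', 'final.txt' );

    # Read the deletes into a hash. Assumes one full line per
    # record to delete, matched verbatim against the big file.
    my %delete;
    open my $dfh, '<', $dels or die "Can't open $dels: $!";
    while ( my $line = <$dfh> ) {
        chomp $line;
        $delete{$line} = 1;
    }
    close $dfh;

    open my $in,  '<',  $orig  or die "Can't open $orig: $!";
    open my $out, '>>', $final or die "Can't open $final: $!";

    my $chunk_size = 1_000_000;    # lines per chunk

    while (1) {
        # Grab the next chunk of the original, write it to the temp file.
        open my $tmp, '>', $temp or die "Can't write $temp: $!";
        my $count = 0;
        while ( $count < $chunk_size ) {
            my $line = <$in>;
            last unless defined $line;
            print {$tmp} $line;
            $count++;
        }
        close $tmp;
        last unless $count;    # original file exhausted

        # Parse the temp file, appending the good lines to the final file.
        open my $chunk, '<', $temp or die "Can't read $temp: $!";
        while ( my $line = <$chunk> ) {
            chomp( my $key = $line );
            print {$out} $line unless exists $delete{$key};
        }
        close $chunk;
    }

    close $in;
    close $out;
    unlink $temp;

Note the temp file is really just a checkpoint between passes; if you don't need to restart mid-run, you could filter each chunk straight from the original and skip the temp file (and its disk cost) entirely.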

Anyway, just a thought.

tubaandy