PerlMonks
Re: Removing repeated lines from file
by ant9000 (Monk) on Jun 24, 2003 at 11:47 UTC ( [id://268461]=note )
If you can keep track of the lines already read, it's trivial. If that's too big for your memory to hold (is it, really?), you could try to compute a unique signature for each line and save that instead. What about an MD5 hash of it? That's 32 bytes per input line as a hex digest (16 if you store the raw binary digest), so it could be a good starting point. Beware, MD5 is not that fast if you have billions of lines!
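As a rough sketch of both ideas (not necessarily the code originally posted), something like the following might do. The sub names `dedupe_lines` and `dedupe_lines_md5` are made up for illustration. The second variant assumes the core `Digest::MD5` module and uses its `md5` function to store the 16-byte raw digest as the hash key. It also assumes you can live with the theoretical chance of two different lines sharing a digest, in which case the later one would be dropped.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);

# Hypothetical helper: copy lines from $in to $out, printing each
# distinct line only the first time it appears. Every distinct line
# is kept in memory as a hash key.
sub dedupe_lines {
    my ($in, $out) = @_;
    my %seen;
    while (my $line = <$in>) {
        print {$out} $line unless $seen{$line}++;
    }
    return;
}

# Hypothetical variant: same logic, but the hash key is the 16-byte
# raw MD5 digest of the line rather than the line itself, so memory
# use per distinct line no longer depends on the line length.
sub dedupe_lines_md5 {
    my ($in, $out) = @_;
    my %seen;
    while (my $line = <$in>) {
        print {$out} $line unless $seen{ md5($line) }++;
    }
    return;
}
```

Calling `dedupe_lines(\*STDIN, \*STDOUT)` would turn this into a simple filter. The digest variant only pays off when lines are noticeably longer than 16 bytes; for short lines the plain hash is both smaller and faster.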