in reply to (tye)Re: Sorting data that don't fit in memory
in thread Sorting data that don't fit in memory
I'm impressed again. I was just installing BerkeleyDB according to tilly's suggestion. That was a real nice one, as well. I like using existing modules, as opposed to some others@pm. However, your code seems to be exactly what I wanted.
Well, the size of the string $sorton shouldn't be much of a problem, it only uses (in my case) 2*13M = 26M of memory. Small offer, here.
I did a check whether my local 'sort' (qsort, apparently) messed with the order of equal keys:
So, it doesn't. I'm very happy with that. This, because the remainder of the string is some code for the time, stored in something like 100us accuray, with a max of 2-3 days. You don't want to sort on that, but you also don't want to change the order. I hope this explains my reluctancy to mix it up.#!/usr/bin/perl @data=qw(bb bbZZ aaZZ aaSD aaPM aaAA aa); print join " ", sort { substr($a,0,2) cmp substr($b,0,2) } @data; #Result: aaZZ aaSD aaPM aaAA aa bb bbZZ
I guess the key notion here is the collection of my keys in a string, saving the array-overhead.
Thanks a lot,
Jeroen
"We are not alone"(FZ)
Update I'm afraid I was too fast with my happiness. The real memory hog lies in the 0..($idx-1). That gives you 13M item array, that won't fit in memory. Too bad.