http://qs321.pair.com?node_id=54011


in reply to (tye)Re: Sorting data that don't fit in memory
in thread Sorting data that don't fit in memory

Read the note at the bottom

I'm impressed again. I was just installing BerkeleyDB, following tilly's suggestion, which was a really nice one as well. I like using existing modules, unlike some others@pm. However, your code seems to be exactly what I wanted.

Well, the size of the string $sorton shouldn't be much of a problem: in my case it only takes 2*13M = 26M of memory (2 bytes of key for each of 13M records). A small price to pay.

I did a check whether my local 'sort' (qsort, apparently) messed with the order of equal keys:

    #!/usr/bin/perl
    @data = qw(bb bbZZ aaZZ aaSD aaPM aaAA aa);
    print join " ", sort { substr($a,0,2) cmp substr($b,0,2) } @data;
    # Result: aaZZ aaSD aaPM aaAA aa bb bbZZ

So, at least on this small test, it doesn't, and I'm very happy with that. The remainder of each string is a code for the time, stored with something like 100 µs accuracy and a maximum of 2-3 days. You don't want to sort on that, but you also don't want to change the order. I hope this explains my reluctance to mix it up.
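
A seven-element test cannot show that equal keys keep their order on every input, so here is a minimal sketch of one way to force it: sort the positions instead of the values, and fall back to the original position when two keys compare equal. The names @order and @sorted are my own, and this is not the code from the post being replied to.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @data = qw(bb bbZZ aaZZ aaSD aaPM aaAA aa);

# Sort positions rather than values: compare on the two-character key,
# and use the original position as a tie-breaker when the keys are equal.
my @order = sort {
    substr($data[$a], 0, 2) cmp substr($data[$b], 0, 2)
        or $a <=> $b
} 0 .. $#data;

# Pull the records out in the sorted order.
my @sorted = @data[@order];
print join(" ", @sorted), "\n";
```

The tie-breaker makes the result independent of whatever algorithm the local sort uses. It does need a list of positions as long as the data, so it suits a test like this one better than 13M records.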

I guess the key idea here is collecting all my keys in a single string, which saves the per-element array overhead.
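
To make that idea concrete, here is a rough sketch of fixed-width keys packed into one string and read back with substr. Only the name $sorton comes from the discussion. The sample records, $KEYLEN and key_at are made up for illustration, and the real code may well differ.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $KEYLEN  = 2;                                  # assumed fixed key width in bytes
my @records = qw(bb1234 aa0007 bb0001 aa9999);    # made-up sample records

# Append each record's key to one long string instead of pushing it onto an array.
my $sorton = '';
$sorton .= substr($_, 0, $KEYLEN) for @records;

# Key number $i lives at byte offset $i * $KEYLEN in the packed string.
sub key_at {
    my ($i) = @_;
    return substr($sorton, $i * $KEYLEN, $KEYLEN);
}

printf "%d keys in %d bytes\n", scalar(@records), length $sorton;
print key_at(2), "\n";
```

With 13M records this string is the 26M mentioned above, whereas an array of 13M separate scalars carries extra bookkeeping for every element.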

Thanks a lot,

Jeroen
"We are not alone"(FZ)

Update: I'm afraid I was happy too soon. The real memory hog is the 0..($idx-1) list. That gives you a 13M-item array, which won't fit in memory. Too bad.