trying pack('s', $number) way now...
That won't work. The pack template 's' is a signed short, which only handles numbers from -32768 through +32767.
If your 10-digit numbers can go all the way up to 9,999,999,999, then the smallest pack template you could use is 'Q' -- if your perl supports 64-bit integers -- and that will produce 8-byte values, which would only save you 20% over the 10 bytes of digits.
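To make the size comparison concrete, here is a small sketch that prints how many bytes a few templates produce and shows what 's' does with a 10-digit value. The sample value is made up for illustration, and 'Q' is left out because it dies on a perl built without 64-bit integer support.

```perl
use strict;
use warnings;

# Bytes produced per value by each template.
my %bytes = map { $_ => length pack( $_, 0 ) } qw( s V N );
printf "%s => %d bytes\n", $_, $bytes{$_} for sort keys %bytes;

# A hypothetical 10-digit value that still fits in 32 bits.
my $number = 4_123_456_789;

# 'V' round-trips the value; 's' cannot hold it, so what comes back is different.
my $via_V = unpack( 'V', pack( 'V', $number ) );
my $via_s = do { no warnings; unpack( 's', pack( 's', $number ) ) };

print "V: $via_V\n";
print "s: $via_s\n";
```

Anything that comes back from 's' is confined to the signed 16-bit range, so it cannot be the original number.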
Besides which, if you store your packed numbers individually in an array, you won't save much memory, as the size of each element's data is insignificant compared to the internal structures that hold it.
But if, instead, you stored the packed integers in a single string and then shuffled the string, you would avoid the storage overhead of the array.
If your largest number is less than 4,294,967,296, then you could get away with 'V' (or 'N'), which are 4 bytes each. That would reduce the memory requirement to around 430MB, which is doable.
You could then shuffle that using:
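#! perl -slw
use strict;
use Time::HiRes qw[ time ];

my $t = time;

## Pre-allocate a single string: 4 bytes for each of the 112 million numbers
my $s = chr(0);
$s x= 112e6 * 4;

## Read one number per line and pack it into its 4-byte slot
substr( $s, $_ * 4, 4 ) = pack 'V', scalar <> for 0 .. 112e6-1;

my $n = length( $s ) / 4;

## Fisher-Yates shuffle of the 4-byte records, in place
for( 0 .. $n - 1 ) {
    ## int() keeps $p a whole record index, so $p*4 stays 4-byte aligned
    my $p = $_ + int( rand( $n - $_ ) );
    my $x = substr( $s, $_*4, 4 );
    substr( $s, $_*4, 4 ) = substr( $s, $p*4, 4 );
    substr( $s, $p*4, 4 ) = $x;
}

## Indices run 0 .. $n-1; going up to $n would read past the end of the string
print unpack 'V', substr( $s, $_ * 4, 4 ) for 0 .. $n - 1;
print STDERR time - $t;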
It'll take about 10 minutes to shuffle your 112 million numbers, which should be quick enough if this is a one-off task.
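If you want to convince yourself that an in-string shuffle like this loses nothing, a scaled-down sketch along the following lines may help. The helper name shuffle_packed() and the 1000 sample values are made up for illustration; the idea is simply to pack a known list, shuffle the string, and check that the same values come back out.

```perl
use strict;
use warnings;

# Shuffle, in place, a string of fixed-width records (Fisher-Yates).
# shuffle_packed() is a hypothetical helper name used only for this sketch.
sub shuffle_packed {
    my( $ref, $width ) = @_;
    my $n = length( $$ref ) / $width;
    for my $i ( 0 .. $n - 1 ) {
        my $p = $i + int( rand( $n - $i ) );    # whole index, $i <= $p < $n
        my $x = substr( $$ref, $i * $width, $width );
        substr( $$ref, $i * $width, $width ) = substr( $$ref, $p * $width, $width );
        substr( $$ref, $p * $width, $width ) = $x;
    }
    return;
}

my @in = map { 1_000_000_000 + $_ } 1 .. 1000;    # made-up sample data
my $s  = pack 'V*', @in;
shuffle_packed( \$s, 4 );
my @out = unpack 'V*', $s;

# Same number of values, and the same values once sorted back into order.
print "count ok\n"  if @out == @in;
print "values ok\n" if join( ',', sort { $a <=> $b } @out ) eq join( ',', @in );
```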
Perhaps, if your need for true randomness isn't too crucial, the simplest way to randomise your list would be to sort it with an external sort utility, keyed on an offset other than the first digit. That would give you 9 'random' possibilities.
If that's not sufficient, you could pick two offset/lengths:
sort -k 8,9 -k 3,4 in > out
which gives you many possibilities, especially once you realise that your keys can be of variable length and can even overlap.
The downside is that, unless your sort is substantially quicker than mine, it'll take well over half an hour. But again, that might be okay if it is an infrequent requirement.
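To see what ordering by interior digits does, here is a small in-memory sketch of the same idea in Perl. It assumes the keys are meant as character positions within each 10-digit line (digits 8-9, then digits 3-4); the sample numbers are made up. Note that how you express character positions depends on your sort utility: a POSIX-style sort writes them as field.character, for example -k 1.8,1.9, so check your own tool's documentation.

```perl
use strict;
use warnings;

# Made-up 10-digit sample numbers, generated in ascending order.
my @numbers = map { sprintf '%010d', 1234567890 + $_ * 7919 } 0 .. 9;

# Order by digits 8-9, then by digits 3-4 (1-based character positions).
my @scrambled = sort {
       substr( $a, 7, 2 ) cmp substr( $b, 7, 2 )
    || substr( $a, 2, 2 ) cmp substr( $b, 2, 2 )
    || $a cmp $b                # final tie-break keeps the result deterministic
} @numbers;

print "$_\n" for @scrambled;
```

The output is the same set of numbers in an order that looks scrambled but is fully repeatable, which is why this only suits cases where true randomness isn't needed.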
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".