Beefy Boxes and Bandwidth Generously Provided by pair Networks
Just another Perl shrine
 
PerlMonks  

Re: Pick k numbers at random

by Chuma (Scribe)
on Nov 12, 2019 at 14:09 UTC ( [id://11108588]=note: print w/replies, xml ) Need Help??


in reply to Pick k numbers at random

Thank you for your replies!

Shuffling an array and then picking the first k elements would do what I want, but that's still linear in n. The "sample" module seems to be efficient, though.

Replies are listed 'Best First'.
Re^2: Pick k numbers at random
by haukex (Archbishop) on Nov 12, 2019 at 15:20 UTC
    Shuffling an array and then picking the first k elements would do what I want, but that's still linear in n. The "sample" module seems to be efficient, though.

    If you're worried about efficiency, you should measure first.

    #!/usr/bin/env perl use warnings; use strict; use Math::Prime::Util qw/randperm/; use List::Util qw/shuffle/; use Algorithm::Numerical::Sample qw/sample/; use Benchmark qw/cmpthese/; die "Usage: $0 N K\n" unless @ARGV==2; my ($N,$K) = @ARGV; cmpthese(-2, { mpu => sub { my @range = randperm($N, $K); }, shuf => sub { my @range = (shuffle 0 .. $N-1)[0 .. $K - 1]; }, samp => sub { my @range = sample (-set => [0 .. $N-1], -sample_size => $K); }, }); __END__ $ perl bench.pl 10 5 Rate samp shuf mpu samp 326934/s -- -86% -91% shuf 2388541/s 631% -- -31% mpu 3456068/s 957% 45% -- $ perl bench.pl 100 5 Rate samp shuf mpu samp 76377/s -- -86% -98% shuf 539280/s 606% -- -86% mpu 3744910/s 4803% 594% -- $ perl bench.pl 10000 5 Rate samp shuf mpu samp 835/s -- -86% -100% shuf 5885/s 605% -- -100% mpu 3719610/s 445489% 63101% -- $ perl bench.pl 100000 5000 Rate samp shuf mpu samp 72.2/s -- -87% -99% shuf 562/s 678% -- -91% mpu 6544/s 8958% 1064% --

    Not surprising, since Math::Prime::Util's randperm is implemented in C, and has a bunch of different methods for picking K of N depending on set sizes. See the source.

      Hello haukex,

      just for fun a solution exploiting the randomness of hash keys.

      hkey => sub{ my @range = (keys %{+{map {($_ => 1)}0 .. $N-1}})[0 .. $K-1]; }

      It is slightly faster than samp for very small sets..

      perl randomshuffle.pl 10 5 Rate samp hkey shuf mpu samp 108393/s -- -13% -88% -92% hkey 124178/s 15% -- -87% -90% shuf 934174/s 762% 652% -- -27% mpu 1275273/s 1077% 927% 37% --
      ..but becomes fastly slower ;) for bigger ones.
      perl randomshuffle.pl 100 5 Rate hkey samp shuf mpu hkey 15844/s -- -46% -93% -99% samp 29546/s 86% -- -86% -98% shuf 212991/s 1244% 621% -- -86% mpu 1519656/s 9491% 5043% 613% --

      The only thing to note is the %{+{ LIST }} syntax, where + is used to disambiguate a hashref from a block (credit: perl IRC channel) because you can't dereference a map as a hash.

      PS: your hardware is ~3 times faster than mine ;)

      PPS: "randperm" is not exported by the Math::Prime::Util module in 0.60 so i upgraded to 0.73

      L*

      There are no rules, there are no thumbs..
      Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

        For small hashes, ordering isn't necessarily "random" from one hash to the next:

        c:\@Work\Perl\monks>perl -wMstrict -le "for (1 .. 8) { my @k = keys %{ { map { $_ => 1 } 0 .. 8 } }; print qq{@k}; } " 6 3 7 2 8 1 4 0 5 6 3 7 2 8 1 4 0 5 6 3 7 2 8 1 4 0 5 6 3 7 2 8 1 4 0 5 6 3 7 2 8 1 4 0 5 6 3 7 2 8 1 4 0 5 6 3 7 2 8 1 4 0 5 6 3 7 2 8 1 4 0 5
        Same results under ActiveState 5.8.9 and Strawberry 5.14.4.1.


        Give a man a fish:  <%-{-{-{-<

      Not surprising, since Math::Prime::Util's randperm is implemented in C

      I think it's more about the fact that mpu creates $K scalars, while the other two creates at least $N.

      That's why the performance of shuf approaches that of mpu as $K approaches $N.

Re^2: Pick k numbers at random
by NERDVANA (Deacon) on Nov 13, 2019 at 09:39 UTC
    If you're dealing with a massive N (like billions that can't be allocated effectively) and small K, here's a quick algorithm:
    my %remap; my @result; while ($K--) { my $x= int rand $N--; push @result, $remap{$x} // $x; $remap{$x}= $N; }

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://11108588]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others scrutinizing the Monastery: (6)
As of 2024-04-24 07:29 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found