in reply to Re: Random Derangement Of An Array

in thread Random Derangement Of An Array

*Here's the approach I wanted to code up, but I've been sufficiently distracted.*

After much brain-racking I have been unable to come up with a less silly approach than simply encoding the combinatorial proof that d_(n+1) = n (d_n + d_(n-1)) in Perl, which was essentially your idea.

All of the other ideas in this thread so far produce a biased distribution; the 'obvious' modification to the Fisher-Yates shuffle algorithm looked promising but misses some derangements entirely.

It might be straightforward to do it purely iteratively too, but my brain works recursively so my programs do too!

#!/usr/bin/perl use strict; sub d_l_rec { # Calculates the pair (d_n, d_{n-1}) # # Why calculate the pair? It's O(n) instead of O(2^n) # for the 'obvious' recursive algorithm. # # Why recursive? No need to do it iteratively. my ($n) = @_; return 1 if $n < 1; return (0, 1) if $n == 1; my ($d1, $d2) = d_l_rec($n-1); my $d = ($n-1) * ($d1 + $d2); return ($d, $d1); } sub random_local_derangement { # Returns a randomly-chosen local derangement of # (0..($n-1)). A local derangement is a derangement # *except* that the last place may be a fixed point. # # It's 'local' in the sense that, given $n people and a # hat of $n tickets you can generate a local derangement # by each person in turn pulling tickets out of the hat # until they get one that isn't theirs, ensuring that # (local to their draw) the permutation looks like it's # going to be a derangement. This is how 'Secret Santa' # derangements often seem to be organised, and this # process sometimes leaves the last person with their # own ticket. # # Unfortunately the 'Secret Santa' approach definitely # does not give uniform probabilities to each outcome. # This function, on the other hand, does. [At least, # it's definitely uniform for $n <= 12 and merely a very # good approximation for larger $n.] my ($n) = @_; if ($n == 0) { return []; } my ($i, $threshold); # A local $n-derangement is either a full 'total' # $n-derangement or else it is a ($n-1)-derangement with # a fixed point added at the end. We must choose between # these options with appropriate probability weighting. # Note that this means that l_k = d_k + d_{k-1} where # l_k is the number of local k-derangements and d_k is # the number of total k-derangements. # # Note that d_{12} < 2^31 < d_{13} so we have to be # careful of overflow for $n > 13. Fortunately there's a # good approximation that can be used. if ($n <= 12) { # Calculate d_{$n} and d_{$n-1} and a random value # $i in [0, d_{$n} + d_{$n-1}) to decide which of # the two options to use. my ($dn, $dn1) = d_l_rec($n); $i = int(rand($dn+$dn1)); $threshold = $dn1; } else { # If $n is large then d_$n/d_{$n-1} is very close to # $n. Therefore it'll do to pick a random $i in the # range [0, $n] and see if it is 0 or not. $i = int(rand($n+1)); $threshold = 1; # I think this is ok, but it just might contain an # off-by-one error. The upshot of such an error # would be a degree of bias in the results that is # going to be hard to detect - you may have to run # it literally trillions of times to pick up a # statistically significant result. } if ($i < $threshold) { # Case 1 - pick a properly local derangement my $d = random_derangement($n-1); push @$d, $n-1; return $d; } else { # Case 2 - pick a total derangement my $d = random_derangement($n); return $d; } } sub random_derangement { # Returns a randomly-chosen (total) derangement of # (0..($n-1)), uniformly-chosen amongst all possible # derangements. my ($n) = @_; if ($n == 0) { return []; } # There are (n-1) l_{n-1} of them, so pick a (uniformly) # random local ($n-1)-derangement and a random $m in the # range [0, $n-1). my $ld = random_local_derangement($n-1); my $m = int(rand($n-1)); # If L_k is the set of all local k-derangements and D_k # is the set of all total k-derangements then the code # below encodes the proof that (n-1) l_{n-1} = d_n in a # bijection between [0, $n-1) x L_{n-1} and D_{n}. # # Since the pair ($m, $ld) are chosen uniformly, this # shows that the resulting derangement is also uniformly # chosen. if ($n-2 == $ld->[$n-2]) { # $ld is properly local. Therefore the desired # derangement swaps the $m'th and last places and # uses $ld to derange the other places. my $j = $n-1; while ($j--) { my $k = $j < $m ? $j : $j-1; $ld->[$j] = $ld->[$k] < $m ? $ld->[$k] : $ld->[$k]+1; } $ld->[$n-1] = $m; $ld->[$m] = $n-1; return $ld; } else { # $ld is total. Therefore put the $m'th entry at the # end and put $n-1 in the $m'th place. $ld->[$n-1] = $ld->[$m]; $ld->[$m] = $n-1; return $ld; } } sub check_derangement { # Check that we have really generated a derangement my ($n, $d) = @_; my $s = join ', ', @$d; die "Wrong length: $s ($n)" unless ($n == @$d); for (my $i = 0; $i < $n; $i++) { die "Not a derangement: $s" if ($i == $d->[$i]); die "Illegal value: $d->[$i] in $s" if ($d->[$i] < 0 || $d->[$i] >= $n); } eval { my @check_unique = sort { $a <=> $b || undef } @$d; }; die "Uniqueness check failed: $s" if $@; } my $n = $ARGV[0]; my %f = (); my $c = 0; for (my $i = 0; $i < 1e6; $i++) { my $d = random_derangement($n); check_derangement($n, $d); my $s = join ',', @$d; $f{$s} += 1; $c += 1; } for my $key (sort {$f{$a} <=> $f{$b}} keys %f) { printf "%s: %0.2f%% (expected %+0.2f%%)\n", $key, 100.0*($f{$key}/ +$c), 100.0*(($f{$key}/$c)-(1.0/(scalar keys %f))); } printf "Total %d (%d runs)\n", (scalar keys %f), $c;

This produces output like the following

$ ./derangements.pl 4 2,3,0,1: 11.08% (expected -0.03%) 1,0,3,2: 11.09% (expected -0.02%) 2,3,1,0: 11.10% (expected -0.01%) 1,3,0,2: 11.11% (expected -0.01%) 2,0,3,1: 11.11% (expected -0.00%) 1,2,3,0: 11.12% (expected +0.01%) 3,0,1,2: 11.12% (expected +0.01%) 3,2,1,0: 11.13% (expected +0.02%) 3,2,0,1: 11.14% (expected +0.03%) Total 9 (1000000 runs)

Clearly that's not far off uniform!

In Section
Seekers of Perl Wisdom