 Clear questions and runnable codeget the best and fastest answer PerlMonks

### Re^2: Random Derangement Of An Array

by dcturner (Initiate)
 on Dec 04, 2008 at 19:54 UTC ( #728077=note: print w/replies, xml ) Need Help??

in reply to Re: Random Derangement Of An Array
in thread Random Derangement Of An Array

Here's the approach I wanted to code up, but I've been sufficiently distracted.

After much brain-racking I have been unable to come up with a less silly approach than simply encoding the combinatorial proof that d_(n+1) = n (d_n + d_(n-1)) in Perl, which was essentially your idea.

All of the other ideas in this thread so far produce a biased distribution; the 'obvious' modification to the Fisher-Yates shuffle algorithm looked promising but misses some derangements entirely.

It might be straightforward to do it purely iteratively too, but my brain works recursively so my programs do too!

```#!/usr/bin/perl

use strict;

sub d_l_rec {
# Calculates the pair (d_n, d_{n-1})
#
# Why calculate the pair? It's O(n) instead of O(2^n)
# for the 'obvious' recursive algorithm.
#
# Why recursive? No need to do it iteratively.
my (\$n) = @_;
return 1 if \$n < 1;
return (0, 1) if \$n == 1;
my (\$d1, \$d2) = d_l_rec(\$n-1);
my \$d = (\$n-1) * (\$d1 + \$d2);
return (\$d, \$d1);
}

sub random_local_derangement {
# Returns a randomly-chosen local derangement of
# (0..(\$n-1)).  A local derangement is a derangement
# *except* that the last place may be a fixed point.
#
# It's 'local' in the sense that, given \$n people and a
# hat of \$n tickets you can generate a local derangement
# by each person in turn pulling tickets out of the hat
# until they get one that isn't theirs, ensuring that
# (local to their draw) the permutation looks like it's
# going to be a derangement. This is how 'Secret Santa'
# derangements often seem to be organised, and this
# process sometimes leaves the last person with their
# own ticket.
#
# Unfortunately the 'Secret Santa' approach definitely
# does not give uniform probabilities to each outcome.
# This function, on the other hand, does. [At least,
# it's definitely uniform for \$n <= 12 and merely a very
# good approximation for larger \$n.]

my (\$n) = @_;
if (\$n == 0) {
return [];
}

my (\$i, \$threshold);
# A local \$n-derangement is either a full 'total'
# \$n-derangement or else it is a (\$n-1)-derangement with
# a fixed point added at the end. We must choose between
# these options with appropriate probability weighting.
# Note that this means that l_k = d_k + d_{k-1} where
# l_k is the number of local k-derangements and d_k is
# the number of total k-derangements.
#
# Note that d_{12} < 2^31 < d_{13} so we have to be
# careful of overflow for \$n > 13. Fortunately there's a
# good approximation that can be used.
if (\$n <= 12) {
# Calculate d_{\$n} and d_{\$n-1} and a random value
# \$i in [0, d_{\$n} + d_{\$n-1}) to decide which of
# the two options to use.
my (\$dn, \$dn1) = d_l_rec(\$n);
\$i = int(rand(\$dn+\$dn1));
\$threshold = \$dn1;
} else {
# If \$n is large then d_\$n/d_{\$n-1} is very close to
# \$n.  Therefore it'll do to pick a random \$i in the
# range [0, \$n] and see if it is 0 or not.
\$i = int(rand(\$n+1));
\$threshold = 1;

# I think this is ok, but it just might contain an
# off-by-one error. The upshot of such an error
# would be a degree of bias in the results that is
# going to be hard to detect - you may have to run
# it literally trillions of times to pick up a
# statistically significant result.
}

if (\$i < \$threshold) {
# Case 1 - pick a properly local derangement
my \$d = random_derangement(\$n-1);
push @\$d, \$n-1;
return \$d;
} else {
# Case 2 - pick a total derangement
my \$d = random_derangement(\$n);
return \$d;
}
}

sub random_derangement {
# Returns a randomly-chosen (total) derangement of
# (0..(\$n-1)), uniformly-chosen amongst all possible
# derangements.

my (\$n) = @_;
if (\$n == 0) {
return [];
}

# There are (n-1) l_{n-1} of them, so pick a (uniformly)
# random local (\$n-1)-derangement and a random \$m in the
# range [0, \$n-1).
my \$ld = random_local_derangement(\$n-1);
my \$m = int(rand(\$n-1));

# If L_k is the set of all local k-derangements and D_k
# is the set of all total k-derangements then the code
# below encodes the proof that (n-1) l_{n-1} = d_n in a
# bijection between [0, \$n-1) x L_{n-1} and D_{n}.
#
# Since the pair (\$m, \$ld) are chosen uniformly, this
# shows that the resulting derangement is also uniformly
# chosen.

if (\$n-2 == \$ld->[\$n-2]) {
# \$ld is properly local. Therefore the desired
# derangement swaps the \$m'th and last places and
# uses \$ld to derange the other places.

my \$j = \$n-1;
while (\$j--) {
my \$k = \$j < \$m ? \$j : \$j-1;
\$ld->[\$j]
= \$ld->[\$k] < \$m ? \$ld->[\$k] : \$ld->[\$k]+1;
}
\$ld->[\$n-1] = \$m;
\$ld->[\$m] = \$n-1;
return \$ld;
} else {
# \$ld is total. Therefore put the \$m'th entry at the
# end and put \$n-1 in the \$m'th place.

\$ld->[\$n-1] = \$ld->[\$m];
\$ld->[\$m] = \$n-1;
return \$ld;
}
}

sub check_derangement {
# Check that we have really generated a derangement
my (\$n, \$d) = @_;

my \$s = join ', ', @\$d;

die "Wrong length: \$s (\$n)" unless (\$n == @\$d);
for (my \$i = 0; \$i < \$n; \$i++) {
die "Not a derangement: \$s" if (\$i == \$d->[\$i]);
die "Illegal value: \$d->[\$i] in \$s"
if (\$d->[\$i] < 0 || \$d->[\$i] >= \$n);
}

eval {
my @check_unique = sort { \$a <=> \$b || undef } @\$d;
};
die "Uniqueness check failed: \$s" if \$@;
}

my \$n = \$ARGV;

my %f = ();
my \$c = 0;

for (my \$i = 0; \$i < 1e6; \$i++) {
my \$d = random_derangement(\$n);
check_derangement(\$n, \$d);
my \$s = join ',', @\$d;
\$f{\$s} += 1;
\$c += 1;
}

for my \$key (sort {\$f{\$a} <=> \$f{\$b}} keys %f) {
printf "%s: %0.2f%% (expected %+0.2f%%)\n", \$key, 100.0*(\$f{\$key}/
+\$c),
100.0*((\$f{\$key}/\$c)-(1.0/(scalar keys %f)));
}

printf "Total %d (%d runs)\n", (scalar keys %f), \$c;

This produces output like the following

```\$ ./derangements.pl 4
2,3,0,1: 11.08% (expected -0.03%)
1,0,3,2: 11.09% (expected -0.02%)
2,3,1,0: 11.10% (expected -0.01%)
1,3,0,2: 11.11% (expected -0.01%)
2,0,3,1: 11.11% (expected -0.00%)
1,2,3,0: 11.12% (expected +0.01%)
3,0,1,2: 11.12% (expected +0.01%)
3,2,1,0: 11.13% (expected +0.02%)
3,2,0,1: 11.14% (expected +0.03%)
Total 9 (1000000 runs)

Clearly that's not far off uniform!

Replies are listed 'Best First'.
Re^3: Random Derangement Of An Array
by blokhead (Monsignor) on Dec 07, 2008 at 20:38 UTC
Here is a recent article describing a simpler algorithm for sampling derangements. I also found slides for the presentation. Since the paper is so recent, I guess this means that a small modification of Fisher-Yates is unlikely to generate derangements, since someone would have already come up with it by now. Still, their algorithm is in-place and has better expected running time than retrying Fisher-Yates until you get a derangement.

Here is a Perl implementation I whipped up. It is slightly odd because I followed their lead and used array indexing from 1.

```sub rand_derangement {
my \$n = shift;
return if \$n == 1; ## no derangements of size 1

## precompute \$D[n] == number of derangements of size n
my @D = (1,0);
push @D, \$#D * (\$D[-1] + \$D[-2]) while \$#D < \$n;

my @A = (undef, 1 .. \$n);
my @mark = (1, (0) x \$n);

my (\$i, \$u) = (\$n, \$n);

while (\$u > 1) {
if (! \$mark[\$i]) {
my \$j = 0;
\$j = 1 + int rand(\$i-2) while \$mark[\$j];

@A[\$i,\$j] = @A[\$j,\$i];

if ( rand(1) < (\$u-1) * \$D[\$u-2] / \$D[\$u] ) {
\$mark[\$j] = 1;
\$u--;
}
\$u--;
}
\$i--;
}

return @A[1..\$n];
}

Here is a recent article describing a simpler algorithm for sampling derangements...

Ah yes, that's nice. I think there is a slight buglet in the given code:

\$j = 1 + int rand(\$i-2) while \$mark[\$j];

\$j = 1 + int rand(\$i-1) while \$mark[\$j];

should it not? Also it could hit an overflow bug for \$n > 12, because d_n gets quite large quite quickly. Even with 64-bit ints you overflow for \$n > 20. The same approximation works for your method as for mine: for \$n > 12 the value (n-1)d_{n-2} / d_n is very close to 1/n.

The iterative version of the recursive program that I wrote above is as follows. It's a bit convoluted because the recursion d_n = (n-1) (d_{n-1} + d_{n-2}) is second-order, so you have to work a bit harder than for a first-order recursion. It's also in-place and requires only a constant amount of auxiliary storage. Plus it certainly terminates, whereas the rejection method merely almost certainly terminates :) On the minus side, it is a bit quadratic (although with low probability). Side-by-side, they run pretty similarly it seems.

```sub random_derangement {
my (\$n) = @_;

my @d = (1, 0, 1, 2, 9, 44, 265, 1854, 14833, 133496,
1334961, 14684570, 176214841);

my @t;
my \$i = \$n;
while (\$i--) {
my \$m = int(rand(\$i));
\$t[\$i] = \$m;
if (\$i <= 12) {
\$i -= 1 if (int(rand(\$d[\$i]+\$d[\$i-1])) < \$d[\$i-1]);
} else {
\$i -= 1 if (int(rand(\$i+1)) == 0);
}
}

for (\$i = 0; \$i < \$n; \$i++) {
if (defined(\$t[\$i])) {
my \$m = \$t[\$i];
\$t[\$i] = \$t[\$m];
\$t[\$m] = \$i;
} else {
my \$j = \$i+1;
my \$m = \$t[\$i+1];
while (\$j--) {
my \$k = \$j < \$m ? \$j : \$j-1;
\$t[\$j]
= \$t[\$k] < \$m ? \$t[\$k] : \$t[\$k]+1;
}
\$t[\$i+1] = \$m;
\$t[\$m] = \$i+1;
\$i += 1;
}
}

return \@t;
}

Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://728077]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others chanting in the Monastery: (3)
As of 2023-10-03 04:29 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?

No recent polls found

Notices?