comment on

Here's the approach I wanted to code up, but I've been sufficiently distracted.

After much brain-racking I have been unable to come up with a less silly approach than simply encoding the combinatorial proof that d_(n+1) = n (d_n + d_(n-1)) in Perl, which was essentially your idea.

All of the other ideas in this thread so far produce a biased distribution; the 'obvious' modification to the Fisher-Yates shuffle algorithm looked promising but misses some derangements entirely.

It might be straightforward to do it purely iteratively too, but my brain works recursively so my programs do too!

#!/usr/bin/perl

use strict;

sub d_l_rec {
    # Calculates the pair (d_n, d_{n-1})
    #
    # Why calculate the pair? It's O(n) instead of O(2^n)
    # for the 'obvious' recursive algorithm.
    #
    # Why recursive? No need to do it iteratively. 
    my ($n) = @_;
    return 1 if $n < 1;
    return (0, 1) if $n == 1;
    my ($d1, $d2) = d_l_rec($n-1);
    my $d = ($n-1) * ($d1 + $d2);
    return ($d, $d1);
}

sub random_local_derangement {
    # Returns a randomly-chosen local derangement of
    # (0..($n-1)).  A local derangement is a derangement
    # *except* that the last place may be a fixed point.
    #
    # It's 'local' in the sense that, given $n people and a
    # hat of $n tickets you can generate a local derangement
    # by each person in turn pulling tickets out of the hat
    # until they get one that isn't theirs, ensuring that
    # (local to their draw) the permutation looks like it's
    # going to be a derangement. This is how 'Secret Santa'
    # derangements often seem to be organised, and this
    # process sometimes leaves the last person with their
    # own ticket.
    #
    # Unfortunately the 'Secret Santa' approach definitely
    # does not give uniform probabilities to each outcome.
    # This function, on the other hand, does. [At least,
    # it's definitely uniform for $n <= 12 and merely a very
    # good approximation for larger $n.]
    
    my ($n) = @_;
    if ($n == 0) {
        return [];
    }

    my ($i, $threshold);
    # A local $n-derangement is either a full 'total'
    # $n-derangement or else it is a ($n-1)-derangement with
    # a fixed point added at the end. We must choose between
    # these options with appropriate probability weighting.
    # Note that this means that l_k = d_k + d_{k-1} where
    # l_k is the number of local k-derangements and d_k is
    # the number of total k-derangements.
    #
    # Note that d_{12} < 2^31 < d_{13} so we have to be
    # careful of overflow for $n > 13. Fortunately there's a
    # good approximation that can be used.
    if ($n <= 12) {
        # Calculate d_{$n} and d_{$n-1} and a random value
        # $i in [0, d_{$n} + d_{$n-1}) to decide which of
        # the two options to use.
        my ($dn, $dn1) = d_l_rec($n);
        $i = int(rand($dn+$dn1));
        $threshold = $dn1;
    } else {
        # If $n is large then d_$n/d_{$n-1} is very close to
        # $n.  Therefore it'll do to pick a random $i in the
        # range [0, $n] and see if it is 0 or not.
        $i = int(rand($n+1));
        $threshold = 1;

        # I think this is ok, but it just might contain an
        # off-by-one error. The upshot of such an error
        # would be a degree of bias in the results that is
        # going to be hard to detect - you may have to run
        # it literally trillions of times to pick up a
        # statistically significant result.
    }

    if ($i < $threshold) {
        # Case 1 - pick a properly local derangement
        my $d = random_derangement($n-1);
        push @$d, $n-1;
        return $d;
    } else {
        # Case 2 - pick a total derangement
        my $d = random_derangement($n);
        return $d;
    }
}

sub random_derangement {
    # Returns a randomly-chosen (total) derangement of
    # (0..($n-1)), uniformly-chosen amongst all possible
    # derangements.

    my ($n) = @_;
    if ($n == 0) {
        return [];
    }

    # There are (n-1) l_{n-1} of them, so pick a (uniformly)
    # random local ($n-1)-derangement and a random $m in the
    # range [0, $n-1).
    my $ld = random_local_derangement($n-1);
    my $m = int(rand($n-1));

    # If L_k is the set of all local k-derangements and D_k
    # is the set of all total k-derangements then the code
    # below encodes the proof that (n-1) l_{n-1} = d_n in a
    # bijection between [0, $n-1) x L_{n-1} and D_{n}.
    #
    # Since the pair ($m, $ld) are chosen uniformly, this
    # shows that the resulting derangement is also uniformly
    # chosen.

    if ($n-2 == $ld->[$n-2]) {
        # $ld is properly local. Therefore the desired
        # derangement swaps the $m'th and last places and
        # uses $ld to derange the other places.

        my $j = $n-1;
        while ($j--) {
            my $k = $j < $m ? $j : $j-1;
            $ld->[$j] 
                = $ld->[$k] < $m ? $ld->[$k] : $ld->[$k]+1;
        }
        $ld->[$n-1] = $m;
        $ld->[$m] = $n-1;
        return $ld;
    } else {
        # $ld is total. Therefore put the $m'th entry at the
        # end and put $n-1 in the $m'th place.

        $ld->[$n-1] = $ld->[$m];
        $ld->[$m] = $n-1;
        return $ld;
    }
}

sub check_derangement {
    # Check that we have really generated a derangement
    my ($n, $d) = @_;

    my $s = join ', ', @$d;

    die "Wrong length: $s ($n)" unless ($n == @$d);
    for (my $i = 0; $i < $n; $i++) {
        die "Not a derangement: $s" if ($i == $d->[$i]);
        die "Illegal value: $d->[$i] in $s" 
            if ($d->[$i] < 0 || $d->[$i] >= $n);
    }

    eval {
        my @check_unique = sort { $a <=> $b || undef } @$d;
    };
    die "Uniqueness check failed: $s" if $@;
}

my $n = $ARGV[0];

my %f = ();
my $c = 0;

for (my $i = 0; $i < 1e6; $i++) {
    my $d = random_derangement($n);
    check_derangement($n, $d);
    my $s = join ',', @$d;
    $f{$s} += 1;
    $c += 1;
}

for my $key (sort {$f{$a} <=> $f{$b}} keys %f) {
    printf "%s: %0.2f%% (expected %+0.2f%%)\n", $key, 100.0*($f{$key}/
+$c),
        100.0*(($f{$key}/$c)-(1.0/(scalar keys %f)));
}

printf "Total %d (%d runs)\n", (scalar keys %f), $c;
[download]

This produces output like the following

$ ./derangements.pl 4
2,3,0,1: 11.08% (expected -0.03%)
1,0,3,2: 11.09% (expected -0.02%)
2,3,1,0: 11.10% (expected -0.01%)
1,3,0,2: 11.11% (expected -0.01%)
2,0,3,1: 11.11% (expected -0.00%)
1,2,3,0: 11.12% (expected +0.01%)
3,0,1,2: 11.12% (expected +0.01%)
3,2,1,0: 11.13% (expected +0.02%)
3,2,0,1: 11.14% (expected +0.03%)
Total 9 (1000000 runs)
[download]

Clearly that's not far off uniform!

In reply to Re^2: Random Derangement Of An Array by dcturner
in thread Random Derangement Of An Array by Limbic~Region

Are you posting in the right place? Check out Where do I post X? to know for sure.
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
Want more info? How to link or How to display code and escape characters are good places to start.


laziness, impatience, and hubris
	PerlMonks