http://qs321.pair.com?node_id=277764


in reply to Re: Fisher-Yates theory
in thread Fisher-Yates theory

UPDATE: As pointed out by others, I made an error when translating the code. See their summaries for good explanations. Cheers, and thanks everyone. (tail between legs)

In a previous article, I was challenged for doubting the effectiveness of the Fisher-Yates shuffle as described in perlfaq.

Below, I have written code that exhausts all possible random sequences that could be used during a particular Fisher-Yates shuffle. Statistically, this should be valid, as before the shuffle begins, there is an equal chance that the random sequence generated could be 0 0 0 0 0 as 0 1 2 3 4 as 4 4 4 4 4. By exhaustively executing the Fisher-Yates shuffle, and calculating the total number of occurrences that each result set is produced, we can determine whether the Fisher-Yates shuffle has the side effect of weighting the results, or whether the shuffle is truly random, in that it should be approximately well spread out.

my $depth = 5; my %results; sub recurse { if (@_ >= $depth) { my @deck = (1 .. $depth); shuffle(\@deck, [@_]); $results{join('', @deck)}++; } else { recurse(@_, $_) for 0 .. ($depth-1); } } sub shuffle { my($deck, $rand) = @_; my $i = @$deck; while ($i--) { my $j = shift @$rand; @$deck[$i,$j] = @$deck[$j,$i]; } } recurse; for (sort {$results{$b} <=> $results{$a}} keys %results) { printf "%10d %s\n", $results{$_}, $_; }

With the above code, I was able to determine that with a deck size of 5, and an initial set of 1 2 3 4 5, there is three times the probability that the resulting set will be 3 1 2 5 4 than the probability that the resulting set will be 2 3 4 5 1. To me, this indicates that this theory is flawed.

If anybody needs to prove to themselves that the test is exhaustive, print out "@$rand" in the shuffle subroutine.

Please analyze the code carefully, pull out your school books, and see if I have made a mistake.

Cheers,
mark