 go ahead... be a heretic PerlMonks

### Fisher-Yates theory... does this prove that it is invalid?

by MarkM (Curate)
 on Jul 25, 2003 at 01:00 UTC ( #277764=note: print w/replies, xml ) Need Help??

in reply to Re: Fisher-Yates theory
in thread Fisher-Yates theory

UPDATE: As pointed out by others, I made an error when translating the code. See their summaries for good explanations. Cheers, and thanks everyone. (tail between legs)

In a previous article, I was challenged for doubting the effectiveness of the Fisher-Yates shuffle as described in perlfaq.

Below, I have written code that exhausts all possible random sequences that could be used during a particular Fisher-Yates shuffle. Statistically, this should be valid, as before the shuffle begins, there is an equal chance that the random sequence generated could be 0 0 0 0 0 as 0 1 2 3 4 as 4 4 4 4 4. By exhaustively executing the Fisher-Yates shuffle, and calculating the total number of occurrences that each result set is produced, we can determine whether the Fisher-Yates shuffle has the side effect of weighting the results, or whether the shuffle is truly random, in that it should be approximately well spread out.

```my \$depth = 5;
my %results;

sub recurse
{
if (@_ >= \$depth) {
my @deck = (1 .. \$depth);
shuffle(\@deck, [@_]);
\$results{join('', @deck)}++;
} else {
recurse(@_, \$_) for 0 .. (\$depth-1);
}
}

sub shuffle
{
my(\$deck, \$rand) = @_;
my \$i = @\$deck;
while (\$i--) {
my \$j = shift @\$rand;
@\$deck[\$i,\$j] = @\$deck[\$j,\$i];
}
}

recurse;

for (sort {\$results{\$b} <=> \$results{\$a}} keys %results) {
printf "%10d %s\n", \$results{\$_}, \$_;
}
[download]```

With the above code, I was able to determine that with a deck size of 5, and an initial set of 1 2 3 4 5, there is three times the probability that the resulting set will be 3 1 2 5 4 than the probability that the resulting set will be 2 3 4 5 1. To me, this indicates that this theory is flawed.

If anybody needs to prove to themselves that the test is exhaustive, print out "@\$rand" in the shuffle subroutine.

Please analyze the code carefully, pull out your school books, and see if I have made a mistake.

Cheers,
mark

Replies are listed 'Best First'.
Re: Fisher-Yates theory... does this prove that it is invalid?
by jsprat (Curate) on Jul 25, 2003 at 01:51 UTC
Hi MarkM,

That algorithm shows the possible results of a biased shuffle, not a Fisher-Yates shuffle. The random sequence generated would not be 00000 to 44444, it would be 0000 to 4321 (a five digit shuffle requires 4 iterations - the faq goes 5, but the last never swaps - with each iteration shuffling one less item).

The while loop in shuffle needs one less iteration, and a minor adjustment to recurse would look like this:

```#!/usr/bin/perl
use strict;
use warnings;

my \$depth = 4;
my %results;

sub recurse
{
if (@_ == \$depth) {
shift; #discard \$num
my @deck = (1 .. \$depth);
shuffle(\@deck, [@_]);
\$results{join('', @deck)}++;
} else {
my \$num = shift || \$depth - 1;
# one less element each iteration
recurse(\$num, @_, \$_) for 0 .. \$num--;
}
}

sub shuffle
{
my(\$deck, \$rand) = @_;
my \$i = @\$deck;
# uncomment the following line
# print "@\$rand\n";
# pre-decrement \$i instead of post - the last would be a no-op in
+this case
while (--\$i) {
my \$j = shift @\$rand;
@\$deck[\$i,\$j] = @\$deck[\$j,\$i];
}
}

recurse;
for (sort {\$results{\$b} <=> \$results{\$a}} keys %results) {
printf "%10d %s\n", \$results{\$_}, \$_;
}
[download]```

Here are the results of the modifications, using 4 elements instead of 5 (only 24 possible permutations instead of 120 - makes the node much more readable ;):

Each possible permutation is shown exactly one time, for a possibility of being selected 1 out of 24 times (assuming a perfect rng).

Makes sense???

Update: I followed BrowserUK's link below and in that thread there is a statement that elegantly describes the problem with a biased shuffle (When the Best Solution Isn't), by blakem: "It maps 8 paths to 6 end states". In this case, it's 3125 (5**5) paths to 120 (5!) end states - assuming 5 elements to be shuffled.

Re: Fisher-Yates theory... does this prove that it is invalid?
by BrowserUk (Pope) on Jul 25, 2003 at 02:04 UTC

The problem is, your shuffle routine is not an implementation of a Fisher-Yates shuffle.

This line

```    my \$j = shift @\$rand;
[download]```

is no way equivalent to this line from the FAQ implementation

```     my \$j = int rand (\$i+1);
[download]```

The latter picks a swap partner for the current value of \$i, randomly between 0 and \$i-1. I can't quite wrap my brain around what your code is doing here, but it isn't even vaguely equivalent.

Therefore you are not testing a Fisher-Yates shuffle, but some shuffle algorithm of your own invention, which you succeed in proving isn't as good as the Fisher-Yates.

You might find this post Re: When the Best Solution Isn't that does a statistical analysis of several shuffle routines, a Fisher-Yates amongst them, including frequency and standard deviation interesting.

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller

Re: Fisher-Yates theory... does this prove that it is invalid?
by adrianh (Chancellor) on Jul 25, 2003 at 01:58 UTC
Please analyze the code carefully, pull out your school books, and see if I have made a mistake.

Yes you have. You're not emulating a Fisher-Yates shuffle :-)

Consider the original code from perlfaq:

```sub fisher_yates_shuffle {
my \$deck = shift;  # \$deck is a reference to an array
my \$i = @\$deck;
while (\$i--) {
print \$i, "\n";
my \$j = int rand (\$i+1);
@\$deck[\$i,\$j] = @\$deck[\$j,\$i];
}
}
[download]```

Note how \$i is decremented on each iteration. Consider how that alters the sequence of possible indices.

Once you take that into account you get the textbook behaviour.

```sub fixed_fisher_yates_shuffle {
my (\$deck, \$rand) = @_;
my \$i = @\$deck;
while (\$i--) {
my \$j = shift @\$rand;
@\$deck[\$i,\$j] = @\$deck[\$j,\$i];
}
}

use Set::CrossProduct;
my \$i = Set::CrossProduct->new([ [0..4], [0..3], [0..2], [0..1],  ]
+);
my %count;
while (my \$a = \$i->get) {
print "@\$a : ";
my @foo = (1,2,3,4,5);
fixed_fisher_yates_shuffle(\@foo, \$a);
print "@foo\n";
\$count{"@foo"}++;
};

foreach my \$key (sort keys %count) {
print \$key, " = ", \$count{\$key}, "\n";
};
[download]```

Log In?
 Username: Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://277764]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others taking refuge in the Monastery: (3)
As of 2021-04-10 22:27 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?

No recent polls found

Notices?