Re^6: Functional shuffle

Actually, selecting with an exact probability of 1/3 (or any other rational fraction less than one) is feasible with even a source of single random bits. I saw the technique described once in "Mathematical Games" in Scientific American once.

Represent 1/3 as a binary fraction: .01010101... and generate random numbers bit by bit, starting from the decimal point, calling this a new binary fraction. If the next bit we generate is 1 where there is a 0 in the target, then quit -- we are over 1/3. If the next bit is 0 where there is a 1 in the target, then quit -- we are under 1/3 and we can accept the case. If it's equal, keep generating bits. The probability we will have to continue adding bits halves with each bit.

This approach can make the Fisher-Yates shuffle arbitrarily accurate. It would be possible, but messy, to apply it to sort values that compared equal, too. With this enhancement both shuffles should be fair, but Fisher-Yates wins by being O(N) instead of O(NlogN).

Comment on Re^6: Functional shuffle

Replies are listed 'Best First'.

Re^7: Functional shuffle
by tlm (Prior) on Apr 03, 2005 at 04:59 UTC

This approach can make the Fisher-Yates shuffle arbitrarily accurate. It would be possible, but messy, to apply it to sort values that compared equal, too.

I fail to see the difference. Certainly one can make any numerically-limited algorithm "arbitrarily accurate" if one is willing to increase the number of bits used to represent the numbers in the calculation. The variation of Fisher-Yates that you propose would require a special check to handle the highly improbable case that the random number generator produced a number that was exactly equal to k/N, for any integer k in 0..N - 1, in order to generate more bits to break the tie (without this provision, the algorithm is identical to the standard Fisher-Yates as far as the uniformity of the sampling is concerned). Likewise, the tag-sorting shuffle algorithm I posted would need to be modified to handle the highly improbable case that two of the random tags happened to be identical (which would result in a stable-sort artifact), by generating more bits to break the tie.

...but Fisher-Yates wins by being O(N) instead of O(NlogN).

Yes, of course, but the speed superiority of Fisher-Yates has never been in question. My position all along has been limited to defending the algorithm I posted against the claim that it was logically flawed in the same way as the sort-based algorithms discussed in When the Best Solution Isn't are. The problem with those algorithms would remain even if we had infinite-precision computers at our disposal; this is not the case for the sort-based algorithm I posted. Furthermore, in comparison to the errors incurred by those truly flawed algorithms, the errors incurred by numerically-limited algorithms like Fisher-Yates or the one I posted are entirely insignificant.

Update: Fixed minor typo/error above: the range 0..N - 1 mentioned towards the middle of the first paragraph was erroneously given as 1..N - 1 in the original post. Also, immediately before in the same sentence, the reference to the variable k was missing.

the lowliest monk

[reply]


XP is just a number
	PerlMonks