note
syphilis
Let's look at the probability of getting "at least one dup" (instead of "exactly one dup").<br>
Let's also initially deal with the case where we're selecting (at random) only one number (instead of 4 or 10) each time.<br><br>
Let P(0) be the probability that the very first selection did not produce a duplicate:<br>
P(0) = (4294967295/4294967296)**0 # == 1, obviously<br><br>
Let P(1) be the probability that the second selection did not produce a duplicate:<br>
P(1) = (4294967295/4294967296)**1<br><br>
Let P(2) be the probability that the third selection did not produce a duplicate:<br>
P(2) = (4294967295/4294967296)**2<br><br>
and so on:<br>
Let P(1e9 + 1) be the probability that the 1000000001st selection did not produce a duplicate:<br>
P(1e9) = (4294967295/4294967296)**1e9<br><br>
(In general terms, P(x-1) is simply the probability that none of the x-1 selections already made match the xth selection.)<br><br>
Then the probability that we can make 1000000001 random selections in the range (1 .. 4294967296) and get zero duplicates is<br>P(0)*P(1)*P(2)*P(3)*...*P(1e9).<br>That equates to (4294967295/4294967296)**Z, where<br>Z = 0+1+2+3+...+1e9.<br><br>So, the probablility D that we can make 1000000001 selections and have at least 1 duplicate is<br>D = 1 - ((4294967295/4294967296)**Z)<br><br>If we're doing that 4-at-a-time, then we need to calculate D**4; doing it 10-at-a-time we calculate D**10.<br><br>Is that sane ? Does it produce sane results ? (I think it should, but I don't have time to check.)<br><br><b>10-MINUTES LATER AFTERTHOUGHT: </b>I don't think the "D**4" and "D**10" calculations actually tell us what we want ... gotta think about it a bit more ...<br><br>Cheers,<br>Rob
962802
962925