Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW
 
PerlMonks  

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??
Let's look at the probability of getting "at least one dup" (instead of "exactly one dup").
Let's also initially deal with the case where we're selecting (at random) only one number (instead of 4 or 10) each time.

Let P(0) be the probability that the very first selection did not produce a duplicate:
P(0) = (4294967295/4294967296)**0 # == 1, obviously

Let P(1) be the probability that the second selection did not produce a duplicate:
P(1) = (4294967295/4294967296)**1

Let P(2) be the probability that the third selection did not produce a duplicate:
P(2) = (4294967295/4294967296)**2

and so on:
Let P(1e9 + 1) be the probability that the 1000000001st selection did not produce a duplicate:
P(1e9) = (4294967295/4294967296)**1e9

(In general terms, P(x-1) is simply the probability that none of the x-1 selections already made match the xth selection.)

Then the probability that we can make 1000000001 random selections in the range (1 .. 4294967296) and get zero duplicates is
P(0)*P(1)*P(2)*P(3)*...*P(1e9).
That equates to (4294967295/4294967296)**Z, where
Z = 0+1+2+3+...+1e9.

So, the probablility D that we can make 1000000001 selections and have at least 1 duplicate is
D = 1 - ((4294967295/4294967296)**Z)

If we're doing that 4-at-a-time, then we need to calculate D**4; doing it 10-at-a-time we calculate D**10.

Is that sane ? Does it produce sane results ? (I think it should, but I don't have time to check.)

10-MINUTES LATER AFTERTHOUGHT: I don't think the "D**4" and "D**10" calculations actually tell us what we want ... gotta think about it a bit more ...

Cheers,
Rob

In reply to Re^6: [OT] The statistics of hashing. by syphilis
in thread [OT] The statistics of hashing. by BrowserUk

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?
    Username:
    Password:

    What's my password?
    Create A New User
    Chatterbox?
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others contemplating the Monastery: (6)
    As of 2020-10-31 16:52 GMT
    Sections?
    Information?
    Find Nodes?
    Leftovers?
      Voting Booth?
      My favourite web site is:












      Results (290 votes). Check out past polls.

      Notices?