http://qs321.pair.com?node_id=962806


in reply to [OT] The statistics of hashing.

How to calculate those odds for each new hash, as the number of hashes already set into the vectors increases?

Update: Following figure is not right. (Corrected figure provided in following post.)

IIUC it's just (N/4294967296)**4, where N is the number of MD5 hashes that have already been entered into the bit vector.
But that's assuming that MD5 hashes distribute evenly, and I don't know if that has been established (or disproved). If they don't distribute evenly, then the odds of hitting a false positive will increase.

Not sure what affect your "Bloom Filters" variation would have.

Cheers,
Rob