in reply to [OT] The statistics of hashing.

*How to calculate those odds for each new hash, as the number of hashes already set into the vectors increases?*

**Update:**Following figure is not right. (Corrected figure provided in following post.)

IIUC it's just (N/4294967296)**4, where N is the number of MD5 hashes that have already been entered into the bit vector.

But that's assuming that MD5 hashes distribute evenly, and I don't know if that has been established (or disproved). If they don't distribute evenly, then the odds of hitting a false positive will increase.

Not sure what affect your "Bloom Filters" variation would have.

Cheers,

Rob

In Section
Seekers of Perl Wisdom