|Problems? Is your data what you think it is?|
Re: [OT] The statistics of hashing.by syphilis (Bishop)
|on Mar 31, 2012 at 23:13 UTC||Need Help??|
How to calculate those odds for each new hash, as the number of hashes already set into the vectors increases?
Update: Following figure is not right. (Corrected figure provided in following post.)
IIUC it's just (N/4294967296)**4, where N is the number of MD5 hashes that have already been entered into the bit vector.
But that's assuming that MD5 hashes distribute evenly, and I don't know if that has been established (or disproved). If they don't distribute evenly, then the odds of hitting a false positive will increase.
Not sure what affect your "Bloom Filters" variation would have.