For fun & excercise, i am writing an application that recursively checks the links in an HTML document on a webserver.
If the URLs it has encountered are stored, a link only has to be checked once so i want to store the URLs for the sole purpose of seeing whether i encountered them before.
This could be done using a sorted array (lots of mem, binary search, O(log N)), a hash (lots of mem, O(1)) or a
Bloom Filter (low mem, false positives, O(1)).
I think the Bloom Filter is the best option here. Does anyone have a different/better option?