But if the code is posted on their site, it should be correct. Sure, it's slow, but still correct.
Would distributing it work? Well, how fast can each candidate be tested? Increment directly in the string rather than converting each time, and use a lookup table that maps each byte's value to the next, covering upper and lower case. Basically, the overhead is trivial except for one pass through the MD5 block primitive.
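A minimal sketch of that increment-in-place idea (the alphabet and names here are my own illustration, not from any posted code): a 256-entry table maps each byte to its successor, so advancing to the next candidate is just table lookups, rightmost byte first, odometer style.

```python
# Sketch: odometer-style increment of a candidate string in place,
# using a 256-entry lookup table over the alphabet [a-zA-Z].
ALPHABET = b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

# next_byte[c] is the byte that follows c in the alphabet; the last
# alphabet byte wraps back to the first (signalling a carry).
next_byte = bytearray(256)
for i, c in enumerate(ALPHABET):
    next_byte[c] = ALPHABET[(i + 1) % len(ALPHABET)]

def increment(buf: bytearray) -> None:
    """Advance buf to the next candidate, rightmost byte first."""
    for i in range(len(buf) - 1, -1, -1):
        buf[i] = next_byte[buf[i]]
        if buf[i] != ALPHABET[0]:   # no wrap -> no carry, done
            return
    # every position wrapped: a real search would grow the string here

buf = bytearray(b"az")
increment(buf)          # buf is now b"aA"
```

No string-to-number conversion ever happens; each step touches only the bytes that change, which is why the per-candidate overhead is negligible next to the MD5 block computation.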
But that primitive is designed to be CPU intensive. One "F" uses 3 or 4 primitive operations, which map to CPU instructions, and the accumulate step is about the same. Half the input is padding, which does not change, but that doesn't help you. The beginning of the string does not change either, and that does help: you can pre-load the state of the hashing to skip the first 7 of the 64 operations. That still leaves 57, each of which requires about 8 machine primitives, each of which is dependent on the previous, so your superscalar CPU turns into a simple one-at-a-time machine. So 500 "clocks" on a 2GHz processor is 4 million attempts per second per core.
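The "pre-load the hash state for the unchanging prefix" trick can be sketched with Python's hashlib (the prefix string here is a made-up example): hash the fixed prefix once, then copy() the saved state for each candidate suffix instead of re-hashing the whole string every time.

```python
# Sketch: reuse the MD5 state after absorbing a fixed, known prefix,
# so each candidate only pays for hashing its changing suffix.
import hashlib

prefix = b"The quick brown "      # hypothetical unchanging prefix
base = hashlib.md5(prefix)        # state after absorbing the prefix

def try_suffix(suffix: bytes) -> str:
    h = base.copy()               # resume from the saved state
    h.update(suffix)
    return h.hexdigest()

# Same digest as hashing the full string from scratch:
assert try_suffix(b"fox") == hashlib.md5(b"The quick brown fox").hexdigest()
```

hashlib's copy() duplicates the internal state cheaply, so the prefix is processed exactly once no matter how many suffixes are tried; a tuned C implementation would do the same thing one level lower, saving the leading steps of the 64-step compression function.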
If you get a million cores working on the problem via friends on the net, that's 4×10^12, or about 2^42, attempts per second. 128−42 is 86, so about 2^86 seconds to run to completion, or roughly 2.5×10^18 years. Or, in analogy with calling our distance from the sun 1 AU, I'll call the present elapsed time since the Big Bang one UTU (Universal Time Unit). The distributed problem will terminate in roughly 180,000,000 UTU.
So no, it won't help. Exponential time... doubling your effort only decrements the exponent by 1, so it's just a drop in the bucket.
It's possible to find an MD5 collision in under a minute, due to flaws in the algorithm. So perhaps an algorithmic approach is possible here too, one that reduces the search space. That's the only way it will get done, short of advanced quantum computers.
—John