Re: OT: How to find anagrams?

One interesting approach I've considered, but never implemented, is to assign a prime number to each letter, and then hash each word to the product of the values of its letters. The hash of any word then divides the hash of the target if and only if the letters of the word are a partial anagram of it (that is, they can all be drawn from the target's letters) - tricksy things like repeated letters are taken care of automatically, since a repeated letter just contributes its prime more than once. Then you just search for combinations of words whose hashes multiply together to give the target value; and if you calculate the hashes over the whole dictionary in advance and store them in sorted order, I think this could give pretty good performance.
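To make that concrete, here is a rough sketch of the idea in Python rather than Perl, since Python's integers are arbitrary precision out of the box. The names word_hash and find_anagrams, the alphabetical prime assignment and the simple recursive search are all my own assumptions for illustration, not a tested or tuned implementation.

```python
from math import prod

# First 26 primes, 2 .. 101, assigned to a .. z in alphabetical order
# (an arbitrary choice for this sketch).
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]
LETTER_PRIME = {chr(ord("a") + i): p for i, p in enumerate(PRIMES)}


def word_hash(word):
    """Product of the primes of the word's letters; other characters are ignored."""
    return prod(LETTER_PRIME[c] for c in word.lower() if c in LETTER_PRIME)


def find_anagrams(target, dictionary):
    """Yield tuples of dictionary words whose hashes multiply to the target's hash."""
    goal = word_hash(target)
    # Hash the dictionary once, sort by hash, and keep only partial anagrams:
    # words whose hash divides the target hash.
    candidates = sorted((word_hash(w), w) for w in set(dictionary))
    candidates = [(h, w) for h, w in candidates if h > 1 and goal % h == 0]

    def search(remaining, start, chosen):
        if remaining == 1:
            # Every letter of the target has been used exactly once.
            yield tuple(chosen)
            return
        for i in range(start, len(candidates)):
            h, w = candidates[i]
            if h > remaining:
                break  # sorted order: no later hash can divide what is left
            if remaining % h == 0:
                # Restart from i (not i + 1) so a word may be used more than once.
                yield from search(remaining // h, i, chosen + [w])

    yield from search(goal, 0, [])
```

For example, list(find_anagrams("dormitory", ["dirty", "room", "dog", "my"])) finds the single combination of "dirty" and "room"; "dog" is dropped up front because its hash does not divide the target's, and "my" divides it but leaves a remainder that nothing else can finish off.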

For the standard 26-letter alphabet you'll need the primes 2 .. 101, and for longer targets the products will overflow native integers, so you're going to need BigInts, which will slow things down a bit (but not too badly if you use a fast maths package such as Math::Pari or Math::GMP). You can reduce the size of the numbers a bit by assigning the lowest primes to the most frequently occurring letters.
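A quick way to see the effect of the frequency trick is to build two letter tables and compare the hashes they produce. This is again only a Python sketch: the ordering "etaoinshrdlucmfwypvbgkjqxz" is one commonly quoted English frequency ranking that I am assuming here (counting letters in your own dictionary would be better), and make_table and hash_bits are made-up helper names.

```python
from math import prod

PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]

ALPHABETICAL = "abcdefghijklmnopqrstuvwxyz"
# Assumed frequency ranking, most common letter first.
BY_FREQUENCY = "etaoinshrdlucmfwypvbgkjqxz"


def make_table(order):
    """Map the first letter in `order` to 2, the second to 3, and so on."""
    return dict(zip(order, PRIMES))


def word_hash(word, table):
    """Product of the table's primes for the word's letters; other characters are ignored."""
    return prod(table[c] for c in word.lower() if c in table)


def hash_bits(word, table):
    """Number of bits needed to store the word's hash."""
    return word_hash(word, table).bit_length()


alpha_table = make_table(ALPHABETICAL)
freq_table = make_table(BY_FREQUENCY)
```

Comparing hash_bits(phrase, alpha_table) with hash_bits(phrase, freq_table) for typical phrases shows how much the reordering saves, and whether a given target still fits in a native 64-bit integer. The anagram property does not depend on which table you pick, as long as the same table is used for the target and the dictionary.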

Hugo
