PerlMonks
Text::Levenshtein would give you a numerical way of comparing two strings, but it requires you to compare the new string against every one of the test strings each time, and it isn't quick.

Probably the best way would be to create an inverted index of the words (or preferably the stems) against the DB phrases, and then look up each word (or stem) of the new phrase in this index. This gives you a count of the number of words each DB phrase has in common with the new phrase. Sort those counts highest first and you have the most likely candidates for your further examination.

I don't know of a module that does this, but parts of it (the inversion, stemming, etc.) could be done with various modules. Sounds like a fun project. Good luck :)

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
If I understand your problem, I can solve it! Of course, the same can be said for you.

In reply to Re: calculate matching words/sentence by BrowserUk
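
For anyone who wants to try the inverted-index idea from the reply above, here is a minimal sketch in core Perl. It is only one way the suggestion could be read, not BrowserUk's code. The sample phrases and the sub names (words_of, build_index, candidates) are made up for illustration, and it matches lowercased whole words only; stemming is left out and would need a separate module.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the DB phrases.
my @phrases = (
    'the quick brown fox',
    'a lazy brown dog',
    'quick thinking saves time',
);

# Split a phrase into unique lowercased words (no stemming in this sketch).
sub words_of {
    my ($text) = @_;
    my %seen;
    return grep { !$seen{$_}++ } map { lc($_) } $text =~ /\w+/g;
}

# Build the inverted index: word => list of phrase numbers containing it.
sub build_index {
    my @db = @_;
    my %index;
    for my $i (0 .. $#db) {
        push @{ $index{$_} }, $i for words_of($db[$i]);
    }
    return \%index;
}

# Count the words each DB phrase shares with the query, then return
# [phrase number, count] pairs sorted highest count first.
sub candidates {
    my ($index, $query) = @_;
    my %count;
    for my $word (words_of($query)) {
        $count{$_}++ for @{ $index->{$word} || [] };
    }
    return map { [ $_, $count{$_} ] }
           sort { $count{$b} <=> $count{$a} || $a <=> $b } keys %count;
}

my $index = build_index(@phrases);
printf "%d shared word(s): %s\n", $_->[1], $phrases[ $_->[0] ]
    for candidates($index, 'The quick brown cat');
```

The index is built once, so each new phrase costs only one hash lookup per word instead of a comparison against every DB phrase. The top few candidates could then be passed to a slower string comparison if a finer ranking is needed.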