PerlMonks
Re: Natural Language Index Stemming
by simon.proctor (Vicar)
on Jun 18, 2002 at 07:44 UTC ( [id://175299]=note )
I used Paice/Husk stemming for my search engine, with MLDBM and Storable to create the index. I also used a second index to cache the HTML metadata.
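A minimal sketch of that kind of persisted index, not the original code: it builds a small inverted index in memory and saves it with Storable alone. The corpus, the file handling and the variable names are all made up for illustration, and the terms are not stemmed here. MLDBM would instead tie the same nested structure to a DBM file; that part is left out.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use File::Temp qw(tempfile);

# Hypothetical toy corpus: document id => text.
my %docs = (
    'a.html' => 'stemming search terms',
    'b.html' => 'search engine index',
);

# Inverted index: term => { document id => occurrence count }.
my %index;
for my $doc (keys %docs) {
    $index{$_}{$doc}++ for split ' ', lc $docs{$doc};
}

# Persist the whole structure to a temporary file, then load it back.
my (undef, $file) = tempfile(UNLINK => 1);
nstore(\%index, $file);
my $loaded = retrieve($file);

# Look up a term in the reloaded index.
my @hits = sort keys %{ $loaded->{search} || {} };
print "search: @hits\n";    # prints "search: a.html b.html"
```

In a real search engine the terms would be run through the stemmer before being used as keys, both at indexing time and at query time.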
I quite liked Paice/Husk because it translated to Perl very easily: I just had to keep the rules in an array and reverse all fragments of my search terms. If you want an alternative to Lingua::Stem, I seriously recommend it. You can find the paper here. They also give an (old) Perl example, which should provide a basis for your app if you choose to try it.
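To show the rules-in-an-array idea, here is a rough sketch of a Paice/Husk-style stemmer. It is not the code from the paper or from my search engine. The rule strings follow the general shape of Paice/Husk rules (reversed ending, optional `*` for "intact words only", number of letters to remove, optional letters to append, then `>` to continue or `.` to stop), but the seven rules below are only an illustrative subset, and `acceptable` is a simplified version of the stem-acceptability check.

```perl
use strict;
use warnings;

# Illustrative subset only; NOT the full published rule table.
my @RULES = (
    'gni3>',     # -ing  : remove 3 letters, continue
    'sei3y>',    # -ies  : remove 3 letters, append "y", continue
    'ssen4>',    # -ness : remove 4 letters, continue
    'ss0.',      # -ss   : leave as is, stop
    's*1>',      # -s    : remove 1 letter if the word is still intact
    'yl2>',      # -ly   : remove 2 letters, continue
    'nn1.',      # -nn   : remove 1 letter, stop
);

# Simplified acceptability check (an assumption of this sketch).
sub acceptable {
    my ($stem) = @_;
    return length($stem) >= 2 if $stem =~ /^[aeiou]/;
    return length($stem) >= 3 && $stem =~ /[aeiouy]/;
}

sub stem {
    my ($word) = @_;
    my $rev    = reverse lc $word;    # work on the reversed word
    my $intact = 1;                   # true until a rule has fired
    PASS: while (1) {
        for my $rule (@RULES) {
            my ($end, $star, $n, $append, $term) =
                $rule =~ /^([a-z]+)(\*?)(\d)([a-z]*)([.>])$/ or next;
            next unless index($rev, $end) == 0;    # ending must match
            next if $star && !$intact;
            my $new = scalar(reverse $append) . substr($rev, $n);
            next unless acceptable(scalar reverse $new);
            ($rev, $intact) = ($new, 0);
            last PASS if $term eq '.';
            next PASS;                             # rescan from the first rule
        }
        last;                                      # no rule applied
    }
    return scalar reverse $rev;
}

print join(' ', map { stem($_) } qw(running ponies kindness cats)), "\n";
# prints "run pony kind cat"
```

Reversing the word means every rule can be tested with a plain prefix match against the stored (reversed) ending, which is why the rule list can stay a flat array.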