PerlMonks  

Re: Natural Language Index Stemming

by PetaMem (Priest)
on Jun 18, 2002 at 11:23 UTC


in reply to Natural Language Index Stemming

Aaah my lovely favourite subfield of interest...

First off, you can differentiate between knowledge-based stemming algorithms and probabilistic stemming. And of course there is a bunch of heuristic mixtures of these two approaches spread all over the literature and the web. If you want something "not so good, but good enough and not expensive", you could use the next generation of the old Porter stemmer: see Snowball. Snowball is quite OK, especially because there are descriptions for more languages. However, you will never reach 100% accuracy with this approach, as only a dictionary of a given language together with morphological knowledge will give you the best (but still ambiguous) results.

But this requires heavy-duty hardware for the heavy-duty software to run on...
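
To make the trade-off a bit more concrete, here is a minimal sketch in Python (not Perl, and not Snowball itself). The suffix rules and the tiny `LEMMA_DICT` are made up purely for illustration; a real rule-based stemmer has far more rules, and a real knowledge-based one needs a full dictionary plus morphology, which is where the hardware cost comes from.

```python
# Toy suffix rules, tried in order. These are illustrative assumptions,
# NOT the actual Snowball or Porter rule set.
SUFFIX_RULES = [("ies", "y"), ("ing", ""), ("ed", ""), ("s", "")]

# Hypothetical mini-dictionary mapping inflected forms to lemmas.
# A real knowledge-based stemmer would need a full lexicon here.
LEMMA_DICT = {"mice": "mouse", "ran": "run", "string": "string"}


def rule_stem(word):
    """Cheap heuristic: strip the first matching suffix, keeping >= 3 letters."""
    word = word.lower()
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)] + replacement
    return word


def dict_stem(word):
    """Dictionary lookup first; fall back to the heuristic for unknown words."""
    word = word.lower()
    return LEMMA_DICT.get(word, rule_stem(word))


if __name__ == "__main__":
    for w in ["ponies", "walking", "mice", "ran", "string"]:
        print(w, rule_stem(w), dict_stem(w))
```

The rules handle regular forms like "ponies" and "walking", but they leave "mice" and "ran" alone and mangle "string"; only the dictionary entries fix those, and only for the words it happens to contain.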

Bye
 PetaMem
