PerlMonks  

Re: Perl and Linguistics

by graff (Chancellor)
on May 26, 2002 at 03:28 UTC (id://169330)


in reply to Perl and Linguistics

... extending to more general linguistic modelling, ideally from a non-language-specific basis that can be adapted to different languages.

That's ambitious... but worth pursuing. The first thing that comes to my mind is (Hidden) Markov modelling, which has been shown to do a decent job of finding plausible "morphological" boundaries in a stream of text in any given language. There appear to be Markov modules on CPAN, but I don't yet know whether they are suited to language analysis.
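To make the idea a bit more concrete, here is a minimal sketch. It is not a real HMM and it does not use any CPAN module: it is a plain first-order Markov chain over characters that proposes a boundary wherever the transition into the next character looks improbable. The sub names, the training words and the 0.6 threshold are all invented for illustration, and a serious attempt would need far more data and a better-motivated cutoff.

```perl
use strict;
use warnings;

# Count character-to-character transitions over a training word list.
# '^' and '$' mark word start and end (assumed not to occur in the data).
sub train_bigrams {
    my @words = @_;
    my (%pair, %first);
    for my $w (@words) {
        my @ch = ('^', split(//, lc $w), '$');
        for my $i (0 .. $#ch - 1) {
            $pair{ $ch[$i] }{ $ch[$i + 1] }++;
            $first{ $ch[$i] }++;
        }
    }
    return (\%pair, \%first);
}

# P(next | prev) by relative frequency; 0 if prev was never seen.
sub transition_prob {
    my ($pair, $first, $prev, $next) = @_;
    return 0 unless $first->{$prev};
    return ($pair->{$prev}{$next} || 0) / $first->{$prev};
}

# Insert '-' before any character whose transition probability from
# the preceding character falls below $threshold.
sub segment {
    my ($pair, $first, $word, $threshold) = @_;
    return '' unless length $word;
    my @ch  = split //, lc $word;
    my $out = $ch[0];
    for my $i (1 .. $#ch) {
        my $p = transition_prob($pair, $first, $ch[$i - 1], $ch[$i]);
        $out .= '-' if $p < $threshold;
        $out .= $ch[$i];
    }
    return $out;
}

# Toy training set; the threshold is an arbitrary choice.
my ($pair, $first) =
    train_bigrams(qw(walked walking talked talking jumped jumping));
print segment($pair, $first, 'walked', 0.6), "\n";
```

Nothing here is language-specific beyond the training list, which is the point: swap in word lists from another language and the same counts drive the segmentation.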

I do know that Perl is quite useful for the "infrastructure" work of managing language data: building and searching a lexicon, locating and displaying or highlighting tokens in a text stream, mapping across character encodings, and so on. Of course, a lot of useful tools have already been developed (some in Perl, some in C or C++); check the archives of the CORPORA mailing list, and consider joining it: http://www.hit.uib.no/corpora/
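As a rough illustration of that kind of infrastructure work, the sketch below builds a frequency lexicon, highlights a token in running text, and converts Latin-1 bytes to UTF-8 with the Encode module. The sub names and the `[[ ]]` markers are my own inventions, and treating `\w+` as "a word" is a simplifying assumption that will not hold for every language.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Build a frequency lexicon: lowercased word form => occurrence count.
sub build_lexicon {
    my ($text) = @_;
    my %lex;
    $lex{ lc $1 }++ while $text =~ /(\w+)/g;
    return \%lex;
}

# Wrap each whole-word occurrence of $token in [[ ]], ignoring case.
# \Q...\E keeps any regex metacharacters in $token literal.
sub highlight {
    my ($text, $token) = @_;
    (my $copy = $text) =~ s/\b(\Q$token\E)\b/[[${1}]]/gi;
    return $copy;
}

# Map a byte string from ISO-8859-1 to UTF-8.
sub latin1_to_utf8 {
    my ($bytes) = @_;
    return encode('UTF-8', decode('ISO-8859-1', $bytes));
}

my $text = 'The cat saw the other cat';
my $lex  = build_lexicon($text);
print "$_: $lex->{$_}\n" for sort keys %$lex;
print highlight($text, 'cat'), "\n";
```

A hash keyed on the word form is usually all a small lexicon needs; anything larger probably wants a tied hash or a database behind it.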

I'm sorry I can't give you any more detailed pointers or advice, but I hope this helps a little.
