PerlMonks  

Re: The (futile?) quest for an automatic paraphrase engine

by andyf (Pilgrim)
on May 17, 2004 at 06:00 UTC ( [id://353875]=note )


in reply to The (futile?) quest for an automatic paraphrase engine

If you want general-purpose NLP (GPNLP), it's a _hard_ problem (as the above posts say). Instead, you could exploit the commonality in the output format and recognise that you are looking for a far smaller sentence subset than a general-purpose NLP system has to handle. That way you have a hope of doing it in practice. The method will always be brittle, of course, but as you say, you have module Carbon::Life::Mammal::Human to help postprocess.
1) You are only trying to parse _relationships_.
2) Each relationship you are looking for is either an ISA or a HASA relationship.
3) All final relationships are of the binary form x R y, where R is the relationship between x and y.
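To make the x R y form concrete, here is a minimal sketch in Python. The `Relation` type, the `by_subject` helper and the sample facts are all made up for illustration; they are one possible way to hold the data, not a prescribed design.

```python
from collections import defaultdict
from typing import NamedTuple


class Relation(NamedTuple):
    """One binary relationship of the form x R y (hypothetical type)."""
    subject: str  # x
    kind: str     # R, restricted here to "ISA" or "HASA"
    obj: str      # y


# Illustrative facts only; a real run would fill this from parsed text.
facts = {
    Relation("Seoul", "ISA", "city"),
    Relation("Seoul", "ISA", "capital of Korea"),
    Relation("Seoul", "HASA", "population over 10.2M"),
}


def by_subject(relations):
    """Group relationships by entity, so one entity can carry many of them."""
    index = defaultdict(set)
    for rel in relations:
        index[rel.subject].add((rel.kind, rel.obj))
    return index


index = by_subject(facts)
for kind, obj in sorted(index["Seoul"]):
    print("Seoul", kind, obj)
```

Storing the facts as a set of triples means an entity with several relationships needs no special handling: each one is just another triple.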
My 'heuristic beard stroking algorithm' would be:
1) Partition the whole token set into Entities and Relationships. Do this by pulling out all the proper nouns to start with.
2) Find and deconstruct the non-trivial compound entities to remove qualifiers and break open sets such as 'Three other cities, x, y and z'.
3) Apply simple set math to determine the membership of each entity for each relationship.
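The three steps above can be sketched roughly as follows. This is a toy under loud assumptions: "proper noun" is approximated by "capitalised and not the first word", the `QUALIFIERS` stop list is invented, and the sample phrase and city names are example input only. All function names are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical stop list of qualifiers to strip from a compound's head.
QUALIFIERS = {"three", "other", "the", "some"}


def proper_nouns(sentence):
    """Step 1 (crude): capitalised words, skipping the sentence-initial one."""
    words = re.findall(r"[A-Za-z]+", sentence)
    return {w for w in words[1:] if w[0].isupper()}


def split_compound(phrase):
    """Step 2a: break 'head, x, y and z' into the head and its members."""
    head, _, rest = phrase.partition(",")
    members = [m.strip() for m in re.split(r",|\band\b", rest) if m.strip()]
    return head.strip(), members


def strip_qualifiers(head):
    """Step 2b: drop qualifier words, leaving the bare category."""
    return " ".join(w for w in head.split() if w.lower() not in QUALIFIERS)


phrase = "Three other cities, Busan, Incheon and Daegu"  # example input
names = proper_nouns(phrase)
head, members = split_compound(phrase)
category = strip_qualifiers(head)

# Step 3: one set of entities per relationship, then plain set operations.
membership = defaultdict(set)
for member in members:
    if member in names:
        membership[("ISA", category)].add(member)

# Facts added by hand to show an entity that sits in two relationships.
membership[("ISA", "cities")].add("Seoul")
membership[("ISA", "capital")].add("Seoul")

both = membership[("ISA", "cities")] & membership[("ISA", "capital")]
print(sorted(both))
```

The capitalisation test will misfire on sentence-initial names and on multi-word names, which is exactly the kind of brittleness the human postprocessing step is there to catch.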
The biggest challenge you might have is moving from n->1 to n->n relationships. It's easy if everything has just one relationship, but Seoul being both Korea's capital and a city with a >10.2M population is the stumbler imho. Don't forget to account for unary attributes (Seoul is rainy), which don't involve another entity. As you say you have looked at some NLP, go back and read, read, read, and there will be an answer lurking in there somewhere. Just don't try to generalise the problem too much or it will explode. The best way to practical NLP is to cheat. :) Good luck,
Andy