PerlMonks  

Re^2: Extracting structured data from unstructured text - just how difficult would this be?

by clinton (Priest)
on Feb 21, 2008 at 16:51 UTC ( [id://669314]=note )


in reply to Re: Extracting structured data from unstructured text - just how difficult would this be?
in thread Extracting structured data from unstructured text - just how difficult would this be?

I was thinking about something along exactly these lines, so we may just be two talking @$$'$

What would be interesting is looking for "contextual words" to resolve ambiguity: does May refer to the month or to a daughter, and is London a place or part of the name Jack London? It would be impossible to predict all of these ambiguities in advance, so the "training" approach makes a lot of sense to me.
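
To make the "contextual words" idea concrete, here is a minimal sketch in Python. It is only one naive way this could work, not anything from the parent post: the class name `ContextGuesser`, the three-word window, and the training sentences are all made up for illustration. It counts the words seen near an ambiguous term in hand-labelled examples, then picks the label whose training context overlaps most with the new text.

```python
from collections import Counter, defaultdict

PUNCT = ".,;:!?\"'()"

def tokenize(text):
    """Lowercase the text and split it into words with punctuation trimmed."""
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]

class ContextGuesser:
    """Hypothetical helper: guesses the sense of a term from nearby words."""

    def __init__(self, window=3):
        self.window = window  # assumed context size, words on each side
        # term -> label -> counts of context words seen in training
        self.counts = defaultdict(lambda: defaultdict(Counter))

    def _context(self, words, term):
        """Collect the words within `window` positions of each occurrence."""
        ctx = []
        for i, w in enumerate(words):
            if w == term:
                start = max(0, i - self.window)
                ctx.extend(words[start:i])
                ctx.extend(words[i + 1:i + 1 + self.window])
        return ctx

    def train(self, term, label, text):
        """Record the context words of `term` in a hand-labelled example."""
        term = term.lower()
        self.counts[term][label].update(self._context(tokenize(text), term))

    def guess(self, term, text):
        """Return the best-scoring label, or None if no context word matches."""
        term = term.lower()
        ctx = self._context(tokenize(text), term)
        labels = self.counts.get(term, {})
        scores = {label: sum(seen[w] for w in ctx) for label, seen in labels.items()}
        if not scores or max(scores.values()) == 0:
            return None
        return max(scores, key=scores.get)

guesser = ContextGuesser()
guesser.train("May", "month", "The meeting is on 12 May 2008 in the morning")
guesser.train("May", "person", "My daughter May started school")
guesser.train("London", "place", "We flew to London last week")
guesser.train("London", "person", "A novel written by Jack London")

print(guesser.guess("May", "She was born on 3 May 1990"))
print(guesser.guess("London", "Books by Jack London"))
```

Returning `None` when nothing matches fits the "adds value when we can extract it" approach: with no evidence either way, the term is simply left unclassified rather than guessed at.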

Of course, you will never achieve 100% accuracy but I don't think you want to.

Absolutely correct - we don't depend on this data; it just adds value when we can extract it.

Thanks for the input.

Clint
