Re: Fuzzy matching of text strings

by ww (Archbishop)
on Dec 14, 2005 at 14:35 UTC


in reply to Fuzzy matching of text strings

The advice above is good, but it may be profitable (that is, more efficient) to normalize the capitalization before dealing with the more complex fuzzy-matching needs found in item 3.

I would be seriously inclined to see whether lc'ing everything, and then uc'ing the first letter of each word, minimizes the work.
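
A minimal sketch of that idea, assuming "words" are simply runs of non-whitespace characters. The name normalize_caps is hypothetical (it is not from the original question), and the sample strings are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: lowercase each whitespace-delimited word,
# then uppercase its first character. Whitespace is left untouched.
sub normalize_caps {
    my ($text) = @_;
    # \L lowercases the whole captured word, \u then uppercases its first char.
    (my $out = $text) =~ s/(\S+)/\u\L$1/g;
    return $out;
}

print normalize_caps('JOHN q. PUBLIC'), "\n";   # John Q. Public
```

Matching on \S+ rather than \w+ keeps an apostrophe inside a word from starting a new "word", so "don't" becomes "Don't" rather than "Don'T". A word that begins with punctuation, such as "(something)", is only lowercased, because its first character has no uppercase form.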

However, this scheme is suggested on the basis of one snippet of your data. If you have to distinguish between Mr. MacHinery and (something) Machinery, *OR* if capitalization of the output needs to be not only consistent but also "correct" -- for unknown values of correct -- you will need something far better than this simple-minded scheme.
