PerlMonks  

Re^2: Fuzzy matching of text strings

by srdst13 (Pilgrim)
on Dec 14, 2005 at 18:11 UTC ( [id://516714]=note )


in reply to Re: Fuzzy matching of text strings
in thread Fuzzy matching of text strings

Thanks, all, for the answers. Finding the similarity between two strings is one component of my problem, and probably the largest. There is a second component, though: finding groups of "matches". I guess I could score all possible pairs, build a graph-like structure that connects each pair of "matches", and then look for connected components or some such thing. Any thoughts on this second part of the problem? There are any number of ways to do it in practice (Graph.pm or even SQL could probably handle it), but it would be great to hear thoughts on the issue.

Thanks again,
Sean
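
A minimal sketch of the pairs-then-components idea from the post above, written in Python for brevity. Everything named here is an assumption for illustration: `similarity` uses `difflib.SequenceMatcher` as a stand-in for whichever fuzzy metric is eventually chosen, the 0.8 threshold is arbitrary, and a small union-find replaces an explicit graph library.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Stand-in score in [0, 1]; swap in whatever metric you settle on."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def group_matches(strings, threshold=0.8):
    """Group strings into connected components of the "match" graph.

    Each string is a node and each pair scoring >= threshold is an edge.
    The threshold default is an arbitrary illustrative value.
    """
    parent = list(range(len(strings)))  # union-find forest, one tree per group

    def find(i):
        # Walk up to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare all possible pairs; this is O(n^2) similarity calls.
    for i, j in combinations(range(len(strings)), 2):
        if similarity(strings[i], strings[j]) >= threshold:
            parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i, s in enumerate(strings):
        groups.setdefault(find(i), []).append(s)
    return list(groups.values())


if __name__ == "__main__":
    names = ["John Smith", "Jon Smith", "John Smyth",
             "Acme Corp", "Acme Corp.", "Zebra"]
    for group in group_matches(names):
        print(group)
```

Note that connected components chain matches together: if A matches B and B matches C, all three land in one group even when A and C score below the threshold against each other. Whether that is desirable depends on the data.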

Replies are listed 'Best First'.
Re^3: Fuzzy matching of text strings
by ruoso (Curate) on Dec 16, 2005 at 17:15 UTC
    In fact, I developed each of the test subroutines based on the results of comparisons run on a subset of the data. I kept creating new tests and writing A, B and the comparison score to a CSV file. I stopped when I had both a workable cutoff score and few false positives and false negatives. I think you could do it the same way: nothing very sophisticated is needed, just a subset of the database and many runs, improving the tests you use each time.
    daniel
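
The tuning loop described in the reply above might be sketched as follows. This is only an illustration under stated assumptions: the actual test subroutines are not shown in the thread, so `score` is a hypothetical stand-in built on `difflib.SequenceMatcher`, and the hand-labelled `is_match` flags are an assumed way of recording which pairs should count as matches.

```python
import csv
import sys
from difflib import SequenceMatcher


def score(a, b):
    """Hypothetical stand-in for the combined result of the test subroutines."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def write_scores(pairs, out):
    """Write one 'A,B,score' row per pair so the scores can be inspected by eye."""
    writer = csv.writer(out)
    writer.writerow(["A", "B", "score"])
    for a, b in pairs:
        writer.writerow([a, b, f"{score(a, b):.3f}"])


def count_errors(labelled_pairs, threshold):
    """Return (false_positives, false_negatives) at the given cutoff.

    labelled_pairs holds (a, b, is_match) tuples, where is_match is a
    hand-made judgement on the sample subset (an assumption of this sketch).
    """
    false_pos = false_neg = 0
    for a, b, is_match in labelled_pairs:
        predicted = score(a, b) >= threshold
        if predicted and not is_match:
            false_pos += 1
        elif is_match and not predicted:
            false_neg += 1
    return false_pos, false_neg


if __name__ == "__main__":
    sample = [
        ("John Smith", "Jon Smith", True),
        ("Acme Corp", "Acme Corp.", True),
        ("John Smith", "Acme Corp", False),
    ]
    write_scores([(a, b) for a, b, _ in sample], sys.stdout)
    # Try several cutoffs and keep the one with the fewest errors.
    for threshold in (0.5, 0.8, 0.99):
        print(threshold, count_errors(sample, threshold))
```

Each run over the subset gives a CSV to read through and an error count per cutoff. Changing `score` and rerunning is the "many runs" part of the process.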
