Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl Monk, Perl Meditation
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??
BrowserUk,
When you say "Given an address as input, find any "similar" addresses in the DB" your not trying to (for example) locate next door neighbours, but rather locate duplicates with minor typos or transcription errors?

The latter (typos and transcription errors) with a caveat. If the input is '123 Main St' and the DB has a record of '356 Main Street', I would hope that the search wouldn't return "no results found". On the other hand, if the input was '12 Elm Ave' (two houses over), then the search should definately not find a match. Ultimately, I would like to return the best N matches ordered by degree of similarity.

Thanks for the pointer to Re^3: Comparing text documents. If I get real motivated, I will try it out.

Cheers - L~R


In reply to Re^2: Efficient Fuzzy Matching Of An Address by Limbic~Region
in thread Efficient Fuzzy Matching Of An Address by Limbic~Region

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others having a coffee break in the Monastery: (5)
As of 2024-03-29 07:46 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found