in reply to De Duping Street Addresses Fuzzily
Identifying typos is a hard problem. You might want to try Text::Levenshtein for it, though I have not tested it on address data, so I really do not know how well it works. The module implements the Levenshtein edit distance, a measure of how close two strings are to each other. The distance is the number of substitutions, deletions or insertions (edits) needed to transform one string into the other (the count is the same in either direction). Of course, this is only useful after you have cleansed the data (fifth => 5th, etc.).
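To make the idea concrete, here is a rough sketch of "cleanse first, then compare". It is untested against real address data. The `normalize` routine, the `%canon` table, the sample addresses and the `$max_edits` threshold are all made up for illustration, and `edit_distance` is a hand-rolled stand-in so the snippet runs without CPAN. If Text::Levenshtein is installed, its `distance` function should be able to take its place.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hand-rolled Levenshtein distance (stand-in for Text::Levenshtein).
# Counts the substitutions, deletions and insertions needed to turn $s into $t.
sub edit_distance {
    my ($s, $t) = @_;
    my @prev = (0 .. length $t);          # distances from the empty prefix of $s
    for my $i (1 .. length $s) {
        my @cur = ($i);
        for my $j (1 .. length $t) {
            my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
            push @cur, min(
                $prev[$j] + 1,            # deletion
                $cur[$j - 1] + 1,         # insertion
                $prev[$j - 1] + $cost,    # substitution (or match)
            );
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Hypothetical cleansing table; a real one would be much longer.
my %canon = (fifth => '5th', street => 'st', avenue => 'ave');

# Hypothetical cleansing step: lowercase, drop punctuation,
# squeeze whitespace, then map known words to one canonical form.
sub normalize {
    my ($addr) = @_;
    $addr = lc $addr;
    $addr =~ s/[.,]//g;
    $addr =~ s/\s+/ /g;
    $addr =~ s/^ | $//g;
    $addr =~ s/\b(\w+)\b/exists $canon{$1} ? $canon{$1} : $1/ge;
    return $addr;
}

my $max_edits = 2;    # assumed threshold; tune it against your own data

my @pairs = (
    [ '123 Fifth Street', '123 5th St.' ],
    [ '42 Main Street',   '42 Mian Street' ],
);

for my $pair (@pairs) {
    my ($x, $y) = map { normalize($_) } @$pair;
    my $d = edit_distance($x, $y);
    printf "%-12s | %-12s | distance %d%s\n",
        $x, $y, $d, $d <= $max_edits ? ' (possible duplicate)' : '';
}
```

Note that a swapped pair of letters (Main vs. Mian) costs two edits under plain Levenshtein, so a threshold of 1 would miss it. A threshold that is too generous will start merging genuinely different addresses, so treat the matches as candidates for review rather than as certain duplicates.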