laziness, impatience, and hubris | |
PerlMonks |
comment on |
( [id://3333]=superdoc: print w/replies, xml ) | Need Help?? |
When i was looking at the postal regulations for my periodicals license for The Perl Review, I discovered that there is a whole industry that does just this: give them a list and they give it back to you with duplicates removed and addresses cleaned-up. This service comes as a true service (someone else does the work) or off-the-shelf software. Depending on how much work you have to do and how much it is worth to the company, you might want to skip doing this yourself. However, if you program it yourself, part of the solution (along with what people have already mentioned) is getting the canonical address. People will often give you their version of their address (for instance, I can't remember if my street is a Road or an Avenue, so I use them interchangeably). The US Postal Service has all sorts of data and tools to help you figure out what it should be based on the zip code. Other post offices have similar things (the Royal Mail address lookup kicks ass). From there you get closer to finding the duplicates than just looking at the address the customer gave you or someone keyed in. Good luck and let us know how it turns out.
-- brian d foy <bdfoy@cpan.org> In reply to Re: De Duping Street Addresses Fuzzily
by brian_d_foy
|
|