http://qs321.pair.com?node_id=1198136


in reply to Find duplicate based on specific fields while allowing 2 mismatch

Can you please clarify your second requirement? The relation "at most two mismatches" is not transitive. For example, there are two mismatches between AAAA and AAGG, and two mismatches between AAGG and TTGG, but four mismatches between AAAA and TTGG. Would you consider all three to belong to one cluster?
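
To make the question concrete, here is a minimal Python sketch of one possible reading. It is my own illustration, not code from the original post. The helper names `hamming` and `single_linkage` are hypothetical, and I am assuming equal-length strings and a threshold of two mismatches. Under this "chained" interpretation, records are grouped whenever a path of near matches connects them:

```python
from itertools import combinations

def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))

def single_linkage(seqs, max_mismatch=2):
    """Group sequences connected through a chain of near matches.

    Assumption: two sequences are linked when they differ in at most
    max_mismatch positions; clusters are the connected components.
    """
    parent = {s: s for s in seqs}

    def find(s):
        # Follow parent links to the cluster representative,
        # shortening the path along the way.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in combinations(seqs, 2):
        if hamming(a, b) <= max_mismatch:
            parent[find(a)] = find(b)

    clusters = {}
    for s in seqs:
        clusters.setdefault(find(s), []).append(s)
    return sorted(clusters.values())

seqs = ["AAAA", "AAGG", "TTGG"]
for a, b in combinations(seqs, 2):
    print(a, b, hamming(a, b))
print(single_linkage(seqs))
```

With this reading, AAAA and TTGG end up in the same cluster even though they differ in all four positions, because AAGG links them. If every pair inside a cluster must be within two mismatches, a different grouping rule is needed, and the result may then depend on the order in which records are processed.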