http://qs321.pair.com?node_id=1198138


in reply to Re: Find duplicate based on specific fields while allowing 2 mismatch
in thread Find duplicate based on specific fields while allowing 2 mismatch

I think I need to take the first entry as the reference and allow up to two mismatches against the first line's UMI tag to form one cluster. The remaining lines at the same start position can then be looped over again in the same way. So, if the first line has AAAA, then AAGG, TTAA, etc. can be merged into a single cluster, but TTGG will form a separate cluster. I have edited the question accordingly! Amit
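
To make the idea concrete, here is a minimal sketch of the clustering described above, written in Python purely for illustration. The function names `hamming` and `cluster_umis` are made up for this example, and it assumes the input is just the list of UMI tags for reads sharing one start position, all of the same length.

```python
def hamming(a, b):
    """Count mismatching positions (assumes equal-length tags)."""
    return sum(x != y for x, y in zip(a, b))


def cluster_umis(umis, max_mismatch=2):
    """Greedy clustering: the first unassigned tag is the reference."""
    remaining = list(umis)
    clusters = []
    while remaining:
        ref = remaining[0]
        # Tags within max_mismatch of the reference join its cluster
        # (the reference itself has distance 0, so it is always included).
        cluster = [u for u in remaining if hamming(ref, u) <= max_mismatch]
        # Everything else is looped over again with a new reference.
        remaining = [u for u in remaining if hamming(ref, u) > max_mismatch]
        clusters.append(cluster)
    return clusters


clusters = cluster_umis(["AAAA", "AAGG", "TTAA", "TTGG"])
print(clusters)  # [['AAAA', 'AAGG', 'TTAA'], ['TTGG']]
```

Note that with this approach the result depends on the order of the lines: AAGG and TTAA differ from each other at four positions, yet both land in the AAAA cluster because each is compared only with the reference.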