I vote for:
1. De-duping the existing data,
2. Discovering how the dupe data got in there in the first place, and
3. Fixing whatever #2 turns up so it never happens again.
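Step 1 can often be done in plain SQL. A minimal sketch, assuming a hypothetical `transactions` table where (`client_id`, `order_ref`, `amount`) should uniquely identify a row (adjust the key columns to whatever actually defines a duplicate in your data):

```python
import sqlite3

# Hypothetical schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions "
    "(id INTEGER PRIMARY KEY, client_id INT, order_ref TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions (client_id, order_ref, amount) VALUES (?, ?, ?)",
    [(1, "A100", 9.99), (1, "A100", 9.99), (2, "B200", 5.00)],  # one dupe
)

# De-dupe: keep the earliest row (lowest id) in each duplicate group,
# delete the rest.
conn.execute("""
    DELETE FROM transactions
    WHERE id NOT IN (
        SELECT MIN(id) FROM transactions
        GROUP BY client_id, order_ref, amount
    )
""")
conn.commit()

remaining = conn.execute("SELECT COUNT(*) FROM transactions").fetchone()[0]
print(remaining)  # 2 — the duplicate row is gone
```

For step 3, once the data is clean, a `UNIQUE` index on the same key columns makes the database itself reject future dupes instead of relying on application code to behave.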
Depending on your application now and in the future, duplicate records could become a real nightmare. It sounds like you've reasoned it through and the dupes don't represent a problem, so maybe I'm off base. But after watching 100,000+ duplicate transactions get dumped into a multi-client e-commerce database by renegade code from a know-it-all coder after an unscripted production install... I guess I have issues....