Re: Find duplicate values in hash

by Nkuvu (Priest)
on Apr 10, 2009

in reply to Find duplicate values in hash

I'd keep track of the keys while you're assigning to the definition hash. That is, keep a separate hash, keyed by definition, that lists every key sharing that definition; any definition with more than one key is a duplicate. There is probably a more efficient way to do this, but here's the result of some random puttering around while I'm waiting for my work script to finish:

#!/usr/bin/perl
use strict;
use warnings;

my (%hash, %dup_hash);

# Minor tweak to read from DATA rather than a file
# (each DATA line is English<TAB>German)
while (my $line = <DATA>) {
    chomp($line);
    my ($enu, $deu) = split /\t/, $line;
    $hash{$enu} = $deu;

    # Keep a list of all keys per definition, so duplicates can be found later
    push @{$dup_hash{$deu}}, $enu;
}

for my $key (keys %hash) {
    print "$key\n";
}

for my $value (values %hash) {
    print "$value\n";
}

print "\nDuplicate definitions:\n";
for my $deu (keys %dup_hash) {
    # More than one key for this definition means it is duplicated
    if (scalar @{$dup_hash{$deu}} > 1) {
        for my $en (@{$dup_hash{$deu}}) {
            print "$deu => $en\n";
        }
        print "\n";
    }
}

__DATA__
Retire a document	Dokument deaktivieren
Remove a document from the knowledge base	Dokument aus der Knowledge Base entfernen
Promote document retirement	Dokument deaktivieren
Document Expired	Dokument abgelaufen

Gives the output:

Remove a document from the knowledge base
Document Expired
Promote document retirement
Retire a document
Dokument aus der Knowledge Base entfernen
Dokument abgelaufen
Dokument deaktivieren
Dokument deaktivieren

Duplicate definitions:
Dokument deaktivieren => Retire a document
Dokument deaktivieren => Promote document retirement
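If the hash is already populated and you can't hook into the loop that fills it, the same reverse index can be built after the fact. This is only a minimal sketch under that assumption: the sample data and the name %by_value are made up for illustration, and the keys are sorted purely to make the output order predictable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data standing in for an already-populated hash
my %hash = (
    'Retire a document'           => 'Dokument deaktivieren',
    'Promote document retirement' => 'Dokument deaktivieren',
    'Document Expired'            => 'Dokument abgelaufen',
);

# Build the reverse index: definition => [ every key with that definition ]
my %by_value;
for my $key (sort keys %hash) {
    push @{ $by_value{ $hash{$key} } }, $key;
}

# Report only the definitions shared by more than one key
for my $deu (grep { @{ $by_value{$_} } > 1 } sort keys %by_value) {
    print "$deu => $_\n" for @{ $by_value{$deu} };
}
```

With the sample data above, this prints the two "Dokument deaktivieren" lines (for "Promote document retirement" and "Retire a document") and nothing for "Dokument abgelaufen". It costs an extra pass over the hash, so tracking the keys during assignment is still the cheaper option when you control that loop.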

Edit: Renamed some of the variables to accurately reflect their contents.

Second edit: Pretty much the same thing as what JavaFan has, just a different syntactic approach.

