http://qs321.pair.com?node_id=515509


in reply to Building "islands" of related data

OK, so if performance and speed are not a concern, how about this:
    #!/usr/bin/perl -w
    use strict;
    use Data::Dumper;
    use Digest::MD5 qw(md5_hex);

    # I want to group domains together if they share at least two of those values.
    # The first two entries match, the third does not.
    my %domains = (
        'foo.com' => {
            phone   => '1234',
            company => 'big co',
            contact => 'john',
            address => 'cool street',
            fax     => '5678',
            email   => 'john@bigco.com',
        },
        'bar.com' => {
            phone   => '2222',
            company => 'small co',
            contact => 'jeff',
            address => 'bad street',
            fax     => '5678',
            email   => 'john@bigco.com',
        },
        'baz.com' => {
            phone   => '9999',
            company => 'another co',
            contact => 'judy',
            address => 'nasty street',
            fax     => '8888',
            email   => 'frank@anotherco.com',
        },
    );

    # Build permutation hash
    foreach my $domain (keys %domains) {
        # Snapshot the attribute names first, so the 'hash' key added below
        # is never picked up by the loops.
        my @attrs = keys %{ $domains{$domain} };

        # Iterate over all the elements in the domain hash
        foreach my $o_attr (@attrs) {
            # As we're creating hashes of each permutation we need to iterate
            # over the same elements
            foreach my $i_attr (@attrs) {
                # A value paired with itself says nothing about two shared values
                next if $o_attr eq $i_attr;
                $domains{$domain}{hash}{$o_attr}{$i_attr} =
                    md5_hex($domains{$domain}{$o_attr} . $domains{$domain}{$i_attr});
            }
        }
    }

    # Pass a reference so Dumper prints one nested structure
    print Dumper(\%domains);

You can then loop over all the permutations looking for hashes that match!
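
A rough sketch of that matching loop, in case it helps. It is self-contained and uses a trimmed copy of the sample data. The helper name find_related is made up for illustration, and it departs from the code above in two ways that are my own assumptions: each unordered pair is hashed once, and the attribute names plus a "\0" separator go into the digest so that, say, a phone number equal to someone else's fax does not count as a match.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Sample data trimmed from the post above (three fields are enough here).
my %domains = (
    'foo.com' => { phone => '1234', fax => '5678', email => 'john@bigco.com' },
    'bar.com' => { phone => '2222', fax => '5678', email => 'john@bigco.com' },
    'baz.com' => { phone => '9999', fax => '8888', email => 'frank@anotherco.com' },
);

# find_related() is a hypothetical helper. It records which domains produced
# each pair digest, then links the domains that share at least one digest.
sub find_related {
    my ($data) = @_;

    my %seen;    # digest => { domain => 1 }
    for my $domain (keys %$data) {
        my @attrs = sort keys %{ $data->{$domain} };
        for my $i (0 .. $#attrs) {
            for my $j ($i + 1 .. $#attrs) {
                my ($first, $second) = @attrs[$i, $j];
                # Names and a separator are included (an assumption, see above)
                my $digest = md5_hex(join "\0",
                    $first,  $data->{$domain}{$first},
                    $second, $data->{$domain}{$second});
                $seen{$digest}{$domain} = 1;
            }
        }
    }

    my %related;    # domain => { other domain => 1 }
    for my $members (values %seen) {
        my @group = keys %$members;
        next if @group < 2;    # digest seen for one domain only
        for my $domain (@group) {
            $related{$domain}{$_} = 1 for grep { $_ ne $domain } @group;
        }
    }
    return \%related;
}

my $related = find_related(\%domains);
for my $domain (sort keys %$related) {
    print "$domain is related to: ",
        join(', ', sort keys %{ $related->{$domain} }), "\n";
}
# prints:
# bar.com is related to: foo.com
# foo.com is related to: bar.com
```

This only links domains that match directly. To get the full "islands" you would still have to follow the links transitively (foo matches bar, bar matches qux, so all three belong together).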