http://qs321.pair.com?node_id=1227058


in reply to multiple hash compare, find, create

I don't know how well this would scale on 14MB sets, but this might be a possible solution using Set::Scalar.

Yes, Eily had the most efficient solution. There's no need to create the 3 sets :-(

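use Set::Scalar;

my %hash_4;

# One set per input hash, built from its keys
# (%hash_1, %hash_2 and %hash_3 come from the original question).
my $set_1 = Set::Scalar->new(keys %hash_1);
my $set_2 = Set::Scalar->new(keys %hash_2);
my $set_3 = Set::Scalar->new(keys %hash_3);

# '*' is Set::Scalar's overloaded intersection operator.
my $intersect = $set_1 * $set_2 * $set_3;

# For every key common to all three hashes, collect the three values.
for my $key (@$intersect) {
    push @{$hash_4{$key}}, $hash_1{$key}, $hash_2{$key}, $hash_3{$key};
}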
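The remark above that the three sets are not needed can be illustrated with plain hashes. Eily's post is not reproduced in this reply, so what follows is only a minimal sketch of one set-free approach, not necessarily Eily's code. The helper name `intersect_hashes` and the toy data are hypothetical, chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): takes hash references and
# returns a hash whose keys occur in every input hash, each mapped to an
# array ref holding the values in argument order.
sub intersect_hashes {
    my @hashes = @_;
    return () unless @hashes;

    # Walk the smallest hash so the fewest keys need to be tested.
    my ($smallest) =
        sort { scalar(keys %$a) <=> scalar(keys %$b) } @hashes;

    my %common;
    KEY: for my $key (keys %$smallest) {
        for my $h (@hashes) {
            next KEY unless exists $h->{$key};
        }
        $common{$key} = [ map { $_->{$key} } @hashes ];
    }
    return %common;
}

# Toy data standing in for %hash_1, %hash_2 and %hash_3.
my %hash_1 = (a => 1,   b => 2,   c => 3);
my %hash_2 = (b => 20,  c => 30,  d => 40);
my %hash_3 = (c => 300, b => 200, e => 500);

my %hash_4 = intersect_hashes(\%hash_1, \%hash_2, \%hash_3);

for my $key (sort keys %hash_4) {
    print "$key @{ $hash_4{$key} }\n";
}
# prints:
# b 2 20 200
# c 3 30 300
```

With hashes of tens of millions of keys, returning the result as a list copies it once. Filling a hash reference passed in by the caller would avoid that copy, at the cost of a slightly less tidy interface.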

Re^2: multiple hash compare, find, create
by supertaco (Novice) on Dec 10, 2018 at 19:38 UTC

    Hi Cristoforo. I'm trying it anyway :) I appreciate your help! Will look at Eily's solution, too.

    [root@WarGame2 BRRX]# time ./a.pl
    26662441 xre hash size -- 26662441 xre hash keys
    14700586 rdk hash size -- 14700586 rdk hash keys
    23405918 bid hash size -- 23405918 bid hash keys
    doing the set scalar stuff
Re^2: multiple hash compare, find, create
by supertaco (Novice) on Dec 11, 2018 at 13:26 UTC

    Hey Cristoforo, I implemented your solution (without the strikethroughs :)) and it worked like a dream and creates some flexibility for future extensions. I will look into implementing the other solution as I have time. But thanks again!

    print "doing the set scalar stuff\n";
    use Set::Scalar;

    my %output;

    # One set per hash, built from its keys.
    $ridxre = Set::Scalar->new(keys %ridxre);
    $ridrdk = Set::Scalar->new(keys %ridrdk);
    $ridbid = Set::Scalar->new(keys %ridbid);

    # Keys present in all three hashes.
    $intersect = $ridxre * $ridrdk * $ridbid;

    for $key (@$intersect) {
        push @{$output{$key}}, $ridxre{$key}, $ridrdk{$key}, $ridbid{$key};
    }

    print "writing OUT.txt file\n";
    open(OUT, ">", $outfile1) || die("cannot open $outfile1: $!\n");

    # One line per common key: the key, then its three values.
    while ( ($k, $v) = each %output ) {
        print OUT "$k ";
        print OUT "$_ " for @{ $output{$k} };
        print OUT "\n";
    }
    close(OUT);