Re: Extracting common keys present in multiple files

Here is a way to do it using two hashes, %n_count to keep track of the number of times each name appears, and %f_count to keep track of the files in which each occurs.

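my (%f_count,%n_count);

for my $file (@files) {
    open( my $fh, '<', $file ) or die "Couldn't read '$file': $!";

    while (my $line = <$fh>) {
        chomp $line;
        next unless length $line;                 # skip empty lines only (a bare "0" is kept)
        my ($name,$number) = split(',',$line);
        $n_count{$name}++;                        # total occurrences of this name
        $f_count{$name}{$file}++;                 # occurrences of this name per file
    }
    close $fh;
}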
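To see what the two hashes end up holding, here is a minimal sketch that runs the same loop over in-memory strings instead of real files. The file names and contents in %content are made up for illustration; everything else follows the loop above.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for two files on disk.
my %content = (
    'one.csv' => "alice,1\nbob,2\nalice,3\n",
    'two.csv' => "alice,4\n",
);

my (%f_count, %n_count);

for my $file (sort keys %content) {
    # Open a read handle on the string instead of a real file.
    open( my $fh, '<', \$content{$file} ) or die "Couldn't read '$file': $!";

    while ( my $line = <$fh> ) {
        chomp $line;
        next unless length $line;
        my ($name) = split /,/, $line;
        $n_count{$name}++;           # total occurrences of the name
        $f_count{$name}{$file}++;    # occurrences of the name per file
    }
    close $fh;
}

# Show, for each name, the total count and the files it was seen in.
for my $name (sort keys %n_count) {
    my @seen_in = sort keys %{ $f_count{$name} };
    printf "%s: %d line(s), seen in %s\n",
        $name, $n_count{$name}, join(', ', @seen_in);
}
```

So %n_count is a flat name-to-count hash, while %f_count is a hash of hashes keyed first by name and then by file.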

Then you need to extract the names: first get the names that appear at least 25 times according to %n_count, then, for each of those candidates, keep only the ones that appear in every file.

my $num_of_files = scalar @files;
my $min = 25;
my @candidates = grep { $n_count{$_} >= $min } keys %n_count;

for my $name (@candidates) {
    my $in_files = scalar keys %{ $f_count{$name} };
    next unless $in_files == $num_of_files;
    print "$name\n";
}

The only tricky bit here is the expression scalar keys %{ $f_count{$name} }. $f_count{$name} is a hash reference whose keys are the file names in which that name was seen. Dereferencing it with %{...} gives us the underlying hash, and keys in scalar context returns how many keys it has. If that count equals the number of files, then the name is in every file.
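A small sketch of that dereference on its own may help. The data and the helper name files_containing are hypothetical, invented just for this example; the `|| {}` is an extra guard I have added so that asking about an unknown name simply gives 0.

```perl
use strict;
use warnings;

# Hypothetical contents of %f_count after reading three files.
my %f_count = (
    alice => { 'a.csv' => 12, 'b.csv' => 9, 'c.csv' => 7 },
    bob   => { 'a.csv' => 30 },
);

# Hypothetical helper: how many distinct files contain $name?
sub files_containing {
    my ($counts, $name) = @_;
    # $counts->{$name} is a hash reference; %{ ... } dereferences it,
    # and keys in scalar context gives the number of keys.
    return scalar keys %{ $counts->{$name} || {} };
}

for my $name (sort keys %f_count) {
    printf "%s appears in %d file(s)\n",
        $name, files_containing( \%f_count, $name );
}
```

With three files in total, only alice would pass the $in_files == $num_of_files check.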
