Re: Extracting common keys present in multiple files

by tangent (Parson)
on Oct 27, 2015 at 20:29 UTC


in reply to Extracting common keys present in multiple files

Here is a way to do it using two hashes: %n_count keeps track of the number of times each name appears, and %f_count keeps track of the files in which each name occurs.
my (%f_count, %n_count);
for my $file (@files) {
    open( my $fh, '<', $file ) or die "Couldn't read '$file': $!";
    while (my $line = <$fh>) {
        chomp $line;
        next unless $line;
        my ($name, $number) = split(',', $line);
        $n_count{$name}++;
        $f_count{$name}{$file}++;
    }
}
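To make the two structures concrete, here is a minimal sketch that runs the same counting logic over in-memory lines instead of files. The sub name count_names and the sample data are made up for illustration; they are not part of the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: same counting logic as above, but reading lines
# from a hash (file name => array ref of lines) instead of from disk.
sub count_names {
    my ($data) = @_;
    my (%f_count, %n_count);
    for my $file (sort keys %$data) {
        for my $line (@{ $data->{$file} }) {
            next unless length $line;
            my ($name, $number) = split /,/, $line;
            $n_count{$name}++;            # total occurrences of this name
            $f_count{$name}{$file}++;     # occurrences per file
        }
    }
    return (\%n_count, \%f_count);
}

# Made-up sample data for illustration only.
my %sample = (
    'a.txt' => [ 'foo,1', 'bar,2', 'foo,3' ],
    'b.txt' => [ 'foo,4', 'baz,5' ],
);

my ($n, $f) = count_names(\%sample);
for my $name (sort keys %$n) {
    my $in_files = scalar keys %{ $f->{$name} };
    print "$name: $n->{$name} occurrence(s) in $in_files file(s)\n";
}
```

Each name ends up with one entry in %n_count and one inner hash in %f_count, keyed by file name.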
Then you need to extract the names - first get the names that appear at least 25 times in the %n_count hash, then, for each of those candidate names, get the ones that appear in all files.
my $num_of_files = scalar @files;
my $min = 25;
my @candidates = grep { $n_count{$_} >= $min } keys %n_count;
for my $name (@candidates) {
    my $in_files = scalar keys %{ $f_count{$name} };
    next unless $in_files == $num_of_files;
    print "$name\n";
}
The only tricky bit here is scalar keys %{ $f_count{$name} }.
$f_count{$name} is a hash reference in which each key is a file name. Wrapping it in %{ ... } dereferences it, and keys in scalar context returns how many keys there are. If that count equals the number of files, then that name is in every file.
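The dereference-and-count step can be tried on its own with hand-built hashes. This is only a sketch: the sub name common_names, its argument order, and the sample counts are assumptions made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the names seen at least $min times overall
# and present in every one of $num_of_files files.
sub common_names {
    my ($n_count, $f_count, $num_of_files, $min) = @_;
    my @common;
    for my $name (sort keys %$n_count) {
        next unless $n_count->{$name} >= $min;
        # %{ ... } dereferences the inner hash ref; keys in scalar
        # context gives the number of distinct files for this name.
        my $in_files = scalar keys %{ $f_count->{$name} };
        push @common, $name if $in_files == $num_of_files;
    }
    return @common;
}

# Made-up counts for illustration only.
my %n_count = ( foo => 3, bar => 2, baz => 1 );
my %f_count = (
    foo => { 'a.txt' => 2, 'b.txt' => 1 },
    bar => { 'a.txt' => 2 },
    baz => { 'b.txt' => 1 },
);

# Two files, minimum of two occurrences overall.
print "$_\n" for common_names(\%n_count, \%f_count, 2, 2);
```

With this data only foo is printed: bar meets the minimum count but appears in one file, and baz falls below the minimum.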
