Is gobbling an entire file into an array considered bad form? . . .
One should always be aware of the efficiency concern. If you're sure the file will never be "too big", slurping (as it's called) shouldn't be a problem. Otherwise, you'd do well to read and process the file one record at a time, where practical.
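For example, here is a minimal sketch of the two approaches (the file name and the processing step are placeholders):

# Per-record reading: only one line is held in memory at a time.
open my $fh, '<', 'input.txt' or die "Can't open input.txt: $!";
while ( my $line = <$fh> )
{
    chomp $line;
    # ... process $line here ...
}
close $fh;

# Slurping: the whole file lands in memory at once.
# Fine for small files, risky if the file can grow very large.
open $fh, '<', 'input.txt' or die "Can't open input.txt: $!";
my @lines = <$fh>;
close $fh;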
Calin's solution is good. If you want a little extra efficiency, you can buy it with memory, i.e. with an extra data structure. In the solution below, we maintain a separate hash for those keys that are known to be duplicates, and at the end we iterate only over that hash. This pays off when the number of duplicate keys is significantly smaller than the total number of keys.
use strict;
use warnings;

my( %keys, %dup );

while (<STDIN>)
{
    chomp;
    if ( /PROBABLECAUSE\w*\((\d+),\s*\w*,\s+(\w*)/ )
    {
        my( $id, $key ) = ( $1, $2 );

        if ( exists $dup{$key} )        # already found to be a dup
        {
            push @{ $dup{$key} }, $id;
        }
        elsif ( exists $keys{$key} )    # only seen once before
        {
            # promote: move the first id out of %keys and record both ids in %dup
            push @{ $dup{$key} }, delete($keys{$key}), $id;
        }
        else                            # first time seen
        {
            $keys{$key} = $id;
        }

        # check if any key has init caps (not allowed)
        if ( $key =~ /^[A-Z]\w*/ )
        {
            print "Id: $id - $key\n";
        }
    }
}

print "\nDuplicated keys:\n\n";

for my $key ( keys %dup )
{
    print "Key: $key\n";
    print "\tId: $_\n" for @{ $dup{$key} };
}
(Not tested)
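If you'd rather pass file names on the command line than redirect them into STDIN, swapping <STDIN> for the diamond operator is all it takes; a minimal sketch:

# reads from any files named on the command line, or from STDIN if none are given
while ( my $line = <> )
{
    chomp $line;
    # ... same matching and bookkeeping as above ...
}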