Re: Number of values for each key in hash

There are a few gotchas in your code... let me modify it like this:

use strict;
use warnings;

my %GeneCount = ();

#open the textfile GeneType.txt
open (GENETYPE, "GeneType.txt") or die "Could not open file: '$!'";

my $header = <GENETYPE>;  # read the header before entering the loop

while (<GENETYPE>) {
    chomp;
    my ($GeneName, $GeneType) = split (/\t/, $_);
    $GeneCount{$GeneType}++;
}

for my $type (sort keys %GeneCount) {
    print "$type: $GeneCount{$type}\n";
}

So what did I change?

I started the program with use strict; and use warnings;, which is a good habit and will save a lot of time in the long run. The only downside is that I now have to declare my %GeneCount = () before using it.
In the open statement I included the reason why it failed in the error message. There's also the opportunity to use the three-parameter form of open and a lexical file handle, which I let pass, because your code is correct (but slightly out of fashion).
Instead of dealing with the header on every pass through the loop, I read the header once before entering the loop.
I added chomp, which removes the newline that would otherwise be at the end of every gene type you read.
Most important for your logic: I changed the hash so that the types are the keys and the counts are the values.
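
For comparison, here is a minimal sketch of the same counting logic with a lexical file handle and the three-parameter form of open. The helper name count_gene_types, the sample gene names and the in-memory file handle are assumptions made so the sketch runs without GeneType.txt; they are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: counts how often each gene type occurs.
# Expects a file handle whose first line is a header and whose
# remaining lines look like "name<TAB>type".
sub count_gene_types {
    my ($fh) = @_;
    my %count;
    my $header = <$fh>;               # read the header once, outside the loop
    while (my $line = <$fh>) {
        chomp $line;                  # strip the trailing newline
        my ($name, $type) = split /\t/, $line;
        next unless defined $type;    # skip blank or malformed lines
        $count{$type}++;              # types are the keys, counts the values
    }
    return \%count;
}

# Made-up sample data, read through an in-memory file handle.
my $sample = "Name\tType\n"
           . "geneA\tprotein_coding\n"
           . "geneB\tpseudogene\n"
           . "geneC\tprotein_coding\n";
open my $genetype, '<', \$sample or die "Could not open sample data: '$!'";
my $counts = count_gene_types($genetype);
close $genetype;

for my $type (sort keys %$counts) {
    print "$type: $counts->{$type}\n";
}
```

With a real file you would pass 'GeneType.txt' instead of \$sample; the rest stays the same.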

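I seem to recall that older versions of Perl (I'm using 5.28) issued some warnings about an uninitialized $GeneCount{pseudogene}, although perlsyn lists ++ among the operators that are exempt from that warning. If you do see such warnings, you can get rid of them by adding the line no warnings "uninitialized"; before entering the loop.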
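
If you want to check how your own Perl behaves, a small sketch like the following may help. It collects warnings through $SIG{__WARN__} instead of printing them. The %other hash and $sum are assumptions added purely for contrast and do not appear in the script above.

```perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };  # collect instead of print

my %GeneCount;
$GeneCount{pseudogene}++;            # first increment of a key that does not exist yet

my %other;
my $sum = $other{pseudogene} + 1;    # plain addition on an undefined value, for contrast

print "count: $GeneCount{pseudogene}\n";
print "warnings collected: ", scalar(@warnings), "\n";
print $_ for @warnings;
```

Any warning that shows up names the operation that triggered it, so you can see whether the increment or the addition is responsible.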

And that's it. The rest is just typing out the collected values.

If you are a beginner in Perl, you might also check out the books listed at https://learn.perl.org/books/. They are fun to read.

Re^2: Number of values for each key in hash by Anonymous Monk on Feb 29, 2020 at 14:21 UTC
Possible additional tweaks:
There is no need to initialize `%GeneCount` to `()`. That is the value it takes on anyway when declared.
You may want to cultivate the habit of using lexical variables as file handles (i.e. `open my $genetype, ... or die ...`). Bareword file handles are global.
You may want to cultivate the habit of using three-argument opens (i.e. `open my $genetype, '<', 'GeneType.txt' or die ...`). This is also the form in which you can specify things like file encoding in the open call itself.
Purely as a style thing, built-ins like `open()` and `split()` do not need parentheses, except for precedence. Whoever wrote perlopentut uses parentheses because that author also chose to use the tightly-binding `||` operator rather than the loosely-binding `or` operator for error checking.
None of these are required to make the presented script work.
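
A rough sketch of the last two points, assuming a throwaway temporary file created with File::Temp in place of GeneType.txt. The variable names and the file content are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small made-up UTF-8 file so the sketch is self-contained.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp, ':encoding(UTF-8)';
print {$tmp} "Name\tType\ngeneA\tpseudogene\n";
close $tmp or die "Could not close '$path': $!";

# Three-argument open with a lexical handle; the encoding is part of
# the mode. No parentheses are needed because "or" binds more loosely
# than the argument list.
open my $genetype, '<:encoding(UTF-8)', $path
    or die "Could not open '$path': $!";
my @lines = <$genetype>;
close $genetype;
print "read ", scalar(@lines), " lines\n";

# With "||" and no parentheses, the "||" attaches only to the last
# argument, so die is never reached even though the open fails.
my $missing = "$path.does-not-exist";
my $opened = open my $fh, '<', $missing || die "never reached";
print "open returned false, but die never ran\n" unless $opened;
```

Writing `open(my $fh, '<', $missing) || die ...` or switching to `or` makes the error check apply to the open call itself.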
Re^2: Number of values for each key in hash by Sofie (Acolyte) on Feb 29, 2020 at 12:43 UTC
That works perfectly, thanks!

