There are a few gotchas in your code... let me modify it like this:
use strict;
use warnings;
my %GeneCount = ();
#open the textfile GeneType.txt
open (GENETYPE, "GeneType.txt") or die "Could not open file: '$!'";
my $header = <GENETYPE>; # read the header before entering the loop
while (<GENETYPE>) {
    chomp;
    my ($GeneName, $GeneType) = split (/\t/, $_);
    $GeneCount{$GeneType}++;
}
for my $type (sort keys %GeneCount) {
    print "$type: $GeneCount{$type}\n";
}
So what did I change?
- I started the program with use strict; and use warnings; which is a good habit and will save a lot of time in the long run. The only downside is that I now have to declare my %GeneCount = () before using it.
- In the open statement I included the reason for the failure ($!) in the error message. You could also switch to the three-parameter form of open with a lexical file handle; I left that alone because your code is correct (just slightly out of fashion).
- Instead of dealing with the header on every pass through the loop, I read the header once before entering the loop.
- I added chomp, which removes the trailing newline that would otherwise stick to every gene type you read.
- Most important for your logic: I changed the hash so that the types are the keys and the counts are the values.
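For completeness, here is a rough sketch of what the three-parameter open with a lexical file handle could look like. The sub name count_gene_types and the variables $path and $fh are my own inventions for illustration, not something from your code, and I assume the file is tab-separated with a single header line.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original code): counts gene types in a
# tab-separated file whose first line is a header.
sub count_gene_types {
    my ($path) = @_;
    my %count;

    # Three-parameter open: the mode '<' is separate from the file name,
    # and $fh is a lexical file handle instead of the bareword GENETYPE.
    open(my $fh, '<', $path) or die "Could not open '$path': $!";

    my $header = <$fh>;    # read and discard the header line
    while (my $line = <$fh>) {
        chomp $line;
        my ($gene_name, $gene_type) = split /\t/, $line;
        next unless defined $gene_type;    # skip blank or malformed lines
        $count{$gene_type}++;
    }
    close($fh);
    return %count;
}

# Guarded so the sketch also runs when GeneType.txt is not present.
if (-e 'GeneType.txt') {
    my %gene_count = count_gene_types('GeneType.txt');
    for my $type (sort keys %gene_count) {
        print "$type: $gene_count{$type}\n";
    }
}
```

The lexical handle is closed automatically when it goes out of scope, and keeping the mode separate means a file name that starts with > or | cannot change how the file is opened.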
I seem to recall that older versions of Perl (I'm using 5.28) issued some warnings about an uninitialized $GeneCount{pseudogene}, although as far as I know the ++ operator is exempt from that warning. If you do see such warnings, you can add the line no warnings "uninitialized"; before entering the loop.
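If you want to check how your own Perl behaves, a small experiment along these lines should do. It collects warnings through $SIG{__WARN__} and compares ++ on a missing hash key with a plain addition; the hash %count and the key names are made up for the test. As far as I know, perlsyn lists ++ among the operators that never trigger the uninitialized warning, but treat that as something to verify rather than a promise.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my %count;                          # hypothetical stand-in for %GeneCount

$count{pseudogene}++;               # increment a key that does not exist yet
my $after_increment = @warnings;    # number of warnings seen so far

my $sum = $count{missing} + 1;      # plain addition on a missing key
my $after_addition = @warnings;

{
    no warnings 'uninitialized';    # silenced only inside this block
    my $quiet = $count{also_missing} + 1;
}
my $after_scoped = @warnings;

print "warnings after ++: $after_increment\n";
print "warnings after +: $after_addition\n";
print "warnings after scoped block: $after_scoped\n";
```

On the versions I know of, I would expect the counts 0, 1 and 1: the increment is silent, the plain addition warns once, and the scoped no warnings keeps the last addition quiet. Putting no warnings inside a block like this limits it to the code that needs it.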
And that's it. The rest is just typing out the collected values.
If you are a beginner in Perl, you might also check out https://learn.perl.org/books/: the books listed there are fun to read.