Re: Number of values for each key in hash

by haj (Vicar)
on Feb 29, 2020 at 12:26 UTC ( [id://11113571]=note )


in reply to Number of values for each key in hash

There are a few gotchas in your code... let me modify it like this:
use strict;
use warnings;

my %GeneCount = ();

#open the textfile GeneType.txt
open (GENETYPE, "GeneType.txt") or die "Could not open file: '$!'";
my $header = <GENETYPE>; # read the header before entering the loop
while (<GENETYPE>) {
    chomp;
    my ($GeneName, $GeneType)= split (/\t/, $_);
    $GeneCount{$GeneType}++;
}

for my $type (sort keys %GeneCount) {
    print "$type: $GeneCount{$type}\n";
}

So what did I change?

  • I started the program with use strict; and use warnings; which is a good habit and will save a lot of time in the long run. The only downside is that I now have to declare my %GeneCount = () before using it.
  • In the open statement I included the reason why it failed in the error message. There's also the opportunity to use the three-parameter form of open and a lexical file handle, which I let pass, because your code is correct (but slightly out of fashion).
  • Instead of removing the header in every line of the loop, I just read the header before even entering the loop.
  • I added chomp, which removes the newline that would otherwise be at the end of every gene type you read.
  • Most important for your logic: I changed the hash so that the types are the keys, and the counts are the values.
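
To make the last point concrete, here is a minimal sketch of the counting idiom on in-memory data. The sample rows and the helper name count_types are made up for illustration; the real GeneType.txt is not shown in this thread.

```perl
use strict;
use warnings;

# Hypothetical sample rows (name<TAB>type); not taken from the real file.
my @lines = (
    "GeneName\tGeneType\n",
    "geneA\tprotein_coding\n",
    "geneB\tprotein_coding\n",
    "geneC\tpseudogene\n",
);

# count_types is an illustrative helper: it skips the header once,
# then counts how often each type (second column) occurs.
sub count_types {
    my @rows = @_;
    shift @rows;                      # drop the header before the loop
    my %count;
    for my $row (@rows) {
        chomp $row;                   # remove the trailing newline
        my ($name, $type) = split /\t/, $row;
        $count{$type}++;              # the type is the key, the count is the value
    }
    return \%count;
}

my $count = count_types(@lines);
print "$_: $count->{$_}\n" for sort keys %$count;
```

Without the chomp, each key would carry a trailing newline, which shows up as a stray blank line in the output.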

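I seem to recall that older versions of Perl (I'm using 5.28) issued some warnings about an uninitialized $GeneCount{pseudogene}. If you do see them, you can add the line no warnings "uninitialized"; before entering the loop. Note, though, that perlsyn documents ++ on an undefined value as exempt from this warning, so the line should not be needed on current versions.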
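
If you want to check this on your own Perl, the following sketch captures warnings through $SIG{__WARN__} and compares an increment with a plain addition on a missing hash entry. The helper name capture_warnings is hypothetical, and the result may differ on very old Perl versions that I have not tried.

```perl
use strict;
use warnings;

# capture_warnings is an illustrative helper: it runs a code reference
# and returns every warning message emitted while it ran.
sub capture_warnings {
    my ($code) = @_;
    my @caught;
    local $SIG{__WARN__} = sub { push @caught, @_ };
    $code->();
    return @caught;
}

# Increment of a hash entry that does not exist yet.
my @from_increment = capture_warnings(sub {
    my %GeneCount;
    $GeneCount{pseudogene}++;
});

# Ordinary addition with the same missing entry, for comparison.
my @from_addition = capture_warnings(sub {
    my %GeneCount;
    my $total = $GeneCount{pseudogene} + 1;
});

printf "increment: %d warning(s)\n", scalar @from_increment;
printf "addition:  %d warning(s)\n", scalar @from_addition;
```

If the increment count comes back as zero, the no warnings line is unnecessary for this script.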

And that's it. The rest is just typing out the collected values.

If you are a beginner in Perl, you might also check out https://learn.perl.org/books/: the books listed there are fun to read.

Replies are listed 'Best First'.
Re^2: Number of values for each key in hash
by Anonymous Monk on Feb 29, 2020 at 14:21 UTC

    Possible additional tweaks:

    • There is no need to initialize %GeneCount to (). That is the value it takes on anyway when declared.
    • You may want to cultivate the habit of using lexical variables as file handles (i.e. open my $genetype, ... or die ...). Bareword file handles are global.
    • You may want to cultivate the habit of using three-argument opens (i.e. open my $genetype, '<', 'GeneType.txt' or die ...). This is also the form that lets you specify things like the file encoding in the open call itself.
    • Purely as a style thing, built-ins like open() and split() do not need parentheses, except for precedence. Whoever wrote perlopentut uses parentheses because that author also chose to use the tightly-binding '||' operator rather than the loosely-binding or operator for error checking.

    None of these are required to make the presented script work.
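
    For illustration, here is a sketch of the script with those tweaks applied. The helper name count_types_in_file, the UTF-8 encoding layer and the temporary sample file are assumptions made so the example runs on its own; none of them come from the original post.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# count_types_in_file is a hypothetical helper name.
sub count_types_in_file {
    my ($path) = @_;

    # Lexical handle, three-argument open, no parentheses needed with "or".
    # The encoding layer is an assumption about the data file.
    open my $genetype, '<:encoding(UTF-8)', $path
        or die "Could not open '$path': $!";

    my $header = <$genetype>;         # read the header before the loop
    my %count;                        # starts out empty, no "= ()" required
    while (my $line = <$genetype>) {
        chomp $line;
        my ($name, $type) = split /\t/, $line;
        $count{$type}++;
    }
    close $genetype;
    return \%count;
}

# Demo on a temporary file with made-up rows, standing in for GeneType.txt.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "GeneName\tGeneType\n",
             "geneA\tprotein_coding\n",
             "geneB\tpseudogene\n",
             "geneC\tpseudogene\n";
close $tmp or die "Could not close '$path': $!";

my $count = count_types_in_file($path);
print "$_: $count->{$_}\n" for sort keys %$count;
```

    Because the handle is a lexical variable, it is scoped to the subroutine and cannot clash with a bareword handle of the same name elsewhere in the program.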

Re^2: Number of values for each key in hash
by Sofie (Acolyte) on Feb 29, 2020 at 12:43 UTC
    That works perfectly, thanks!
