Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid
 
PerlMonks  

Error in Code: Need help

by angerusso (Novice)
on Mar 28, 2012 at 22:30 UTC ( [id://962275]=perlquestion: print w/replies, xml ) Need Help??

angerusso has asked for the wisdom of the Perl Monks concerning the following question:

Dear perlmonkers, Could you point out what's wrong with my code? I want to get for each gene, the maximum value of the associated frequency printed in the 3rd column, but it gives wrong result (e.g. MUC2), see output below

Frequency GENE 20 SEMA4G 80 CTBP2 80 CTBP2 80 CTBP2 100 SPRN 40 SVIL 80 TIMM23 80 MINPP1 80 BTAF1 20 MUC2 60 MUC2 80 MUC2 100 MUC2 60 OR10G9 60 C11orf91 80 OR4S1 40 OR4C3 40 OR4C3 20 OR4C3 40 OR4C45 60 OR4C45
$file1 = $ARGV[0]; %hashfreq = (); $freq = 0; $geneold = "NA"; open(INPUTR,"<$file1") || die "Can't open \$file1 for reading.\n"; while($line=<INPUTR>){ chomp $line; @toks = split(/\t/, $line); $gene = $toks[1]; if ($gene =~ /^$geneold$/){ if ($toks[0] >= $numold){ $freq = $toks[0]; print $toks[0]."\t".$numold."\t".$freq."\t".$gene."\n"; } else { $freq = $numold; } $hashfreq{$toks[0]} = $freq; } else { $hashfreq{$toks[1]} = $toks[0]; } $numold = $toks[0]; $geneold = $toks[1]; } close(INPUTR); open(OUTD1,">output.txt"); open(INPUTR,"<$file1") || die "Can't open \$file1 for reading.\n"; while($line=<INPUTR>){ chomp $line; @toks = split(/\t/, $line); $idgene = $toks[1]; if (exists $hashfreq{$idgene}){ print OUTD1 $line."\t".$hashfreq{$idgene}."\n"; print $line."\t".$hashfreq{$idgene}."\n"; } }
Frequency GENE MAX 20 SEMA4G 20 80 CTBP2 80 80 CTBP2 80 80 CTBP2 80 100 SPRN 100 40 SVIL 40 80 TIMM23 80 80 MINPP1 80 80 BTAF1 80 20 MUC2 20 60 MUC2 20 80 MUC2 20 100 MUC2 20 60 OR10G9 60 60 C11orf91 60 80 OR4S1 80 40 OR4C3 40 40 OR4C3 40 20 OR4C3 40 40 OR4C45 40 60 OR4C45 40

Replies are listed 'Best First'.
Re: Error in Code: Need help
by tangent (Parson) on Mar 28, 2012 at 23:13 UTC
    A more general error is that you seem to be doing far more than you need to. Try replacing your first while loop with:
    while($line=<INPUTR>){ chomp $line; my ($freq,$gene) = split(/\t/, $line); next if (exists $hashfreq{$gene} and $hashfreq{$gene} >= $freq); $hashfreq{$gene} = $freq; }
    Output:
    20 SEMA4G 20 80 CTBP2 80 80 CTBP2 80 80 CTBP2 80 100 SPRN 100 40 SVIL 40 80 TIMM23 80 80 MINPP1 80 80 BTAF1 80 20 MUC2 100 60 MUC2 100 80 MUC2 100 100 MUC2 100 60 OR10G9 60 60 C11orf91 60 80 OR4S1 80 40 OR4C3 40 40 OR4C3 40 20 OR4C3 40 40 OR4C45 60 60 OR4C45 60

      Thanks so much for teaching this SUPERCOOL way to code the same thing. I will keep this in mind next time for similar situations that arise. THANKS a TRILLION!

Re: Error in Code: Need help
by Util (Priest) on Mar 28, 2012 at 23:11 UTC
    The [0] should be [1] in this line:$hashfreq{$toks[0]} = $freq;

      Thanks SOOOO MUCH for your reply. My last reply somehow didn't get posted. I hate myself for missing this stupid error as I am tired. Sorry about it.

Re: Error in Code: Need help
by Util (Priest) on Mar 28, 2012 at 23:27 UTC

    You can also make your code more robust by using the existence of the gene in the has instead of keeping "old" copied of the variables. This will allow your code to do-the-right-thing even when genes are in non-contiguous lines.

    ++tangent, who I see beat me to it.

    #!perl use strict; use warnings; use autodie; my $out_path = 'output.txt'; my @lines = <>; chomp @lines; my %gene_maxfreq; for (@lines) { my ( $freq, $gene ) = split; my $max_freq = $gene_maxfreq{$gene}; if ( (!defined $max_freq) or $max_freq < $freq ) { $gene_maxfreq{$gene} = $freq; } } open my $out_fh, '>', $out_path; for (@lines) { my ( $freq, $gene ) = split; my $max_freq = $gene_maxfreq{$gene}; if ( defined $max_freq ) { my $line = join "\t", $freq, $gene, $max_freq; print {$out_fh} "$line\n"; print "$line\n"; } } close $out_fh;

      THANKS SO MUCH for your another SUPERCOOL code and help. Very much appreciated.!!

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://962275]
Approved by ww
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others cooling their heels in the Monastery: (4)
As of 2024-04-19 05:25 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found