Re^2: how to process very large tab limited pileup file

by gudluck (Novice)
on Oct 01, 2010 at 21:31 UTC


in reply to Re: how to process very large tab limited pileup file
in thread how to process very large tab limited pileup file

My failed attempt:
    my $countref = ($t[8] =~ tr/.,//);
    my $ratioo = $countref/$t[7];
    if ($countref/$t[7] > 0.95) {
        # print "$t[1]\t$t[2]\t$t[3]: $ratioo \n";
        $seq .= $t[2];
    }
    elsif ($countref/$t[7] <= 0.95) {
        $t[8] =~ s/[.,]/$t[2]/g;
        my $counta = ($t[8] =~ tr/aA//);
        my $countt = ($t[8] =~ tr/tT//);
        my $countg = ($t[8] =~ tr/gG//);
        my $countc = ($t[8] =~ tr/cC//);
        my ($ratioa, $ratiot, $ratiog, $ratioc, $rat);
        my @rat = ();
        $ratioa = $counta/$t[7];
        $ratiot = $countt/$t[7];
        $ratiog = $countg/$t[7];
        $ratioc = $countc/$t[7];
        print " $t[1] : A $ratioa T $ratiot G $ratiog C $ratioc ";
        my $rrat = \@rat;
        push (@rat, "$ratioa", "$ratiot", "$ratiog", "$ratioc");
        my $countnut = 0;
        my $i;
        my @ncode;
        my $ccode;
        for $i (0..$#rat) {
            if ( $rrat->[$i] >= 0.05) {
                $i++;
                push (@ncode, $i);
                $countnut ++;
                $i--;
            }
            $ccode = join ("", @ncode);
            if ($countnut >= 3) {
                # print "N\n";
                $seq .= "N";
            }
            #print " code:$ccode count $countnut\n";
            else {
                my %hash;
                %hash = {1, 'A', 2, 'T', 3, 'G', 4, 'C',
                         12, 'W', 21, 'W', 13, 'R', 31, 'R',
                         14, 'M', 41, 'M', 23, 'K', 32, 'K',
                         24, 'Y', 42, 'Y', };
                print "cc", $hash{$ccode}, "\n";
            }
        }
    }

Re^3: how to process very large tab limited pileup file
by graff (Chancellor) on Oct 02, 2010 at 17:33 UTC
    Like the other monks, I don't have a clue what your goals really are. (Try using shorter sentences when explaining things.) I took your code and data as posted, and put them together into a single file like this:
        #!/usr/bin/perl
        while (<DATA>) {
            next if (/^#/ or /^\s*$/);
            chomp;
            @t = split /\t/;
            # your code goes here
        }
        __DATA__
        # your data goes here
    (update: I initially forgot to mention "split")

    I had to edit the data to make it tab-delimited -- maybe the tabs were converted to spaces because of the way you posted it (or the way I downloaded it), but you may need to check that your file really has the right number of tabs per line.
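
    One way to run that check is to tally how many tab-separated fields each line has; a clean file should show a single field count. This is only a minimal sketch: the sub name tally_field_counts and the three sample lines are made up for illustration and are not from the original post.

```perl
use strict;
use warnings;

# Count tab-separated fields on each non-blank, non-comment line of a
# filehandle and return a hash ref mapping field count => number of lines.
# (tally_field_counts is a hypothetical helper name.)
sub tally_field_counts {
    my ($fh) = @_;
    my %lines_with;
    while ( my $line = <$fh> ) {
        next if $line =~ /^#/ or $line =~ /^\s*$/;
        $line =~ s/\r?\n\z//;
        # The -1 limit keeps trailing empty fields in the count.
        my @fields = split /\t/, $line, -1;
        $lines_with{ scalar @fields }++;
    }
    return \%lines_with;
}

# Made-up sample: two properly tab-delimited lines and one line where the
# tabs have been turned into spaces.
my $sample = join '',
    "chr1\t100\tA\tA\t30\t0\t60\t4\t..,,\tIIII\n",
    "chr1\t101\tC\tY\t30\t20\t60\t4\t.T,t\tIIII\n",
    "chr1 102 G G 30 0 60 4 ..,, IIII\n";

open my $fh, '<', \$sample or die "cannot open in-memory sample: $!";
my $tally = tally_field_counts($fh);
close $fh;

# One output line per distinct field count, smallest first.
printf "%2d field(s): %d line(s)\n", $_, $tally->{$_}
    for sort { $a <=> $b } keys %$tally;
```

    For the sample above this reports one line with 1 field and two lines with 10 fields, so the space-separated line stands out. To check a real file, open it by name instead of using the in-memory string.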

    Anyway, your code ran and produced output with no errors. So what's the problem? If you were expecting some different sort of output, you'll need to explain how the actual output differs from the desired output.

    I took the liberty of fixing the indentation in the code -- this really helps for making the code readable, and a good editor should make it easy to use proper indentation (e.g. emacs, vi, ...)

    I also did some refactoring, to remove unnecessary variables, make the output more informative, and use loops where possible.

    I couldn't tell what you were trying to do with your "$seq" variable... that (and the output format) is the only place where my version does something different from yours -- but I'm just guessing about what you really want.
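
    As a rough illustration of that kind of refactoring (a sketch under stated assumptions, not the code from this reply), the per-base counting can be done in one loop and the ambiguity lookup keyed on the bases themselves. The names call_base, %iupac and $min_frac are hypothetical. The 0.95 and 0.05 thresholds and the two-base codes come from the posted code; the 'CG' => 'S' entry and returning "N" when no base passes the threshold are assumptions added here. Like the posted code, it does not strip indel or read-start/end markers from the read-base column, so those would need handling first if your data contains them.

```perl
use strict;
use warnings;

# IUPAC codes keyed by the alphabetically sorted set of bases kept.
# The two-base entries mirror the table in the posted code; 'CG' => 'S'
# is an addition, since that table has no entry for a G/C mix.
my %iupac = (
    A  => 'A', C  => 'C', G  => 'G', T  => 'T',
    AT => 'W', AG => 'R', AC => 'M',
    GT => 'K', CT => 'Y', CG => 'S',
);

# call_base is a hypothetical helper: given the reference base, the read
# depth and the read-base string, return one consensus character.
sub call_base {
    my ( $ref, $depth, $bases, $min_frac ) = @_;
    $min_frac = 0.05 unless defined $min_frac;
    return 'N' unless $depth;    # assumption: no coverage means "N"

    $ref = uc $ref;

    # '.' and ',' stand for reads that match the reference.
    my $ref_count = ( $bases =~ tr/.,// );
    return $ref if $ref_count / $depth > 0.95;

    # Fold case, turn the match symbols into the reference base, then
    # count each nucleotide in a single loop.
    ( my $seen = uc $bases ) =~ s/[.,]/$ref/g;
    my @kept;
    for my $base (qw(A C G T)) {
        my $count = () = $seen =~ /$base/g;
        push @kept, $base if $count / $depth >= $min_frac;
    }

    # Three or more candidate bases (or, by assumption, none) give "N".
    return 'N' if @kept == 0 or @kept >= 3;
    return $iupac{ join '', @kept };
}

# Made-up rows of (reference base, depth, read bases).
my $seq = '';
for my $row ( [ 'A', 4, '..,,' ], [ 'C', 4, '.T,t' ], [ 'G', 10, 'aaccttgg.,' ] ) {
    $seq .= call_base(@$row);
}
print "$seq\n";    # prints AYN for these three rows
```

    Inside the read loop this would be called as call_base(@t[2, 7, 8]). One thing to watch in the posted code: "%hash = { ... }" uses curly braces, which create a hash reference rather than a list of pairs, so the lookups come back empty; the table needs parentheses, as in %iupac above.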
