PerlMonks  

Re: First foray into Perl

by dorko (Prior)
on Mar 24, 2014 at 18:01 UTC (#1079571)


in reply to First foray into Perl

Hello,

I might be wrong, but I get the feeling this is more of a biology thing for you than it is a Perl thing, so I'm helping out more than I usually do. Please make sure you understand the code below (and its limitations) before you use it.

Good luck.

use strict;
use warnings;

# Read the first 7 lines of metadata.
# Assuming there are always 7 lines of metadata.
foreach (1..7) {
    # Read a line of data.
    my $header_data = <DATA>;

    # Remove the end of line character.
    chomp $header_data;

    # Split the string into 2 parts, using whitespace as a separator.
    my ($label, $string) = split /\s+/, $header_data, 2;

    # Only pay attention to the "Motif" line.
    next if ($label ne 'Motif');
    print "$string ";
}

# Process the next 10 lines of data.
# Assuming there are always 10 lines of data.
foreach (1..10) {
    # Declare a variable to hold the data in the file.
    my %base_pairs;

    # Read a line of data.
    my $line = <DATA>;

    # Remove the end of line character.
    chomp $line;

    # Split the string into 5 parts, using whitespace as a separator.
    # Assuming the columns are always in the order Pos, A, C, G, T.
    (undef, $base_pairs{A}, $base_pairs{C}, $base_pairs{G}, $base_pairs{T}) = split /\s+/, $line, 5;

    my @letters = keys %base_pairs;

    # Start with one letter and make its value the provisional maximum.
    my $max = pop @letters;

    # Compare each remaining value to the maximum.
    foreach my $letter (@letters) {
        # What if two (or more) values are equal???
        if ($base_pairs{$max} < $base_pairs{$letter}) {
            # The current value was greater than the maximum, so make it the new maximum.
            $max = $letter;
        }
    }

    # Print the letter representing the maximum value.
    print $max;
}

# Print an end of line character.
print "\n";

__DATA__
TF Unknown
TF Name Unknown
Gene ENSG00000113916
Motif ENSG00000113916___1|2x3
Family C2H2 ZF
Species Homo_sapiens
Pos A C G T
1 0.538498 0.157305 0.157633 0.146564
2 0.072844 0.008771 0.877166 0.0412175
3 0.959269 0.013107 0.015961 0.0116621
4 0.852439 0.023883 0.016813 0.106864
5 0.57332 0.068801 0.181385 0.176494
6 0.139513 0.074798 0.737607 0.0480813
7 0.735484 0.091299 0.09091 0.0823067
8 0.79932 0.027041 0.137306 0.0363319
9 0.16103 0.12536 0.109938 0.603672
10 0.622356 0.06782 0.115463 0.194361

Output:
ENSG00000113916___1|2x3 AGAAAGAATA
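
On the "what if two values are equal" question in the comments: keys %base_pairs returns the letters in no guaranteed order, so a tie could resolve differently from run to run. Here is a minimal sketch of one way to make that deterministic. It assumes ties should go to the earliest letter in A, C, G, T order, which is a rule chosen only for illustration and not something the data format specifies. The helper name max_base is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the letter with the highest score.
# Ties go to the earliest letter in A, C, G, T order (an assumed rule).
sub max_base {
    my (%score) = @_;
    my @order = qw(A C G T);
    my $max = $order[0];
    foreach my $letter (@order[1 .. $#order]) {
        # Strict '>' keeps the earlier letter when two values are equal.
        $max = $letter if $score{$letter} > $score{$max};
    }
    return $max;
}

# A and C are tied here, so the fixed order decides.
print max_base(A => 0.4, C => 0.4, G => 0.1, T => 0.1), "\n";    # prints A
```

Walking a fixed list instead of keys %base_pairs is the whole trick. Whether a tie should really go to A, or be reported some other way, is a biology decision rather than a Perl one.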

Cheers,

Brent

-- Yeah, I'm a Delt.

Re^2: First foray into Perl
by Anonymous Monk on Mar 24, 2014 at 22:18 UTC

    Thank you very much Brent!

    As you suspected, I am a biologist and I thought I'd see what Perl has to offer. Thanks for such a comprehensive and detailed piece of code; I really appreciate your time and effort. Out of interest, could the "foreach (1 .. 10)" loop be modified to accommodate variation in the number of data lines? Something like:

    foreach (integer),else exit loop

    Cheers!

      What comes on the line after the last (highest) position, 10 in your sample? A blank line, maybe? Please post more than one record, showing whether anything indicates you've reached the last position for the record.

        The next line is:

        TF Name Unknown/

        So far I've tried a do-until loop:

        foreach (1..25) {
            my $command;
            do {
                STUFF TO EXECUTE
            } until ($command eq "TF");
        }

        Any pointers (in the literary sense) gratefully received! Cheers
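
One possible direction, offered as a sketch and not a definitive answer: rather than counting lines, read rows for as long as the line starts with a digit, and stop at the first line that does not (or at end of file). This assumes every matrix row begins with its numeric position and that no header line does, which holds for the sample record but has not been checked against the real file. The helper name read_consensus and the in-memory sample are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read matrix rows from a filehandle until a line
# that does not start with a digit (or end of file) is reached.
# Returns the consensus string and the line that ended the loop, if any.
sub read_consensus {
    my ($fh) = @_;
    my $consensus = '';
    my $next_line;
    while (my $line = <$fh>) {
        chomp $line;
        # Assumption: every matrix row starts with its numeric position.
        unless ($line =~ /^\d/) {
            $next_line = $line;
            last;
        }
        # Drop the position column; keep the A, C, G, T values in order.
        my (undef, @values) = split /\s+/, $line;
        my @letters = qw(A C G T);
        my $max = 0;
        foreach my $i (1 .. $#letters) {
            $max = $i if $values[$i] > $values[$max];
        }
        $consensus .= $letters[$max];
    }
    return ($consensus, $next_line);
}

# Three rows copied from the sample data, then the start of a next record.
my $sample = join "\n",
    '1 0.538498 0.157305 0.157633 0.146564',
    '2 0.072844 0.008771 0.877166 0.0412175',
    '3 0.959269 0.013107 0.015961 0.0116621',
    'TF Unknown',
    '';
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my ($consensus, $next_line) = read_consensus($fh);
print "$consensus\n";    # prints AGA
```

Note that the line that ends the loop has already been read from the filehandle, so the helper hands it back. The caller would need to treat it as the first header line of the next record instead of reading a fresh line.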
