http://qs321.pair.com?node_id=1079571


in reply to First foray into Perl

Hello,

I might be wrong, but I get the feeling this is more of a biology thing for you than it is a Perl thing, so I'm helping out more than I usually do. Please make sure you understand the code below (and its limitations) before you use it.

Good luck.

use strict;
use warnings;

# Read the first 7 lines of metadata.
# Assuming there are always 7 lines of metadata.
foreach (1..7) {
    # Read a line of data.
    my $header_data = <DATA>;

    # Remove the end of line character.
    chomp $header_data;

    # Split the string into 2 parts, using whitespace as a separator.
    my ($label, $string) = split /\s+/, $header_data, 2;

    # Only pay attention to the "Motif" line.
    next if ($label ne 'Motif');

    print "$string ";
}

# Process the next 10 lines of data.
# Assuming there are always 10 lines of data.
foreach (1..10) {
    # Declare a variable to hold the data in the file.
    my %base_pairs;

    # Read a line of data.
    my $line = <DATA>;

    # Remove the end of line character.
    chomp $line;

    # Split the string into 5 parts, using whitespace as a separator.
    # Assuming the columns are always in the order Pos, A, C, G, T.
    (undef, $base_pairs{A}, $base_pairs{C}, $base_pairs{G}, $base_pairs{T}) = split /\s+/, $line, 5;

    my @letters = keys %base_pairs;

    # Start with one letter and treat its value as the maximum.
    # Note: hash keys come back in no particular order.
    my $max = pop @letters;

    # Compare each value to the maximum.
    foreach my $letter (@letters) {
        # What if two (or more) values are equal???
        if ($base_pairs{$max} < $base_pairs{$letter}) {
            # The current value was greater than the maximum, so make it the new maximum.
            $max = $letter;
        }
    }

    # Print the letter representing the maximum value.
    print $max;
}

# Print an end of line character.
print "\n";

__DATA__
TF Unknown
TF Name Unknown
Gene ENSG00000113916
Motif ENSG00000113916___1|2x3
Family C2H2 ZF
Species Homo_sapiens
Pos A C G T
1 0.538498 0.157305 0.157633 0.146564
2 0.072844 0.008771 0.877166 0.0412175
3 0.959269 0.013107 0.015961 0.0116621
4 0.852439 0.023883 0.016813 0.106864
5 0.57332 0.068801 0.181385 0.176494
6 0.139513 0.074798 0.737607 0.0480813
7 0.735484 0.091299 0.09091 0.0823067
8 0.79932 0.027041 0.137306 0.0363319
9 0.16103 0.12536 0.109938 0.603672
10 0.622356 0.06782 0.115463 0.194361

Output:
ENSG00000113916___1|2x3 AGAAAGAATA
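
One limitation flagged in the comments is what happens when two or more values in a row are equal. Because the script walks the hash keys in whatever order Perl returns them, the letter that wins a tie can differ from run to run. Here is a minimal sketch of one way to make that visible, assuming you would rather see every tied letter than a silently chosen one. The name max_letters is made up for this example, and comparing the values with == assumes they are read straight from the file and not computed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): return every letter
# whose value equals the row maximum, always in A, C, G, T order.
sub max_letters {
    my (%base_pairs) = @_;
    my @order = qw(A C G T);

    # Find the largest value in the row.
    my $max = $base_pairs{ $order[0] };
    foreach my $letter (@order) {
        $max = $base_pairs{$letter} if $base_pairs{$letter} > $max;
    }

    # Keep every letter that reaches the maximum. Exact numeric equality
    # is an assumption that suits values read unchanged from the file.
    return grep { $base_pairs{$_} == $max } @order;
}

# A made-up row in which A and G tie.
my @winners = max_letters(A => 0.4, C => 0.1, G => 0.4, T => 0.1);
print join('/', @winners), "\n";    # prints "A/G"
```

In the main loop you could then print the single letter when the list has one element and decide for yourself what a tie should look like (for example "A/G", or an ambiguity code) when it has more.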

Cheers,

Brent

-- Yeah, I'm a Delt.