Re^3: First foray into Perl

Replies are listed 'Best First'.
Re^4: First foray into Perl by Anonymous Monk on Mar 25, 2014 at 00:04 UTC
The next line is: `TF Name Unknown/` [download] So far I've tried a do-until loop: `foreach (1..25) { my $command do { STUFF TO EXECUTE } until ($command eq "TF"); }` [download] Any pointers (in the literary sense) gratefully received! Cheers	[reply] [d/l] [select]
Re^5: First foray into Perl by LostWeekender (Novice) on Mar 25, 2014 at 17:23 UTC
.. bit stuck. Here's what I have so far. Seems close but I can't quite get it right. Please help! use strict; use warnings; open(MOTIFS, "all_motifs.txt") or die("Unable to open file"); # Read the first 7 lines of metadata. # # Assuming there are always 7 lines of metadata. foreach (1..121) { foreach (1..7) { # Read a line of data. my $header_data = <MOTIFS>; # Remove the end of line character. chomp $header_data; # Split the string into 2 parts, using white space as a separator +. my ($lable, $string) = split /\s+/, $header_data, 2; # only pay attention to the "Motif" line. next if ($lable ne 'Motif'); print "$string "; } # Process the next lines of data until line containing string "TF + Unknown" is reached. foreach (<MOTIFS>) { # Remove the end of line character. chomp my $line; # Process lines until "TF Unknowm' while ($line ne 'TF Unknown') { # Declare a variable to hold the data in the file. my %base_pairs; # Split the string into 5 parts, using whitespace as a separ +ator. # Assuming the Position is always in the same order in the f +ile. (undef, $base_pairs{A}, $base_pairs{C}, $base_pairs{G}, $bas +e_pairs{T}) = split /\s+/, $line, 5; my @letters = keys %base_pairs; # Start with the first column value and make it the max. val +ue. my $max = pop @letters; # Compare each value to the maximum. foreach my $letter (@letters) { # What if two (or more) values are equal??? if ($base_pairs{$max} < $base_pairs{$letter}) { # The current value was greater than the maximum, so +make it the new maximum. $max = $letter; } } # Print the letter representing the maximum value. print $max; } } } # print an end of line character. print "\n"; [download] and this is what a few records look like: TF Unknown TF Name Unknown Gene ENSG00000113916 Motif ENSG00000113916___1\|1x3 Family C2H2 ZF Species Homo_sapiens Pos A C G T 1 0.664794 0.13099 0.0810125 0.123203 2 0.675621 0.0396475 0.144967 0.139764 3 0.0913393 0.0396819 0.847004 0.0219745 4 0.850414 0.0522149 0.0519174 0.0454536 5 0.89157 0.00962148 0.0845269 0.0142814 6 0.122389 0.0875591 0.0734604 0.716591 7 0.226696 0.00745549 0.745549 0.0202999 8 0.156228 0.151994 0.128767 0.563011 9 0.22083 0.561173 0.12007 0.0979266 10 0.507656 0.0711684 0.0652815 0.355894 TF Unknown TF Name Unknown Gene ENSG00000113916 Motif ENSG00000113916___1\|2x3 Family C2H2 ZF Species Homo_sapiens Pos A C G T 1 0.538498 0.157305 0.157633 0.146564 2 0.0728444 0.00877167 0.877166 0.0412175 3 0.959269 0.0131077 0.0159611 0.0116621 4 0.852439 0.0238831 0.0168134 0.106864 5 0.57332 0.0688014 0.181385 0.176494 6 0.139513 0.0747988 0.737607 0.0480813 7 0.735484 0.0912993 0.09091 0.0823067 8 0.79932 0.0270417 0.137306 0.0363319 9 0.16103 0.12536 0.109938 0.603672 10 0.622356 0.06782 0.115463 0.194361 TF Unknown TF Name Unknown Gene ENSG00000113916 Motif ENSG00000113916___1\|3x3 Family C2H2 ZF Species Homo_sapiens Pos A C G T 1 0.616484 0.0886488 0.24602 0.0488468 2 0.0971289 0.591289 0.134781 0.176801 3 0.0715039 0.0237142 0.0432674 0.861514 4 0.73769 0.117011 0.059703 0.0855963 5 0.0728444 0.00877167 0.877166 0.0412175 6 0.959269 0.0131077 0.0159611 0.0116621 7 0.852439 0.0238831 0.0168134 0.106864 8 0.57332 0.0688014 0.181385 0.176494 9 0.139513 0.0747988 0.737607 0.0480813 10 0.615257 0.189034 0.125514 0.0701943 TF Unknown TF Name Unknown Gene ENSG00000113916 [download] massive thanks!	[reply] [d/l] [select]
Re^6: First foray into Perl by AnomalousMonk (Archbishop) on Mar 25, 2014 at 18:41 UTC
... this is what a few records look like ... It would greatly help anyone who is trying to help you if you would also provide the exact output you expect from the example input. (You can just add an update – properly cited as such – to your post; no need for a separate node.)	[reply]
Re^6: First foray into Perl by AnomalousMonk (Archbishop) on Mar 25, 2014 at 19:18 UTC
Note: The single-tab (`\t`) field separators used in the example data posted with your OP seem to have been entirely replaced with multiple-space (`\x20`) character separators in the example data included with your latest post. Was this intentional or inadvertent? I'm working on some code that (I think) should work with either field separation scheme, but it would be good to know just how record fields will be separated.	[reply] [d/l] [select]
Re^6: First foray into Perl by AnomalousMonk (Archbishop) on Mar 25, 2014 at 20:35 UTC
Another Note: I just noticed that the latest example data here end with the following three lines: `TF Unknown TF Name Unknown Gene ENSG00000113916` [download] These look like the start of another record and screw up parsing. Is this the intended and necessary ending of a full set of data records? It's a problem if so. (The absence of an unambiguous multi-line record delimiter is another problem, but can be finessed if necessary.) Also: Any word yet on the tab-versus-spaces field delimiter question posted above?	[reply] [d/l]
Re^7: First foray into Perl by LostWeekender (Novice) on Mar 25, 2014 at 21:48 UTC
Re^8: First foray into Perl by AnomalousMonk (Archbishop) on Mar 26, 2014 at 00:38 UTC
Some notes below your chosen depth have not been shown here


Come for the quick hacks, stay for the epiphanies.
	PerlMonks