Re: motif finding

A more "Perl-ish" way of doing the same thing...

use strict;
use warnings;
use Term::ANSIColor;
use autodie;

#Program to find motif site in a given protein sequence using files

my $motif = "AGGGGG";
open( my $read, "<dna.txt" );
my @e = <$read>;
$_ = join( " ", @e );
s/\s+//g;
my @c;
push @c, pos( ) - length( $motif ) + 1 while /$motif/g;
s/$motif/color( 'bold green' ) . $motif . color( 'black' )/eg;
print $_, "\n";
print "Number of sites the motif (AGGGGG) is present: ", scalar @c, "\n";
print "And the positions in the string are: ", join( ',', @c ), "\n\n";

This eliminates counting characters one at a time, as the $i loop in the original does, in favor of pattern matching on the entire string. I have found that eliminating loop counters wherever possible greatly reduces the number of bugs in my code.
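As a rough illustration of the same idea, here is a minimal sketch that works on an in-memory string, so it runs without dna.txt. The sample sequence is made up, and the sketch uses `$-[0]` (the start offset of the last successful match) in place of the pos arithmetic above; both should give the same 1-based positions.

```perl
use strict;
use warnings;

my $motif = "AGGGGG";
my $seq   = "TTAGGGGGCCAGGGGGT";    # made-up sample sequence

my @positions;
while ( $seq =~ /\Q$motif\E/g ) {
    push @positions, $-[0] + 1;     # $-[0] is 0-based, so add 1
}

print "Count: ", scalar @positions, "\n";
print "Positions: ", join( ',', @positions ), "\n";
```

Note that //g does not report overlapping matches. That does not matter for AGGGGG, which cannot overlap itself, but it could for a motif such as AGAGA.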


Replies are listed 'Best First'.
Re^2: motif finding by educated_foo (Vicar) on Jan 31, 2012 at 13:55 UTC
Or, even more Perl-ish, with a bit less extra work (e.g. only one //g loop):

use Term::ANSIColor;
open(READ,"<dna.txt");
$m = 'AGGGGG';
$_ = do { local $/; <READ> };                   # read whole file
s/\s+//g;                                       # remove blanks
s{$m}{                                          # search the string
    push @c, 1 - length($m) + pos;              # remember position
    color('bold green').$m.color('reset');      # remember to reset!
}eg;
print "$_\n";                                   # print transformed string
print "NUMBER OF SITES THE MOTIF ($m) IS PRESENT: ".@c."\n";
print "AND THE POSITION IN THE STRING IS:", join(',', @c), "\n\n";
Re^3: motif finding by Anonymous Monk on Oct 05, 2013 at 21:07 UTC
My motif input is a file. How can I modify the program to make it work?
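One possible approach, assuming the motif file holds one motif per line: read the motifs into an array and run the same //g search once per motif. In this sketch the two files are replaced by in-memory strings so it runs as-is; the file names in the comments (motifs.txt) and the sample data are hypothetical.

```perl
use strict;
use warnings;

# Stand-ins for the real files. In practice, open the files directly,
# e.g. open( my $motif_fh, '<', 'motifs.txt' ) (hypothetical name).
my $motif_data = "AGGGGG\nTTC\n";
my $dna_data   = "TTAGGGGG\nCCAGGGGGTTC\n";

open( my $motif_fh, '<', \$motif_data ) or die "Cannot open motifs: $!";
my @motifs = <$motif_fh>;
close $motif_fh;
s/\s+//g for @motifs;                   # strip newlines and stray whitespace
@motifs = grep { length } @motifs;      # skip empty lines

open( my $dna_fh, '<', \$dna_data ) or die "Cannot open DNA: $!";
my $seq = do { local $/; <$dna_fh> };   # read everything at once
close $dna_fh;
$seq =~ s/\s+//g;                       # remove whitespace and line breaks

my %sites;                              # motif => list of 1-based positions
for my $motif (@motifs) {
    my @positions;
    while ( $seq =~ /\Q$motif\E/g ) {
        push @positions, $-[0] + 1;     # start of this match, 1-based
    }
    $sites{$motif} = \@positions;
    print "$motif: ", scalar @positions, " site(s) at ",
        join( ',', @positions ), "\n";
}
```

The \Q...\E quoting treats each motif as literal text. If the motif file is meant to contain regular expressions, drop it.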
Re^2: motif finding by RichardK (Parson) on Jan 31, 2012 at 14:19 UTC
I think using File::Slurp is even easier and more perl-ish :)

use File::Slurp;
# read file as a string
my $text = read_file('dna.txt');
# now remove whitespace including line breaks
$text =~ s/\s+//g;
# ...do stuff

(update: removed a stray space)
Re^3: motif finding by devi (Initiate) on Feb 03, 2012 at 06:08 UTC
Thank you very much for the reply. In the code that I used, I supply the input (the motif sequence) myself. If I treat the entire genome as a single string and want the most frequently repeated elements of, say, 20 base pairs in that string, how can I find them?
Re^4: motif finding by RichardK (Parson) on Feb 03, 2012 at 12:47 UTC
I'm not sure what you're looking for. Can you explain with a simple example? Are you looking for repeats of a given string, or something more complex?
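If the goal turns out to be the simple reading of the question, i.e. the most frequent exact substrings of a fixed length, a hash of counts is one way to sketch it. This is only an illustration under that assumption: the sample sequence is made up, and $k is set to 4 so the output stays readable (it would be 20 for the case described above).

```perl
use strict;
use warnings;

my $k   = 4;                    # window length; 20 in the question above
my $seq = "ACGTACGTTTACGT";     # made-up sample sequence

my %count;
for my $i ( 0 .. length($seq) - $k ) {
    $count{ substr( $seq, $i, $k ) }++;    # count every window of length $k
}

# Highest count first; ties broken alphabetically for a stable order
my @ranked = sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;

my $top = @ranked < 3 ? $#ranked : 2;      # show at most three entries
printf "%s occurs %d time(s)\n", $_, $count{$_} for @ranked[ 0 .. $top ];
```

For a whole genome with $k = 20 the hash can get very large, since most 20-mers occur only once, so memory use is worth checking before running this on real data.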

