in reply to Lower-casing Substrings and Iterating Two Files together
If you bitwise-OR (|) an uppercase letter with a space (assuming Latin-1/ASCII files), it will lowercase it. Note that the string of spaces has to be as long as the string you are lowercasing:

    print 'ACGT' | '    ';;
    acgt
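A minimal sketch of why this works, assuming plain ASCII byte strings and no `use feature 'bitwise'` in effect: a space is 0x20, which is exactly the bit that separates 'A' (0x41) from 'a' (0x61). The helper name `lc_by_or` is made up for illustration. The last line shows the pitfall of a too-short right-hand operand, which Perl pads with NUL bytes.

```perl
use strict;
use warnings;

# 'A' is 0x41, ' ' is 0x20, 'a' is 0x61: OR-ing sets the 0x20 "case" bit.
printf "%02x | %02x = %02x\n", ord('A'), ord(' '), ord( 'A' | ' ' );

# Hypothetical helper: OR a string with an equal-length run of spaces.
sub lc_by_or {
    my ($str) = @_;
    return $str | ( ' ' x length $str );
}

print lc_by_or('ACGT'), "\n";    # acgt
print 'ACGT' | ' ', "\n";        # aCGT (only the first character is OR-ed)
```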
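So, if you translate all the 'N's in your mask to spaces and then bitwise-OR the sequence with the mask, you achieve your goal very efficiently. (The non-N characters of the mask are the same as those in the sequence, so OR-ing them changes nothing.)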
    $s = 'GGTACACAGAAGCCAAAGCAGGCTCCAGGCTCTGAGCTGTCAGCACAGAGACCGAT';;
    $m = 'GGTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNT';;
    ( $mm = $m ) =~ tr[N][\x20];;
    print $mm;;
    GGT                                                    T
    print $s | $mm;;
    GGTacacagaagccaaagcaggctccaggctctgagctgtcagcacagagaccgaT
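The same idea can be wrapped in a small sub. This is only a sketch: `lc_masked` is a hypothetical name, and mapping every non-N mask character to NUL is an extra precaution of mine, not part of the approach above. It keeps the result correct even if the mask's unmasked letters do not match the sequence (for example, 'T' | 'A' would otherwise give 'U').

```perl
use strict;
use warnings;

# Hypothetical helper: lower-case $seq wherever $mask has an 'N'.
sub lc_masked {
    my ( $seq, $mask ) = @_;
    ( my $bits = $mask ) =~ tr/N/\0/c;    # everything except N => NUL (a no-op under OR)
    $bits =~ tr/N/ /;                     # N => space (sets the 0x20 bit)
    return $seq | $bits;
}

print lc_masked( 'GGTACACAGAT', 'GGTNNNNNNNT' ), "\n";    # GGTacacagaT
print lc_masked( 'GGTACACAGAT', 'AAANNNNNNNA' ), "\n";    # GGTacacagaT
```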
That makes your entire program (ignoring the unmentioned possibility that your files are in FASTA format):
    #! perl -slw
    use strict;

    open SEQ,  '<', 'data1.dat' or die $!;
    open MASK, '<', 'data2.dat' or die $!;

    while( my $seq = <SEQ> ) {    ## Read a sequence
        my $mask = <MASK>;        ## And the corresponding mask
        chomp( $seq, $mask );     ## -l adds a newline on print, so strip the input ones
        $mask =~ tr[N][ ];        ## Ns => spaces
        print $seq | $mask;       ## bitwise-OR them and print the result
    }

    close SEQ;
    close MASK;
Redirect the output to a third file and you're done.
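If the files do turn out to be FASTA, one possible adaptation is to copy header lines through untouched. The sketch below assumes that both files have their `>` header lines in the same positions and identical line wrapping; `mask_fasta` is a hypothetical name, and in-memory filehandles stand in for the real files.

```perl
use strict;
use warnings;

# Hypothetical helper: read $seq_fh and $mask_fh in lockstep, write to $out_fh.
sub mask_fasta {
    my ( $seq_fh, $mask_fh, $out_fh ) = @_;
    while ( defined( my $seq = <$seq_fh> ) ) {
        my $mask = <$mask_fh>;
        die "mask file is shorter than sequence file\n" unless defined $mask;
        if ( $seq =~ /^>/ ) {    # FASTA header: copy through untouched
            print {$out_fh} $seq;
            next;
        }
        $mask =~ tr/N/\0/c;      # non-N (including the newline) => NUL
        $mask =~ tr/N/ /;        # N => space
        print {$out_fh} $seq | $mask;
    }
}

# Usage with in-memory filehandles (assumed sample data).
my $seq_data  = ">seq1\nGGTACAT\n";
my $mask_data = ">seq1\nGGTNNNT\n";
open my $seq_fh,  '<', \$seq_data  or die $!;
open my $mask_fh, '<', \$mask_data or die $!;
mask_fasta( $seq_fh, $mask_fh, \*STDOUT );
```

Because the newline in the mask is mapped to NUL, the line endings survive the OR and no chomp is needed here.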
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.