note
tobyink
<p>Instead of matching <em>valid</em> sequences, match <em>invalid</em> characters. Then use <c>$-[0]</c> to find the position of that match. (The <c>@-</c> array is documented in the "perlvar" manual page.)</p>
<c>
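use strict;
use warnings;

while (my $sequence = <DATA>) {
    chomp $sequence;
    # Look for the first character that is not A, T, C or G.
    if ($sequence =~ /[^ATCG]/) {
        # $-[0] is the zero-based offset where the match started.
        warn "Sequence '$sequence' has invalid character after " . $-[0];
    }
    else {
        print "Valid sequence: '$sequence'\n";
    }
}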
__DATA__
TAAGAACAATAAGAACAA
TAAGAACAATAAUAACAA
TAAGAACAATAAGAACAA
</c>
<p>You don't need to split the sequence into individual characters and process each one separately. That's slow; a single regex match does the whole job.</p>
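<p>If you want every invalid position rather than just the first, the same idea extends with the <c>/g</c> modifier. The following is only a minimal sketch: <c>invalid_offsets</c> is a hypothetical helper name made up for illustration, and the test sequence is invented.</p>
<c>
use strict;
use warnings;

# Hypothetical helper (not part of the code above): returns the zero-based
# offsets of every character that is not A, T, C or G.
sub invalid_offsets {
    my ($sequence) = @_;
    my @offsets;
    # With /g in scalar context, each iteration resumes after the previous
    # match; $-[0] holds the start offset of the current match.
    while ($sequence =~ /[^ATCG]/g) {
        push @offsets, $-[0];
    }
    return @offsets;
}

my @bad = invalid_offsets('TAAGXACAATAAUAACAA');
print "Invalid characters at offsets: @bad\n";   # prints 4 12
</c>
<p>This still makes a single pass with the regex engine; there is no <c>split</c> and no per-character loop in Perl code.</p>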
<div class="pmsig"><div class="pmsig-757127">
<small><a href="http://toby.ink/">toby döt ink</a></small>
</div></div>
11113020