Re: Find element in array

in reply to Find element in array

Does this do what you want? There is no need to split the sequence into an array, as pos will tell you where in a string a global match has got to. Note that [^ACGT] is a negated character class, i.e. it matches any character that isn't A, C, G or T. Matching globally, m{ ... }g or / ... /g, in a while loop advances along the sequence, finding each invalid letter in turn. The capturing parentheses, ( ... ), are not strictly needed here because only the position is used. The positions reported are zero-based offsets, so the first letter of a sequence is at position 0.

I am reading from a file held inside the script (a here-document opened through a scalar reference) just to keep things tidy on my system, but the code will work just as well reading from STDIN. The code:

use 5.026;
use warnings;

open my $dnaFH, q{<}, \ <<__EOD__ or die $!;
TAAGAACAATAAGAACAAGAACAATAA
GAACAATAAGXAATAAGAAXXAACAAGAACAATAA
ACAATAAAAGAACAATAAGAA
__EOD__

while ( my $sequence = <$dnaFH> )
{
    chomp $sequence;
    my $length = length $sequence;
    say qq{Sequence: $sequence -- Length $length};
    if ( $sequence =~ m{^[ACGT]+$} )
    {
        say q{     Sequence is GOOD!};
    }
    else
    {
        # Zero-width look-ahead: pos is the offset of each bad letter.
        my @badPosns;
        push @badPosns, pos $sequence
           while $sequence =~ m{(?x) (?= ( [^ACGT] ) )}g;
        my $nBad = scalar @badPosns;
        my $perc = sprintf q{%.2f}, $nBad / $length * 100;
        say qq{     Sequence is BAD at @badPosns};
        say qq{     $nBad bad positions, $perc\% of total};
    }
}

close $dnaFH or die $!;

The output.

Sequence: TAAGAACAATAAGAACAAGAACAATAA -- Length 27
     Sequence is GOOD!
Sequence: GAACAATAAGXAATAAGAAXXAACAAGAACAATAA -- Length 35
     Sequence is BAD at 10 19 20
     3 bad positions, 8.57% of total
Sequence: ACAATAAAAGAACAATAAGAA -- Length 21
     Sequence is GOOD!
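
On the file held inside the script: here is a minimal sketch of the same idea in isolation, opening a read handle on a scalar reference. The $data string and its two sequences are made up for illustration and stand in for a real input file.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for a real input file.
my $data = "ACGT\nACXT\n";

# Opening on a scalar reference gives an in-memory file handle.
open my $fh, q{<}, \$data or die $!;

my @lines;
while ( my $line = <$fh> )
{
    chomp $line;
    push @lines, $line;
}

close $fh or die $!;

print scalar( @lines ), qq{ lines read\n};
```

To read from STDIN instead, drop the open and close and loop over <STDIN> rather than <$fh>; the body of the loop stays the same.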

I hope this is helpful. Please ask further if you need more help.

Update: There was a mistake in the code. I should have used a look-ahead assertion because, without it, pos gives the position just after the match rather than that of the match itself. I also added extended syntax, (?x), to make the regex clearer. My bad :-(

Update 2: I should also have corrected the output, now done.
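
To illustrate the point of the first update, here is a small sketch comparing the two behaviours of pos on a made-up short sequence (not one from the data above). It also shows $-[0], the start offset of the last successful match, as one possible alternative to the look-ahead; that alternative is my suggestion here rather than something the script uses.

```perl
use strict;
use warnings;

# Made-up sequence: the bad letters sit at offsets 4, 7 and 8.
my $seq = q{GAACXAAXXA};

# Consuming match: pos reports the offset just past each bad letter.
my @after;
push @after, pos $seq while $seq =~ m{[^ACGT]}g;

# Look-ahead: nothing is consumed, so pos is the offset of the bad letter.
my @at;
push @at, pos $seq while $seq =~ m{(?=[^ACGT])}g;

# Alternative: $-[0] is the start offset of the last successful match.
my @start;
push @start, $-[0] while $seq =~ m{[^ACGT]}g;

print qq{after: @after\n};   # after: 5 8 9
print qq{at:    @at\n};      # at:    4 7 8
print qq{start: @start\n};   # start: 4 7 8
```

All three lists use zero-based offsets; add 1 to each if you want to report positions counting from 1.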

Cheers,

JohnGG
