http://qs321.pair.com?node_id=11113037


in reply to Find element in array

Unlike in C, strings and arrays are quite different things in Perl, and you usually have a choice of which to use. String solutions are usually easier and almost always execute faster. Here is a string-only solution to your problem.
use strict;
use warnings;

my $DNA = "ATATCCCGATCAGG3TT!GCA\n";
chomp $DNA;
print "The length of the sequence is:\n", length($DNA), "\n";

my $nucleotideDNA = $DNA;
#my $count = $nucleotideDNA =~ tr/ATCG]//c;  # Remove and count invalids
my $count = $nucleotideDNA =~ tr/ATCG//cd;   # Remove and count invalids

my $locations;    # Find location of invalids in original string
$locations .= "$-[0], " while ( $DNA =~ /[^ATCG]/g );
print "There are $count non-valid nucleotides at locations:\n$locations \n";

OUTPUT:
The length of the sequence is:
21
There are 2 non-valid nucleotides at locations:
14, 17,
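A possible variation, offered only as a sketch: collecting the offsets in a list and joining them avoids the trailing comma in the output. The name @locations is my own choice; the technique ($-[0] after each successful match of a /g regex in scalar context) is the same as above.

```perl
use strict;
use warnings;

my $DNA = 'ATATCCCGATCAGG3TT!GCA';

# Collect the zero-based offset of every character that is not A, T, C or G.
# $-[0] holds the start offset of the most recent successful match.
my @locations;
push @locations, $-[0] while $DNA =~ /[^ATCG]/g;

# tr/// with /c and no /d only counts; $DNA itself is left unchanged.
my $count = ( $DNA =~ tr/ATCG//c );

print "There are $count non-valid nucleotides at locations:\n";
print join( ', ', @locations ), "\n";
```

Since nothing has to be removed to report the count and the positions, this version works on $DNA directly and needs no copy.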

UPDATE: Modified one line of code to correct the errors identified by AnomalousMonk (below). The original remains as a comment.

Bill

Re^2: Find element in array
by AnomalousMonk (Archbishop) on Feb 17, 2020 at 06:23 UTC
    my $count = $nucleotideDNA =~ tr/ATCG]//c;  # Remove and count invalids

    Sofie: Note that while this tr/// (see Quote-Like Operators in perlop) expression counts the number of characters that are not ATCG, it does not remove anything; the string is not changed (update: nor is there any need for change):

    c:\@Work\Perl\monks>perl -wMstrict -le
    "my $DNA = 'ATATCCCGATCAGG3TT!GCA';
     ;;
     my $nucleotideDNA = $DNA;
     my $count = $nucleotideDNA =~ tr/ATCG//c;
     ;;
     print $DNA;
     print $nucleotideDNA;
     ;;
     print 'sequences are equal' if $DNA eq $nucleotideDNA;
    "
    ATATCCCGATCAGG3TT!GCA
    ATATCCCGATCAGG3TT!GCA
    sequences are equal
    Also note that there is a ] character in the tr/// search set that should not be there.
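To make both points concrete, here is a small sketch comparing the tr/// variants side by side. The variable names and the short test string 'AT]CG' are mine, chosen only for illustration.

```perl
use strict;
use warnings;

my $DNA = 'ATATCCCGATCAGG3TT!GCA';

# /c complements the search list. With an empty replacement list and no /d,
# tr/// only counts the matching characters.
my $counted = $DNA;
my $n_count = ( $counted =~ tr/ATCG//c );

# Adding /d deletes the counted characters from the target string.
my $cleaned   = $DNA;
my $n_deleted = ( $cleaned =~ tr/ATCG//cd );

print "count only:       $n_count, string is now $counted\n";
print "count and delete: $n_deleted, string is now $cleaned\n";

# A stray ] in the search list is treated as one more valid character,
# so a literal ] in the data is not counted as invalid.
my $sample         = 'AT]CG';
my $with_bracket   = ( $sample =~ tr/ATCG]//c );
my $without_braket = ( $sample =~ tr/ATCG//c );
print "with ] in set: $with_bracket, without: $without_braket\n";
```

Both forms return the same count; only the /d form changes the string it is bound to.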

    However, I agree with the main point that BillKSmith is making: string operations with regexes or with operators like substr and index will tend to be significantly faster (update: and to consume significantly less memory) than equivalent array operations.
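For anyone who wants to check the speed claim on their own machine, a rough sketch using the core Benchmark module follows. The two subroutine names are hypothetical, and the array version is just one plausible way to write it; timings will vary with Perl version and input length, so none are quoted here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $DNA = 'ATATCCCGATCAGG3TT!GCA';

# String approach: let the regex engine walk the string.
sub invalid_by_string {
    my ($seq) = @_;
    my @offsets;
    push @offsets, $-[0] while $seq =~ /[^ATCG]/g;
    return @offsets;
}

# Array approach: split into one-character elements, then test each one.
sub invalid_by_array {
    my ($seq) = @_;
    my @chars = split //, $seq;
    return grep { $chars[$_] !~ /[ATCG]/ } 0 .. $#chars;
}

# Both approaches should report the same offsets.
my @by_string = invalid_by_string($DNA);
my @by_array  = invalid_by_array($DNA);
print "string: @by_string\n";
print "array:  @by_array\n";

# Run each version for at least one CPU second and print a comparison table.
cmpthese( -1, {
    string => sub { my @r = invalid_by_string($DNA) },
    array  => sub { my @r = invalid_by_array($DNA) },
} );
```

A 21-character sequence is too short to say much; repeating the string a few thousand times with the x operator gives a more realistic comparison.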


    Give a man a fish:  <%-{-{-{-<