http://qs321.pair.com?node_id=1218349


in reply to compare initial

First, you'll want to use eq when testing equality of strings in Perl and == when testing equality of numbers. Second, $a and $b are special variables in Perl used in sorting, so don't name normal things that way.

Your specification isn't quite clear. It seems to me you're looking at generations and you want to know when generation n+1 shows a mutation of one single base from generation n. There's https://bioperl.org/ for that sort of work if you need it. There are set libraries as BillKSmith mentioned as well in case that's what you need.

As far as your specific case, if you're indeed wanting to compare A to B and B to C but not A to C, I have some working code I threw together for you. Some loops (whether explicit in your code or implicit in the language or in a library beneath your code) are necessary because you're doing repeated checks over different combinations of inputs. One can try to minimize the number of such loops, but loops (possibly mixed with recursion) is how these combinations get checked. If what you actually need is every string tested against every other string then you need even another loop and, yes, it will complete even more slowly given all else stays equal. Forking, threading, IPC, flow-based programming, and similar topics are unnecessary for three strings of seven characters, but enough very long strings might require something other than a simple single-threaded, single process approach to run in an acceptable amount of time.

This code is quite particular to the problem as I restated it. There's probably a dozen better ways to do this, and some of those will be more generalizable.

#!perl use strict; use warnings; my $difference = 1; my @generations = ( 'TTTATTT', 'TTTTTTTT', 'TBTTTTT' ); my %mismatch; my $times = $difference == 1 ? 'time' : 'times'; my $columns = $difference == 1 ? 'column: ' : "columns:\n"; my $length = length $generations[0]; for ( my $column = 0 ; $column < $length ; $column++ ) { for ( my $gen = 1 ; $gen < scalar @generations ; $gen++ ) { my $last = substr $generations[ $gen-1 ], $column, 1; my $current = substr $generations[ $gen ], $column, 1; push @{ $mismatch{ $gen } }, [ $column, $last, $current ] unle +ss $last eq $current; } } my $out = ''; for my $mm ( sort keys %mismatch ) { if ( scalar @{ $mismatch{ $mm } } == $difference ) { $out .= sprintf "mismatched %d %s between gens %d and %d at %s +", $difference, $times, $mm, $mm-1, $columns; foreach my $c_l_c ( @{ $mismatch{ $mm } } ) { $out .= sprintf " %d ( %s and %s ),\n", @$c_l_c; } $out .= "\n"; } } print $out;

If you need the amount of difference to change or the input strings to change, that's not at all difficult. They could come from the command line or configuration files or a data file of some sort.