The 'classic' Perlish approach to this type of problem uses bitwise string operations. XORing two equal-length strings yields a string, $diff, that holds a NUL (\x00) at every position where the characters match and a non-NUL byte wherever they differ. Runs of non-NUL bytes in $diff give the offsets and lengths of the differences, and $diff can also be turned into a mask that extracts the differing sub-strings from the original strings.
use warnings;
use strict;
my $s1 = 'ACTGGACGTATGCA';
my $s2 = 'AGTG-ACGC-CGCA';
my $diff = $s1 ^ $s2;
my @dpos;
push @dpos, [ $-[1], $+[1] - $-[1] ]
    while $diff =~ m{ ([^\x00]+) }xmsg;
print qq{diff at offset $_->[0], length $_->[1] \n}
    for @dpos;
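# Build a mask: \xff where the strings differ, \x00 where they match.
(my $mask = $diff) =~ tr{\x00}{\xff}c;
# ANDing with the mask keeps the differing characters and NULs out the rest.
$s1 &= $mask;
$s2 &= $mask;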
my $differences = qr{ [^\x00]+ }xms;
@dpos = ();
while ($s1 =~ m{ ($differences) }xmsg) {
    # this code produces same result
    # my @diff_data = ($1);
    # $s2 =~ m{ ($differences) }xmsg;
    # push @diff_data, $1, $-[1];
    # push @dpos, \@diff_data;
    push @dpos,
        [ $1, do { $s2 =~ m{ ($differences) }xmsg && $1, $-[1] } ]
        ;
}
print qq{@$_ \n} for @dpos;
Output:
diff at offset 1, length 1
diff at offset 4, length 1
diff at offset 8, length 3
C G 1
G - 4
TAT C-C 8
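To make the XOR string easier to picture, here is a minimal sketch that renders it as a line of markers under the two sequences. The helper name diff_markers is hypothetical (it is not part of the code above), and the sketch assumes both arguments are plain strings of equal length and that the 'bitwise' feature is not enabled, so that ^ acts as the string operator.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: returns a string with '^'
# at each position where the two strings differ and '.' where they match.
sub diff_markers {
    my ($s1, $s2) = @_;
    die "strings must be the same length\n"
        unless length($s1) == length($s2);
    my $diff = $s1 ^ $s2;      # NUL where chars match, non-NUL where they differ
    $diff =~ tr{\x00}{^}c;     # every non-NUL byte becomes '^'
    $diff =~ tr{\x00}{.};      # every NUL byte becomes '.'
    return $diff;
}

my $s1 = 'ACTGGACGTATGCA';
my $s2 = 'AGTG-ACGC-CGCA';
print "$s1\n$s2\n", diff_markers($s1, $s2), "\n";
```

For the two sequences above, the marker line printed is .^..^...^^^... which lines up with the offsets 1, 4 and 8 reported in the output.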
See @- and @+ in perlvar, also Bitwise Or and Exclusive Or and Bitwise And in perlop.
BrowserUk is very good on this general topic.
Update: Added better code example, doc links. And thanks to ELISHEVA.
Update: Fixed @- link above. What was I thinking?