I'm not sure this is exactly what you're looking for, but you can find longest common subsequences with Algorithm::Diff. From that, you could work out the total length of the differing stretches and divide it by the total length. (If you have trouble understanding the documentation for Algorithm::Diff, I wrote a module review which may help.)
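To make the "differing length over total length" idea concrete, here is a minimal sketch that does not depend on Algorithm::Diff at all. It computes the LCS length with a plain dynamic-programming table. The names lcs_length and diff_ratio are made up for this example, and dividing by the longer string's length is my assumption about what "total length" should mean.

```perl
use strict;
use warnings;

# Hypothetical helper: length of the longest common subsequence of two
# strings, using the textbook DP recurrence with only two rows kept.
sub lcs_length {
    my ($s, $t) = @_;
    my @a = split //, $s;
    my @b = split //, $t;
    my @prev = (0) x (scalar(@b) + 1);
    for my $i (1 .. scalar @a) {
        my @curr = (0);
        for my $j (1 .. scalar @b) {
            if ($a[$i - 1] eq $b[$j - 1]) {
                $curr[$j] = $prev[$j - 1] + 1;
            }
            else {
                $curr[$j] = $prev[$j] > $curr[$j - 1] ? $prev[$j] : $curr[$j - 1];
            }
        }
        @prev = @curr;
    }
    return $prev[-1];
}

# Hypothetical scoring: the fraction of the longer string that is NOT
# covered by the LCS (0 means identical, 1 means nothing in common).
sub diff_ratio {
    my ($s, $t) = @_;
    my $total = length($s) > length($t) ? length($s) : length($t);
    return 0 unless $total;
    return ($total - lcs_length($s, $t)) / $total;
}

printf "%.3f\n", diff_ratio('ACGTACGT', 'ACGTTCGT');
```

If you do use Algorithm::Diff, its LCS function could stand in for the hand-rolled helper; the scoring part would stay the same.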
If the code snippet you have above is an accurate description of what you're trying to calculate, it may be faster (though more memory-intensive) to split the strings into arrays and compare one element at a time, rather than calling substr over and over. Something like this (note: untested):
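# Split each string into one array element per character
my @ref_elems  = split //, $ref_seq;
my @test_elems = split //, $test_seq;
my $score = 0;
# $len is assumed to be set already, as in your snippet
for (my $i = 0; $i < $len; $i++)
{
    # eq yields 1 on a match and an empty string (numerically 0) otherwise
    $score += $ref_elems[$i] eq $test_elems[$i];
}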
Once you have the sequences in arrays, you can use all kinds of nifty techniques like mapcar, which can traverse both arrays in one neat statement. The top of that node has a very clear explanation of how to use it.
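As a rough illustration of the "one statement" idea, here is a sketch that does not use mapcar itself. It uses the built-in grep over the index range as a stand-in. The sub name count_matches and the sample strings are made up for the example, and stopping at the shorter string is my assumption about how unequal lengths should be handled.

```perl
use strict;
use warnings;

# Hypothetical helper: count the positions at which two strings hold the
# same character, comparing only as far as the shorter string goes.
sub count_matches {
    my ($ref_seq, $test_seq) = @_;
    my @ref_elems  = split //, $ref_seq;
    my @test_elems = split //, $test_seq;
    my $len = @ref_elems < @test_elems ? scalar(@ref_elems) : scalar(@test_elems);

    # grep in scalar context returns how many indices passed the test,
    # so the whole comparison loop collapses into a single statement.
    my $score = grep { $ref_elems[$_] eq $test_elems[$_] } 0 .. $len - 1;
    return $score;
}

# Sample data, not from the original question.
print count_matches('ACGTACGT', 'ACGTTCGA'), " positions match\n";
```

Whether this beats the explicit loop for speed is something you'd have to benchmark on your own data.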
HTH