If we're going for raw speed: don't use Perl. :)
If I were doing this in assembly and wanted raw speed, I'd:
- Generate all of the possible combinations and their sorted values, like so: AA => AA, AB => AB, BA => AB, AC => AC, CA => AC.
- Generate code (don't write it by hand!) along the lines of the following pseudocode, which handles aa, ab, ba, and bb:
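    sub sort_pair {
        my ($base) = @_;
        # First character is 'a': the sorted result is 'aa' or 'ab'.
        if (substr($base, 0, 1) eq 'a') {
            if (substr($base, 1, 1) eq 'a') {
                return 'aa';
            }
            if (substr($base, 1, 1) eq 'b') {
                return 'ab';
            }
        }
        # First character is 'b': the sorted result is 'ab' or 'bb'.
        if (substr($base, 0, 1) eq 'b') {
            if (substr($base, 1, 1) eq 'a') {
                return 'ab';
            }
            if (substr($base, 1, 1) eq 'b') {
                return 'bb';
            }
        }
        return;    # input outside the alphabet
    }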
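Here's a rough sketch of what "generate it, don't write it by hand" could look like in Perl. The names gen_tree and make_sorter are made up for illustration, and it assumes the alphabet is plain letters (nothing that needs quoting inside a single-quoted string). It builds the nested-if source for any alphabet and length, then compiles it with a string eval:

```perl
use strict;
use warnings;

# Hypothetical helper: emit Perl source for a nested-if decision tree that
# returns the sorted form of any string of length $n over @$alphabet.
sub gen_tree {
    my ($alphabet, $n, $prefix, $depth) = @_;
    $prefix //= '';
    $depth  //= 0;
    my $pad = '    ' x ($depth + 1);

    # Leaf: every position has been tested, so the sorted answer is known.
    if ($depth == $n) {
        my $sorted = join '', sort split //, $prefix;
        return "${pad}return '$sorted';\n";
    }

    # One "if" per letter at this position, each wrapping the next level.
    my $code = '';
    for my $letter (@$alphabet) {
        $code .= "${pad}if (substr(\$base, $depth, 1) eq '$letter') {\n";
        $code .= gen_tree($alphabet, $n, $prefix . $letter, $depth + 1);
        $code .= "${pad}}\n";
    }
    return $code;
}

# Hypothetical helper: wrap the generated tree in an anonymous sub and compile it.
sub make_sorter {
    my ($alphabet, $n) = @_;
    my $src = "sub {\n    my (\$base) = \@_;\n"
            . gen_tree($alphabet, $n)
            . "    return;\n}\n";    # falls through for input outside the alphabet
    my $sorter = eval $src or die "generated code failed to compile: $@";
    return $sorter;
}

my $sort_pair = make_sorter([qw(a b)], 2);
print $sort_pair->('ba'), "\n";   # prints "ab"
```

Note that the generated source has q**n leaves, so this only makes sense for short codes over small alphabets.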
That means that for any code of length n over an alphabet of size q, there are at most n*q comparisons/jumps to make in the worst case. (For example, testing letters in the order A, C, G, T, AGCA would be translated to AACG using only 7 comparisons and jumps in total: 1 + 3 + 2 + 1.)
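To sanity-check that arithmetic, here's a small sketch that counts the comparisons. count_comparisons is a made-up helper, and it assumes each position is tested against the alphabet in the order given until a letter matches:

```perl
use strict;
use warnings;

# Hypothetical helper: count the comparisons a nested-if tree performs,
# assuming each position is tested against @alphabet in order until a match.
sub count_comparisons {
    my ($base, @alphabet) = @_;
    my %rank;
    @rank{@alphabet} = (1 .. scalar @alphabet);   # 'A' => 1, 'C' => 2, ...
    my $total = 0;
    $total += $rank{$_} for split //, $base;      # a letter at rank r costs r tests
    return $total;
}

print count_comparisons('AGCA', qw(A C G T)), "\n";   # prints 7
print count_comparisons('TTTT', qw(A C G T)), "\n";   # prints 16, the n*q worst case
```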
I'm fairly confident that this would outperform any solution using a hash or a split/join/sort. At least, in assembler. I'm just a little too harried to write code to prove that it might be faster in Perl.
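For anyone who wants to measure it, here's a minimal harness for the two baselines mentioned above, using the core Benchmark module. The sub names, the 2-letter code length, and the A/C/G/T alphabet are assumptions for illustration; a generated decision-tree sub could be added as a third entry. No timings are claimed here, so run it and see:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Baseline 1: split into characters, sort, and rejoin.
sub sort_split { return join '', sort split //, $_[0] }

# Baseline 2: precomputed hash from every 2-letter code to its sorted form.
my @alphabet = qw(A C G T);
my %sorted_for;
for my $x (@alphabet) {
    for my $y (@alphabet) {
        $sorted_for{"$x$y"} = sort_split("$x$y");
    }
}
sub sort_hash { return $sorted_for{ $_[0] } }

# Run each candidate for about one CPU second and print a comparison table.
cmpthese(-1, {
    split_sort => sub { sort_split('CA') },
    hash       => sub { sort_hash('CA') },
});
```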