The longest common substring algorithm on that page takes O(m * n) time but requires O(m * n) space as well. Contrast that with the naive solution, which takes O(m * m * n) time but only O(m + n) memory.
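For comparison, here is a rough sketch of what I mean by the naive solution: try every pair of start positions and extend the match as far as it goes. The name `lcs_naive` and its out-parameters are made up for illustration; offsets are 0-based.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: brute-force longest common substring.
   No table is needed, only a few counters besides the two strings.
   Returns the match length; *off_a and *off_b get the 0-based starts. */
size_t lcs_naive(const char *a, const char *b, size_t *off_a, size_t *off_b)
{
    assert(a != NULL && b != NULL);
    size_t m = strlen(a), n = strlen(b);
    size_t best = 0;
    *off_a = 0;
    *off_b = 0;
    for (size_t i = 0; i < m; i++) {          /* every start in a */
        for (size_t j = 0; j < n; j++) {      /* every start in b */
            size_t k = 0;
            /* extend the match while both strings agree */
            while (i + k < m && j + k < n && a[i + k] == b[j + k])
                k++;
            if (k > best) {
                best = k;
                *off_a = i;
                *off_b = j;
            }
        }
    }
    return best;
}
```

The inner `while` is the extra factor in the running time; the table-based version trades that loop for memory.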
That said, since the OP has strings of length 3000 characters, we're looking at only 3000 * 3000 * sizeof(uint16_t) = 18 megabytes of space. If the strings were, say, 100k each, the same table would need about 20 gigabytes (and cells wider than uint16_t), so we'd have problems.
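If memory ever did become the problem, one possible way out is that each row of the table depends only on the row before it, so two rows are enough. A minimal sketch under that assumption follows; `lcs_two_rows` is a hypothetical name, the cells are `size_t` so long strings don't overflow them, and offsets are 0-based. The running time is still O(m * n).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: same recurrence as the full table, but only the
   previous and current rows are kept, so memory is two rows of n + 1 cells.
   Returns the match length, or -1 if allocation fails. */
long lcs_two_rows(const char *a, const char *b, size_t *off_a, size_t *off_b)
{
    assert(a != NULL && b != NULL);
    size_t m = strlen(a), n = strlen(b);
    size_t *prev = calloc(n + 1, sizeof *prev);  /* row i-1, zero-filled */
    size_t *cur  = calloc(n + 1, sizeof *cur);   /* row i; cur[0] stays 0 */
    if (prev == NULL || cur == NULL) {
        free(prev);
        free(cur);
        return -1;
    }
    size_t best = 0;
    *off_a = 0;
    *off_b = 0;
    for (size_t i = 1; i <= m; i++) {
        for (size_t j = 1; j <= n; j++) {
            if (a[i - 1] == b[j - 1]) {
                cur[j] = prev[j - 1] + 1;        /* extend the diagonal run */
                if (cur[j] > best) {
                    best = cur[j];
                    *off_a = i - best;           /* run ends at a[i-1] */
                    *off_b = j - best;
                }
            } else {
                cur[j] = 0;
            }
        }
        size_t *tmp = prev;                      /* current row becomes previous */
        prev = cur;
        cur = tmp;
    }
    free(prev);
    free(cur);
    return (long)best;
}
```

For two 100k strings that is two rows of 100,001 cells, on the order of a megabyte or two instead of gigabytes.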
So, the name of this site notwithstanding, here's some C code, poorly tested
Maybe this would be a good time for me to learn Inline::C.
#! perl -slw
use strict;
#use Inline 'INFO';
use Inline C => 'DATA', NAME => 'LCS', CLEAN_AFTER_BUILD => 1;
my( $len, $offset0, $offset1 ) = LCS( @ARGV );
$ARGV[ 0 ] =~ s[(.{$offset0})(.{$len})][$1<$2>];
$ARGV[ 1 ] =~ s[(.{$offset1})(.{$len})][$1<$2>];
print for @ARGV;
__END__
[ 9:10:28.57] P:\test>DynLCS-C hello aloha
hel<lo>
a<lo>ha
__C__
#define IDX( x, y ) (((y) * an)+(x))
/*
LONGEST COMMON SUBSTRING(A,m,B,n)
    for i := 0 to m do L[i,0] := 0
    for j := 0 to n do L[0,j] := 0
    len := 0
    answer := <0,0>
    for i := 1 to m do
        for j := 1 to n do
            if A[i] != B[j] then
                L[i,j] := 0
            else
                L[i,j] := 1 + L[i-1,j-1]
                if L[i,j] > len then
                    len := L[i,j]
                    answer = <i,j>
*/
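/* Returns ( length, offset in a, offset in b ) of the longest common substring. */
void LCS ( char* a, char* b ) {
    Inline_Stack_Vars;
    int an = strlen( a );
    int bn = strlen( b );
    int *L;
    int len = 0;
    int answer[2] = { 0, 0 };
    int i, j;

    /* Zero-filled an x bn table; L[ IDX(i,j) ] is the length of the
       common run ending at a[ i ] and b[ j ]. */
    Newz( 42, L, an * bn, int );

    /* C strings are 0-based: start at 0, or the first characters are never compared. */
    for( i = 0; i < an; i++ ) {
        for( j = 0; j < bn; j++ ) {
            if( a[ i ] != b[ j ] ) {
                L[ IDX(i,j) ] = 0;
            }
            else {
                /* Row 0 and column 0 have no diagonal predecessor. */
                L[ IDX(i,j) ] = ( i && j ) ? 1 + L[ IDX(i-1, j-1) ] : 1;
                if( L[ IDX(i,j) ] > len ) {
                    len = L[ IDX(i,j) ];
                    answer[ 0 ] = i;
                    answer[ 1 ] = j;
                }
            }
        }
    }
    Safefree( L );

    /* answer[] holds the 0-based end of the match; convert to start offsets. */
    Inline_Stack_Reset;
    Inline_Stack_Push(sv_2mortal(newSViv( len )));
    Inline_Stack_Push(sv_2mortal(newSViv( answer[ 0 ] - len + 1 )));
    Inline_Stack_Push(sv_2mortal(newSViv( answer[ 1 ] - len + 1 )));
    Inline_Stack_Done;
}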