Syntactic Confectionery Delight PerlMonks

### Re^2: longest common substring

by lima1 (Curate)
 on Jan 02, 2008 at 18:41 UTC ( #660029=note: print w/replies, xml ) Need Help??

in reply to Re: longest common substring

For more than 2 strings, I'd also suggest Tree::Suffix (although I've never used this module. It is just in theory the best algorithm).

Replies are listed 'Best First'.
Re^3: longest common substring
by Limbic~Region (Chancellor) on Jan 02, 2008 at 19:33 UTC
lima1,
The original thread I referenced was for longest common subsequence. I still have yet to find an implementation for finding the longest common subsequences for > 2 strings other than the one I posted. I adapted it for common substrings to show it was possible. I would definitely use and recommend Tree::Suffix now that I know about it. I mentioned my home grown implementation in this thread in case the XS version of the module flaviusm was using was also broken. Thanks!

Cheers - L~R

I think the best exact solutions for the longest common subsequence problem are the dynamic programming ones. But this is only reasonable for 2-4 strings (0(n^2) to O(n^4)). So people use heuristics. A trivial one is to calculate a simple (tree) clustering with something like UPGMA. For example:
```\$a='AAAB';
\$b='AABB';
\$c='ABBB';
\$d='BBBB';

+---\$a
+----+
|    +---\$b
|
+
|
|    +---\$c
+----+
+---\$d
Then you calculate the subsequences of the leafs (a,b) and (c,d) and in the next step (according some special rules) the subsequences of the subsequences.

Create A New User
Node Status?
node history
Node Type: note [id://660029]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?