If you followed the reasoning behind my Longest Common Subsequence solution, it should not be hard to see how to adapt it to the Longest Common Substring.
Once we have the list of character positions across the strings, we can build ordered piles in which each mapping differs from its adjacent neighbors by exactly 1 in every position. There is no longer any need to loop looking for greater mappings, because the values the next mapping must contain can be predicted. Then you just need to find the largest pile, which represents the LCS (here S = substring).
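To make the idea of a "mapping" concrete, here is a minimal sketch restricted to two strings and core Perl only. The helper name `position_map` and the sample strings are my own illustration, not part of the posted solution; it simply records every pair of positions at which the two strings share a character.

```perl
use strict;
use warnings;

# Hypothetical helper (illustration only): return a hash whose keys are
# "i-j" position pairs and whose values are the character found at
# position i of $s and position j of $t.
sub position_map {
    my ($s, $t) = @_;
    my %map;
    for my $i (0 .. length($s) - 1) {
        my $c = substr $s, $i, 1;
        my $j = -1;
        # index() returns -1 once there are no further occurrences
        while (($j = index $t, $c, $j + 1) >= 0) {
            $map{"$i-$j"} = $c;
        }
    }
    return %map;
}

my %map = position_map('xabc', 'abcy');
print join(', ', map { "$_=$map{$_}" } sort keys %map), "\n";
# prints: 1-0=a, 2-1=b, 3-2=c
```

Each key here is the previous one with 1 added to every position, which is exactly the property that lets the next mapping be predicted rather than searched for.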
Arrays are replaced with hashes. Picking a hash key at random, you place it in a new pile and then predict the next mapping (every position plus 1). If that key exists, you place it on the same pile. You repeat this until the predicted mapping no longer exists. You then go back to the starting key and look for smaller items (every position minus 1) in the same way. Once no more items fit in that pile, you move on to the next hash key.
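The pile-growing step on its own can be sketched as follows. This is a simplified illustration under my own assumptions: the helper name `longest_run` is hypothetical, the map is hard-coded for the two strings "xabc" and "abcy", and only the best pile is kept rather than all of them.

```perl
use strict;
use warnings;

# Hypothetical helper (illustration only): given a hash of
# "p0-p1-..." position tuples => character, grow a pile from every key
# by predicting its neighbours, and return the longest pile found.
sub longest_run {
    my %map  = @_;
    my $best = '';
    for my $seq (keys %map) {
        my $pile = $map{$seq};
        for my $dir (1, -1) {                  # forwards, then backwards
            my @offset = split /-/, $seq;      # restart from the seed key
            while (1) {
                $_ += $dir for @offset;        # predicted neighbour
                my $next = join '-', @offset;
                last unless exists $map{$next};
                $pile = $dir > 0
                    ? $pile . $map{$next}
                    : $map{$next} . $pile;
            }
        }
        $best = $pile if length $pile > length $best;
    }
    return $best;
}

# Mappings for "xabc" and "abcy": a, b and c line up on one diagonal
my %map = ('1-0' => 'a', '2-1' => 'b', '3-2' => 'c');
print longest_run(%map), "\n";   # prints: abc
```

Whichever key the loop happens to start from, the forward and backward passes together recover the whole run, so the random ordering of `keys` does not affect the result.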
use strict;
use warnings;
use Algorithm::Loops 'NestedLoops';
use List::Util 'reduce';

my @str = map {chomp; $_} <DATA>;
print LCS(@str), "\n";

sub LCS {
    my @str = @_;

    # For each string, record every position at which each character occurs
    my @pos;
    for my $i (0 .. $#str) {
        my $line = $str[$i];
        for (0 .. length($line) - 1) {
            my $char = substr($line, $_, 1);
            push @{$pos[$i]{$char}}, $_;
        }
    }

    # Only characters of the shortest string can be common to all strings
    my $sh_str = reduce {length($a) < length($b) ? $a : $b} @str;

    # Map every combination of positions ("p0-p1-...") to its character
    my %map;
    CHAR:
    for my $char (split //, $sh_str) {
        my @loop;
        for (0 .. $#pos) {
            next CHAR if ! $pos[$_]{$char};   # char missing from a string
            push @loop, $pos[$_]{$char};
        }
        my $next = NestedLoops([@loop]);
        while (my @char_map = $next->()) {
            my $key = join '-', @char_map;
            $map{$key} = $char;
        }
    }

    # Grow a pile from each key: forwards (+1) first, then backwards (-1)
    my @pile;
    for my $seq (keys %map) {
        push @pile, $map{$seq};
        for (1 .. 2) {
            my $dir = $_ % 2 ? 1 : -1;
            my @offset = split /-/, $seq;
            $_ += $dir for @offset;           # predict the neighbouring key
            my $next = join '-', @offset;
            while (exists $map{$next}) {
                $pile[-1] = $dir > 0
                    ? $pile[-1] . $map{$next}
                    : $map{$next} . $pile[-1];
                $_ += $dir for @offset;
                $next = join '-', @offset;
            }
        }
    }

    # The longest pile is the longest common substring
    return reduce {length($a) > length($b) ? $a : $b} @pile;
}
__DATA__
qwertyJoyzhnac
Jyzshuaqwertyb
Joyzqwertybbc
Yqwertyblah