I suspect there is no "good and fast" solution. To me, this problem has the flavor of a permutation or "traveling salesman"
problem, and it is probably NP-hard (or in some related complexity class).
So here's an attempt (with a cheat) that at least finds a solution in reasonable time for this input.
#!/usr/bin/perl
use strict; # https://perlmonks.org/?node_id=11118281
use warnings;
my @list=("set abcde-efghi 12345",
"set abcde-ijkl 12345",
"clr abcde-efghi+123",
"clr abcde-ijkl 12345");
#my @expected_substrings=("set","clr"," abcde-","efghi",
# "ijkl"," 12345","+123");
#my $len=@expected_substrings*2;
#$len+=length($_) foreach @expected_substrings;
$_ = join "\n", @list;
my $max = 3; ########################################### BIG CHEAT FOR RUNTIME
sub score { 2 * @_ + length join '', @_; }
my $best = score( @list );
try( $_ );
print "\n";
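sub try
  {
  (local $_, my @sofar) = @_;
  # Nothing printable left: every character has been replaced by a tab,
  # so @sofar is a complete set of substrings.
  if( !/[ -~]/ )
    {
    my $score = score @sofar;
    if( $score < $best )
      {
      print "\n";
      use Data::Dump 'dd'; dd $score, @sofar;
      $best = $score;
      }
    return;
    }
  # Prune: the score only grows, so this branch cannot beat the best so far.
  score(@sofar) >= $best and return;
  # Try substrings with the most repeats first ($n further occurrences).
  for my $n ( reverse 0 .. $#list )
    {
    my %d;
    # Collect every run of 3+ printable characters that occurs again $n more
    # times; (*FAIL) forces backtracking so all candidates are recorded.
    /([ -~]{3,})(?:.*?\1){$n}(?{ $d{$1}++ })(*FAIL)/s;
    my @d = sort { length $b <=> length $a } sort keys %d;
    # The cheat: keep only the $max longest candidates.
    @d > $max and $#d = $max - 1;
    for my $string ( @d )
      {
      # Replace every occurrence with a tab (not printable) and recurse.
      try( s/\Q$string\E/\t/gr, @sofar, $string );
      }
    }
  }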
Outputs:
(46, " abcde-", " 12345", "efghi", "ijkl", "clr", "set", "+123")
where the "46" is the score and the rest are the substrings.
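As a sanity check on that number, the score can be recomputed by hand with the same formula as score() in the script: two per substring plus the total length of all substrings. This is only a minimal sketch; the variable names (@found, $count, $length) are mine and not part of the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Substrings copied from the output above.
my @found = (" abcde-", " 12345", "efghi", "ijkl", "clr", "set", "+123");

# Same formula as score() in the script: 2 per substring plus total length.
my $count  = @found;                   # number of substrings
my $length = length join '', @found;   # total characters in all substrings
my $score  = 2 * $count + $length;

print "$count substrings, $length characters, score $score\n";
# prints "7 substrings, 32 characters, score 46"
```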
As it finds better scores, it will print them, but so far it always seems to find a best solution first.
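The printed result can also be checked against the input independently of the search. The sketch below, which assumes the same four input lines and uses my own variable names, replaces each reported substring with a tab (the same trick the script uses) and then confirms that no printable character is left over.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Input lines, as in the script.
my @list = ("set abcde-efghi 12345",
            "set abcde-ijkl 12345",
            "clr abcde-efghi+123",
            "clr abcde-ijkl 12345");

# Substrings copied from the printed result.
my @found = (" abcde-", " 12345", "efghi", "ijkl", "clr", "set", "+123");

my $text = join "\n", @list;

# Replace every occurrence of each substring with a tab, which is
# outside the printable range [ -~] that the script tests for.
for my $string (@found) {
    $text =~ s/\Q$string\E/\t/g;
}

# If nothing printable remains, the substrings cover the whole input.
my $covered = $text !~ /[ -~]/ ? 1 : 0;
print $covered ? "every character is covered\n" : "something is left over\n";
```

This only confirms that the substrings cover the input; it says nothing about whether a lower-scoring set exists.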