
Without addressing parallelization or running any benchmarks, here's a naive implementation. Hopefully, fuzzy matching implemented and optimized in C is fast. Rewriting this with String::Approx instead of re::engine::TRE (a subroutine call instead of Perl regexp engine overhead) might be faster. Revisiting fuzzy string matching was fun :). The solution below was more readable before I tried some (perceived, untested) optimizations: dropping blocks, reversing the loop, and interpolating a reference to the substr result, which I remember as being efficient (no idea if that holds in this case). Anyway, here it is, FWIW.

use strict;
use warnings;
use feature 'say';

use constant ERR => 1;  # maximum number of mismatches allowed
use re::engine::TRE
    max_cost => ERR,
    cost_ins => -1,     # no insertions
    cost_del => -1,     # no deletions
;

my $suffix_source = 'abbaba';
my $prefix_source = 'babbaaaa';

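my $max_len = 0;
# Try the longest suffix first and stop at the first approximate match.
# Lengths of ERR or less are skipped: they would match trivially.
$prefix_source =~ /^${ \substr $suffix_source, -$_ }/
    and $max_len = $_
    and last
for reverse ERR + 1 .. length $suffix_source;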

say substr( $suffix_source, -$max_len ),
    ' ',
    substr( $prefix_source, 0, $max_len )
if $max_len;

__END__

baba babb

Update: crude benchmarks (~200-character strings, ~10 errors allowed) reveal that, for this task, both modules I mentioned are hugely (tens of times) slower than the "classic" Perl implementation by tybalt89 :)
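
For comparison, here is a minimal pure-Perl sketch of a substitutions-only check with no regexp engine involved. It is not tybalt89's code, and the sub name longest_overlap is made up for illustration. It assumes byte strings (no characters above 0xFF) and counts mismatches by XOR-ing the two candidate substrings.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper (not from the thread): returns the length of the
# longest suffix of $suffix_source that matches a prefix of $prefix_source
# with at most $max_err substituted characters, or 0 if there is none.
sub longest_overlap {
    my ( $suffix_source, $prefix_source, $max_err ) = @_;

    # The overlap cannot be longer than the shorter of the two strings.
    my $limit = length $suffix_source;
    $limit = length $prefix_source if length $prefix_source < $limit;

    # Longest candidate first; lengths <= $max_err would match trivially.
    for my $len ( reverse $max_err + 1 .. $limit ) {
        # String XOR yields "\0" wherever the two substrings agree,
        # so counting the non-NUL bytes gives the number of mismatches.
        my $diff = substr( $suffix_source, -$len )
                 ^ substr( $prefix_source, 0, $len );
        my $mismatches = $diff =~ tr/\0//c;
        return $len if $mismatches <= $max_err;
    }
    return 0;
}

my $len = longest_overlap( 'abbaba', 'babbaaaa', 1 );
say substr( 'abbaba', -$len ), ' ', substr( 'babbaaaa', 0, $len ) if $len;
```

With the same inputs and one allowed error, this prints the same pair as the TRE version (baba babb). Like that version, it stops at the first (longest) length that fits the error budget.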

In reply to Re: Suffix-prefix matching done right by vr
in thread Suffix-prefix matching done right by baxy77bax

	PerlMonks