http://qs321.pair.com?node_id=486365


in reply to Re: Fast common substring matching
in thread Fast common substring matching

It's a special case, and I don't think it is a problem in practice. It only happens when a block longer than $subStrSize (the minimum match quantum) contains a repeated pattern. Test strings and results are shown below:

>string1
01010101010ddddddddddd01234566789a12345yy
>string2
0123456789b12345eeeeeeeeeeeex01010101010x
>string3
0123456789c12345ffffffffffff01010101010zz

000:001 L[ 11] ( 0 29)
000:002 L[ 11] ( 0 28)
001:002 L[ 10] ( 0 0)
Completed in 0.002126
Best match: >string1 - >string2. 11 characters starting at 0 and 29.
Best match: >string1 - >string3. 11 characters starting at 0 and 28.
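For anyone who wants to cross-check the listing above, here is a minimal brute-force sketch. It is not the code from this thread: longest_common_substring is a hypothetical helper using the textbook O(n*m) dynamic-programming approach, so it is only practical for short test strings like these.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original code): classic dynamic-programming
# longest common substring. Returns (length, offset in $s, offset in $t) of the
# first longest match found.
sub longest_common_substring {
    my ($s, $t) = @_;
    my @prev = (0) x (length($t) + 1);
    my ($best, $bs, $bt) = (0, 0, 0);
    for my $i (1 .. length($s)) {
        my @cur = (0) x (length($t) + 1);
        for my $j (1 .. length($t)) {
            next unless substr($s, $i - 1, 1) eq substr($t, $j - 1, 1);
            # Extend the match that ended at the previous character pair.
            $cur[$j] = $prev[$j - 1] + 1;
            if ($cur[$j] > $best) {
                ($best, $bs, $bt) = ($cur[$j], $i - $cur[$j], $j - $cur[$j]);
            }
        }
        @prev = @cur;
    }
    return ($best, $bs, $bt);
}

# The three test strings from the listing above.
my @strs = (
    '01010101010ddddddddddd01234566789a12345yy',
    '0123456789b12345eeeeeeeeeeeex01010101010x',
    '0123456789c12345ffffffffffff01010101010zz',
);

for my $a (0 .. $#strs - 1) {
    for my $b ($a + 1 .. $#strs) {
        my ($len, $pa, $pb) = longest_common_substring($strs[$a], $strs[$b]);
        printf "%03d:%03d L[%3d] (%d %d)\n", $a, $b, $len, $pa, $pb;
    }
}
```

The exhaustive search agrees with the listing for 000:001 and 000:002, but for 001:002 it finds the 11-character run 01010101010 at offsets 29 and 28 rather than the 10-character match at 0 and 0. As far as I can tell, that difference is the special case being described here.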

Perl is Huffman encoded by design.