PerlMonks  

Longest common substring

by japhy (Canon)
on Feb 15, 2002 at 03:26 UTC ( [id://145608]=CUFP )

Finds the longest substring shared by two strings.
sub longest_common_substr {
  # provided you know there are no NULs
  my $str = join "\0", @_;
  my $len = 1;
  my $match;
  while ($str =~ m{ ([^\0]{$len,}) (?= [^\0]* \0 [^\0]*? \1 ) }xg) {
    $len = length($match = $1) + 1;
  }
  return $match;
}
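
To see what a single pass of that pattern does, here is a minimal sketch that applies the same regex once with a fixed minimum length. The helper name first_shared_run and the sample words are assumptions made for illustration; they are not part of the original post.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the original post: return the leftmost
# run of at least $min_len non-NUL characters from the first string that
# also occurs in the second string, or undef if there is none.
sub first_shared_run {
    my ($min_len, @strings) = @_;
    my $joined = join "\0", @strings;   # NUL marks the boundary between inputs
    return $joined =~ m{
        ( [^\0]{$min_len,} )   # greedy candidate that never crosses the NUL
        (?= [^\0]* \0          # look ahead past the rest of the first string
            [^\0]*? \1 )       # and find the same text in the second string
    }x ? $1 : undef;
}

print first_shared_run(1, 'perlmonks', 'monastery'), "\n";  # er
print first_shared_run(3, 'perlmonks', 'monastery'), "\n";  # mon
```

Raising the minimum length from 1 to 3 rules out the shorter hit 'er' and lets the pattern find 'mon'; the posted sub does the same ratcheting automatically by setting $len to one more than the length of the last match.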

Replies are listed 'Best First'.
Re: Longest common substring
by blakem (Monsignor) on Feb 16, 2002 at 00:29 UTC
    At the risk of getting rebuked again while commenting on a japhy regex ;-P

    Overlapping matches seem to cause a problem... For example, the last four characters in 'abcabc' and 'caWcabc' match, yet the function only returns the last three.

    print longest_common_substr('abcabc','caWcabc'); # 'abc' not 'cabc'

    I think it might be as simple as removing the /g (but that's the part I don't fully comprehend...). The code forces each match to be longer than the previous one, with greedy matching helping us ratchet up several steps at a time. The /g might be doing something else tricky, but I don't see it.

    Update: Other word pairs that fail similarly are sense/tense and onion/union.

    -Blake
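
A minimal sketch of the suggestion above, assuming the only change is dropping the /g modifier. The name longest_common_substr_no_g is hypothetical, and this is not a fix confirmed by the original author. With /g, each pass resumes where the previous match ended, so a longer candidate that overlaps the previous match is skipped; without it, every pass rescans from the start of the joined string. The loop still terminates because $len grows on every successful match.

```perl
use strict;
use warnings;

# Hypothetical variant of the posted sub: identical except that the /g
# modifier is dropped, so every pass rescans from the start of the string.
sub longest_common_substr_no_g {
    # Still assumes the inputs contain no NUL characters.
    my $str = join "\0", @_;
    my $len = 1;
    my $match;
    while ($str =~ m{ ([^\0]{$len,}) (?= [^\0]* \0 [^\0]*? \1 ) }x) {
        # Require the next match to be strictly longer than this one.
        $len = length($match = $1) + 1;
    }
    return $match;
}

print longest_common_substr_no_g('abcabc', 'caWcabc'), "\n";  # cabc
print longest_common_substr_no_g('sense',  'tense'),   "\n";  # ense
print longest_common_substr_no_g('onion',  'union'),   "\n";  # nion
```

The trade-off is extra work: each pass restarts at offset 0 instead of continuing from the last match position.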
