PerlMonks  

Sorry. The code is flawed. It does produce all the common substrings, but it will often select the wrong "longest".

The problem occurs because if a substring occurs twice in one of the input strings, and not at all in one of the others, its count will be the same as if it had appeared once in each of them. The selection mechanism (the longest key whose count is equal to the number of input strings) is bogus, but sufficiently convincing that it worked for all 5 sets of test data I tried it on!
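To make the failure concrete, here is a minimal sketch of the flawed selection rule. It is written in Python purely for illustration and is not the Perl code from the earlier post; the function names and the sample inputs are hypothetical.

```python
from collections import Counter

def all_substrings(s):
    """Yield every non-empty substring of s, once per occurrence."""
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            yield s[i:j]

def flawed_longest_common(strings):
    """Tally every occurrence across all inputs, then pick the longest
    key whose tally equals the number of inputs (the flawed rule)."""
    counts = Counter()
    for s in strings:
        counts.update(all_substrings(s))
    candidates = [k for k, n in counts.items() if n == len(strings)]
    return max(candidates, key=len)

# Hypothetical inputs: "abc" occurs twice in the first string and never
# in the second, so its tally is 2, the same as the number of inputs.
inputs = ["abcXabc", "ab"]
print(flawed_longest_common(inputs))
```

With these inputs the rule picks "abc", which is not in the second string at all. The true longest common substring, "ab", has a tally of 3 (twice in the first string, once in the second), so it is not even a candidate.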

I'm trying to think of an efficient way of counting how many of the original strings each substring is found in, but the only one I've come up with so far would limit the number of input strings to 32. A couple of other ideas I tried worked, but carry enough overhead to make the method less interesting.
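One way to read the 32-string limit is a bitmask kept per substring, with one bit for each input string in a 32-bit integer; that reading is an assumption, since the post does not spell the idea out. The Python sketch below illustrates it with hypothetical names. Python integers are unbounded, so the 32-string cap does not apply here, and the overhead of enumerating every substring is not addressed.

```python
def all_substrings(s):
    """Yield every non-empty substring of s, once per occurrence."""
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            yield s[i:j]

def longest_common_by_bitmask(strings):
    """Record, per substring, WHICH inputs contain it (one bit each),
    rather than how many times it occurs overall."""
    seen = {}
    for index, s in enumerate(strings):
        bit = 1 << index
        for sub in all_substrings(s):
            # OR is idempotent, so repeats within one string add nothing.
            seen[sub] = seen.get(sub, 0) | bit
    full = (1 << len(strings)) - 1  # mask with every input's bit set
    common = [k for k, mask in seen.items() if mask == full]
    # Longest first; ties broken alphabetically so the result is stable.
    return max(common, key=lambda k: (len(k), k), default="")

print(longest_common_by_bitmask(["abcXabc", "ab"]))
```

Because a substring only qualifies when every bit is set, a substring that appears twice in one string and never in another can no longer pass for a common one.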

I'll keep looking at it, but maybe my "surprise at the simplicity and efficiency" was the red flag that should have told me that I was missing something! Still, nothing ventured, nothing gained.


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
In reply to Re: Re: Re: finding longest common substring (ALL common substrings) by BrowserUk
in thread finding longest common substring by revdiablo
