PerlMonks  

Re: Re: Re: Optimizing a string processing sub

by sauoq (Abbot)
on Jan 08, 2003 at 22:36 UTC ( [id://225379]=note )


in reply to Re: Re: Optimizing a string processing sub
in thread Optimizing a string processing sub

I suspect that if both words have 5 characters, a declaration that the words have 5 characters in common, with different words, may be unexpected.

The case where the words are equal is a special one in the original code and results in $words_equal being returned. Given that fact, I think that 5 would be expected.

-sauoq
"My two cents aren't worth a dime.";

Re: Re: Re: Re: Optimizing a string processing sub
by MarkM (Curate) on Jan 08, 2003 at 23:15 UTC

    I guess that all depends on what the goal of the scoring is (it hasn't been stated by the original poster yet... :-) ).

    I assume that the goal has something to do with determining how 'close' two words are to each other. "aabcc" and "abbbc" are "5 close" according to the algorithms described. "aabcc" and "abc" are also "5 close". Is this still expected?

      "aabcc" and "abc" are also "5 close". Is this still expected?

      Well, they are either 5-close or 3-close depending on what order you submit them to the original function. (I really do think a commutative version, like dragonchild's second one, is the desired behavior.)

      How would you expect to score "beard" and "bread"? How about an example where no two letters are in the same position, like "peach" and "cheap"? By my reasoning, both pairs would be 5-close even though they aren't equal.
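The scores quoted so far can be reproduced with a small sketch. This is not the original poster's sub, and it is not dragonchild's version either; `directional_score` and `symmetric_score` are hypothetical names, and the rule (count each character of the first word that occurs anywhere in the second) is only an assumption inferred from the numbers in this thread. Taking the smaller of the two directions is just one possible way to make the score commutative.

```perl
use strict;
use warnings;

# Hypothetical helper: count the characters of $from that occur
# anywhere in $to. Assumed behavior, inferred from the thread.
sub directional_score {
    my ($from, $to) = @_;
    my %in_to;
    $in_to{$_} = 1 for split //, $to;
    return scalar grep { $in_to{$_} } split //, $from;
}

# One possible commutative variant (an assumption): the smaller
# of the two directional scores.
sub symmetric_score {
    my ($x, $y) = @_;
    my $xy = directional_score($x, $y);
    my $yx = directional_score($y, $x);
    return $xy < $yx ? $xy : $yx;
}

printf "%d\n", directional_score("aabcc", "abbbc");  # 5
printf "%d\n", directional_score("aabcc", "abc");    # 5
printf "%d\n", directional_score("abc", "aabcc");    # 3, so order matters
printf "%d\n", directional_score("beard", "bread");  # 5
printf "%d\n", directional_score("peach", "cheap");  # 5
printf "%d\n", symmetric_score("aabcc", "abc");      # 3 either way round
```

With this rule, position is ignored entirely, which is why anagrams such as "peach" and "cheap" score the full length.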

      After looking at your example again, "aabcc" and "abbbc", I suspect you think those should be considered 3-close (based on the number of unique chars rather than length). I suppose that has some merit, but it would result in any combination of "case", "cases", "cease", and "ceases" being 4-close. I think it makes sense that "case" and "cease" would be 4-close but "cases" and "cease" would be 5-close.
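For comparison, here is a minimal sketch of the unique-character scoring described above. The name `unique_common` is hypothetical, and the rule (count the distinct characters that appear in both words) is only my reading of that suggestion, not code from anyone in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: number of distinct characters shared by
# both words, ignoring how often each one occurs.
sub unique_common {
    my ($x, $y) = @_;
    my %in_x;
    $in_x{$_} = 1 for split //, $x;
    my %common;
    $common{$_} = 1 for grep { $in_x{$_} } split //, $y;
    return scalar keys %common;
}

printf "%d\n", unique_common("aabcc", "abbbc");  # 3
printf "%d\n", unique_common("case", "cease");   # 4
printf "%d\n", unique_common("cases", "cease");  # 4
printf "%d\n", unique_common("case", "ceases");  # 4
```

Because repeated letters are collapsed, every pairing of "case", "cases", "cease", and "ceases" comes out the same, which is the loss of distinction noted above.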

      Like you say, we can't really know without some input from the original poster... sigh.

      -sauoq
      "My two cents aren't worth a dime.";
      
