Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl Monk, Perl Meditation
 
PerlMonks  

Re: Re: Detecting transpositions

by BrowserUk (Patriarch)
on Aug 06, 2003 at 08:20 UTC ( [id://281304]=note: print w/replies, xml ) Need Help??


in reply to Re: Detecting transpositions
in thread Detecting transpositions

I hadn't actually considered Algorithm::Diff as I had assumed it to be line oriented. A brief scan of the docs show that the basic detection method employed is Longest Common Sequence, which isn't really applicable to my problem.

I've also looked at String::Approx, Text::Levenshtien and Text::Soundex, but again, these aren't geared to detecting a single transposition. They are also incredibly slow if you are trying to do this on thousands of strings.

I've also coded a simple looping comparison, and played with XOR, but I didn't want to influence the ideas. I have a vague memory of there being a clever way of doing this, but searching the web didn't elicite anything from the vague memory I have.

So, I thought I'd ask:)


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
If I understand your problem, I can solve it! Of course, the same can be said for you.

Replies are listed 'Best First'.
Re: Re: Re: Detecting transpositions
by dree (Monsignor) on Aug 06, 2003 at 15:58 UTC
    You can try Text::LevenshteinXS for the need for speed.

    You can also try Text::WagnerFischer and Text::Brew to play with operation's weights/costs to detect letters swap.

    There are also Text::PhraseDistance that is suited for phrases but that can benefit from a custom distance. The 0.01 version has also a dinamic technique (slow), instead the 0.02 solves the issue for the marriage problem.

    </advertising>

      It would be hard to beat Abigail-IIs implementation of comp for performance, and I don't need the full power of Levenshtien for this, but I''l be sure to grab the XS version next time I need it.

      Thanks for the pointers.


      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
      If I understand your problem, I can solve it! Of course, the same can be said for you.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://281304]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others admiring the Monastery: (9)
As of 2024-04-16 09:56 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found