http://qs321.pair.com?node_id=289520


in reply to Match similar text

shadox,
Take a look at this node posted yesterday, though Text::Levenshtein is usually the standard answer.

I would do something like the following:

  • Set a maximum distance threshold; if even the closest match exceeds it, set the input aside for human review
  • Iterate over each state, calculate the edit distance to the input, and select the state with the shortest distance
  • Also set aside for human review any input whose two closest states are nearly tied, perhaps differing by a distance of only 1
  • Log every change until you feel confident/comfortable it is doing the right thing
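
Here is a rough sketch of those steps, not a finished solution. The sub names (edit_distance, match_state), the sample state list, and the threshold values are all made up for illustration. The distance is a hand-rolled, case-insensitive Levenshtein so the snippet runs with core modules only; the distance function exported by Text::Levenshtein should be a drop-in alternative if you have it installed.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Plain dynamic-programming Levenshtein distance (case-insensitive).
sub edit_distance {
    my ($s, $t) = map { lc $_ } @_;
    my @prev = (0 .. length $t);
    for my $i (1 .. length $s) {
        my @cur = ($i);
        for my $j (1 .. length $t) {
            my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
            push @cur, min($prev[$j] + 1,          # deletion
                           $cur[$j - 1] + 1,       # insertion
                           $prev[$j - 1] + $cost); # substitution
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Returns (state, status). Status is 'review' when the best match is
# further away than $max_dist, or when the runner-up is within $min_gap
# of the best match; otherwise it is 'ok'.
sub match_state {
    my ($input, $states, $max_dist, $min_gap) = @_;
    my @ranked = sort { $a->[1] <=> $b->[1] or $a->[0] cmp $b->[0] }
                 map  { [ $_, edit_distance($input, $_) ] } @$states;
    my ($best, $next) = @ranked;
    return (undef, 'review') unless $best;
    return ($best->[0], 'review') if $best->[1] > $max_dist;
    return ($best->[0], 'review')
        if $next && $next->[1] - $best->[1] <= $min_gap;
    return ($best->[0], 'ok');
}

# Hypothetical sample data; each line printed doubles as the change log.
my @states = qw(Texas Nevada Kansas Arkansas);
for my $input (qw(Kansa rkansas Zzzzzz)) {
    my ($state, $status) = match_state($input, \@states, 2, 1);
    printf "%-8s => %-8s [%s]\n",
        $input, (defined $state ? $state : '(none)'), $status;
}
```

With that sample list, "Kansa" resolves cleanly to Kansas, while "rkansas" gets flagged because it is one edit away from both Kansas and Arkansas. The threshold of 2 and gap of 1 are guesses; tune them against your own data while you watch the log.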

    Cheers - L~R