PerlMonks
Re: String Comparison & Equivalence Challenge (tf-idf)
by LanX (Saint) on Mar 14, 2021 at 07:34 UTC
Good morning :)

> All of this depends on being able, first and foremost, to measure the equivalence of two different strings.

I think you want to take a look at tf-idf (term frequency-inverse document frequency) in combination with a stemmer.

You might also want to rank partial word groups, such as sub-phrases, to take word order into account.

This should give you a start. HTH :)
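To make the suggestion concrete, here is a minimal sketch of the idea, written in Python with only the standard library (the same steps port directly to Perl). Everything in it is an assumption for illustration, not a reference implementation: `crude_stem` is a hypothetical stand-in for a real stemmer such as Porter's, the sub-phrases are simply adjacent word pairs (bigrams), the idf formula is one common smoothed variant, and the sample strings are made up.

```python
import math
import re
from collections import Counter


def crude_stem(word):
    # Hypothetical stand-in for a real stemmer: strips a few common suffixes,
    # but only if at least three characters remain.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def terms(text):
    # Lowercase, split into words, stem each word, then append adjacent
    # word pairs (bigrams) so that word order contributes to the score.
    words = [crude_stem(w) for w in re.findall(r"[a-z0-9]+", text.lower())]
    bigrams = [a + " " + b for a, b in zip(words, words[1:])]
    return words + bigrams


def tfidf_vectors(docs):
    # One sparse vector (dict: term -> weight) per document.
    counts = [Counter(terms(d)) for d in docs]
    n = len(docs)
    # Document frequency: iterating a Counter yields each distinct term once.
    df = Counter(t for c in counts for t in c)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append({
            # tf = relative frequency; idf = smoothed log ratio (assumed variant).
            t: (f / total) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, f in c.items()
        })
    return vectors


def cosine(u, v):
    # Cosine similarity of two sparse vectors; 0.0 if either is empty.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up sample strings, purely for illustration.
docs = [
    "The cat is chasing the mice",
    "A cat chased a mouse",
    "Stock prices fell sharply today",
]
vectors = tfidf_vectors(docs)
print(cosine(vectors[0], vectors[1]))  # shares the stems "cat" and "chas"
print(cosine(vectors[0], vectors[2]))  # shares no terms at all
```

The stemmer is what lets "chasing" and "chased" count as the same term; without it the first two strings would only share "cat". A crude suffix stripper like this one will not map "mice" to "mouse", which is one reason to reach for a proper stemmer or lemmatizer in practice.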
Cheers Rolf
Update: See also Re^5: String Comparison & Equivalence Challenge (tf-idf)