PerlMonks  

Re^2: Comparing Lines within a Word List

by AnomalousMonk (Archbishop)
on Apr 26, 2016 at 23:02 UTC


in reply to Re: Comparing Lines within a Word List
in thread Comparing Lines within a Word List

Actually, bitwise-xor on strings and  tr/// (update: see Quote-Like Operators in perlop) go together quite nicely for something like this:

c:\@Work\Perl\monks>perl -wMstrict -le
"use Data::Dump qw(pp);
 ;;
 for my $word (qw(Fool Foot Tool Toot Foal)) {
     my $diff = 'Fool' ^ $word;
     print qq{'$word': }, pp $diff;
     print qq{'Fool' and '$word' differ by 1 char}
         if 1 == $diff =~ tr/\x00//c;
 }
 "
'Fool': "\0\0\0\0"
'Foot': "\0\0\0\30"
'Fool' and 'Foot' differ by 1 char
'Tool': "\22\0\0\0"
'Fool' and 'Tool' differ by 1 char
'Toot': "\22\0\0\30"
'Foal': "\0\0\16\0"
'Fool' and 'Foal' differ by 1 char

Update: Changed example code to use  tr/\x00//c (/c modifier: complement the search list).
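
For use in a script rather than a one-liner, the same idea might be wrapped up roughly as follows. This is only a sketch: the name diff_count is made up for illustration, and the length check is an added assumption (string xor on operands of unequal length acts as if the shorter one were padded with NUL bytes, which would miscount). It also assumes plain single-byte characters.

```perl
use strict;
use warnings;

# Hypothetical helper: number of positions at which two equal-length
# strings differ; returns nothing (undef in scalar context) if the
# lengths differ.
sub diff_count {
    my ($x, $y) = @_;
    return if length($x) != length($y);
    my $diff = $x ^ $y;              # NUL byte wherever the characters agree
    return $diff =~ tr/\x00//c;      # count the non-NUL characters
}

for my $word (qw(Fool Foot Tool Toot Foal)) {
    my $n = diff_count('Fool', $word);
    print "'Fool' and '$word' differ by $n char(s)\n";
}
```

With an empty replacement list and no /d modifier, tr/// leaves the string unchanged and just returns the count.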


Give a man a fish:  <%-{-{-{-<

Replies are listed 'Best First'.
Re^3: Comparing Lines within a Word List
by dominick_t (Acolyte) on Apr 27, 2016 at 03:42 UTC
    Thank you both for the replies! I hope everyone in the thread can see this, and not just the author of the note on which I hit the reply button.

    Okay, so if I'm getting this right, it looks like in this example, you're taking the word 'fool' and comparing its characters to each of the five words in the array, and since 'fool' matches itself exactly, the return on that one is all zeros. Any place there is not a zero is a place where the words differ. (I'm not immediately sure why the "difference" between the character 'l' and 't' would be 30 but I'm sure it's easily explained.)

    So I see how this works in principle, to compare two given words and look for word pairings that yield a one-character difference. But then how might I use this to solve the problem that I have, which is to find -- from let's say a massive dictionary of English language words -- all pairs of words that are the same except for one letter, and in particular, for that character difference to be that one has an R while the other has an S? Again, many thanks.
      I hope everyone in the thread can see this, and not just the author of the note on which I hit the reply button.

      They can.

      I'm not immediately sure why the "difference" between the character 'l' and 't' would be 30 ...

      You're seeing the octal values resulting from the character-by-character bitwise-xor of two strings. So

      c:\@Work\Perl\monks>perl -wMstrict -le
      "printf qq{%#02o \n}, ord 'l';
       printf qq{%#02o \n}, ord 't';
       printf qq{%#02o \n}, 0154 ^ 0164;
      "
      0154
      0164
      030

      ... the problem ... [find] from ... a massive dictionary of English language words -- all pairs of words that are the same except for one letter, and in particular, for that character difference to be that one has an R while the other has an S [in the same character position] ...
      [please note the emphasized addition]

      As to this much larger problem as restated (please confirm this clarification, or may the differing characters be in any position? Update: e.g., is 'aSaa' a "match" for 'aaRa'?): it's an interesting one, but I've no time right now to go into it in detail.

      Update: Actually, the  '02' in the  '%#02o' format specifier used in the printfs above is unnecessary, although it does no harm. The same result (and the result I wanted) can be had with  '%#o' instead.
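
To tie the two views together, here is a small sketch (not part of the original reply) that checks that the single character produced by the string xor of 'l' and 't' is the same value as the numeric xor of their character codes, printed with the '%#o' format. The variable name $x is arbitrary.

```perl
use strict;
use warnings;

my $x = 'l' ^ 't';                   # string xor: a one-character string
printf "%#o\n", ord $x;              # prints 030

# The same value via numeric xor of the two character codes.
printf "%#o\n", ord('l') ^ ord('t'); # prints 030

# The character itself is the one Data::Dump showed as "\30".
print "match\n" if $x eq "\030";
```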


      Give a man a fish:  <%-{-{-{-<

        To clarify: 'aSaa' is NOT a match for 'aaRa'.

        To be a match, two words must be identical position-by-position, except in one position, and in that particular position, one word has an R while the other has an S.

        Thanks also for the explanation regarding the octal value of the characters.
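
The matching rule stated above (identical position by position, except for one position where one word has an R and the other an S) might be sketched as follows. Nothing here comes from the thread itself: the names is_rs_pair and rs_pairs are hypothetical, and the sketch assumes plain ASCII words compared case-insensitively. For a large word list, rs_pairs does not xor every pair; it swaps each 'r' for an 's' in turn and looks the result up in a hash, which is an alternative technique.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the two words have the same length, differ
# in exactly one position, and hold 'r' and 's' at that position.
sub is_rs_pair {
    my ($x, $y) = map { lc } @_;
    return 0 if length($x) != length($y);
    my $diff = $x ^ $y;                         # NUL wherever chars agree
    return 0 unless 1 == $diff =~ tr/\x00//c;   # exactly one difference
    $diff =~ /[^\x00]/;                         # locate it; $-[0] is its offset
    my $pos = $-[0];
    my $pair = substr($x, $pos, 1) . substr($y, $pos, 1);
    return $pair eq 'rs' || $pair eq 'sr' ? 1 : 0;
}

# Hypothetical helper: all [r-word, s-word] pairs in a word list, found by
# replacing one 'r' at a time with 's' and checking a lookup hash.
sub rs_pairs {
    my @words = @_;
    my %seen;
    $seen{ lc $_ } = 1 for @words;
    my @pairs;
    for my $word (sort keys %seen) {
        my $pos = -1;
        while (($pos = index($word, 'r', $pos + 1)) >= 0) {
            my $other = $word;
            substr($other, $pos, 1, 's');       # swap this one 'r' for 's'
            push @pairs, [$word, $other] if $seen{$other};
        }
    }
    return @pairs;
}

# Made-up sample words, for illustration only.
print "$_->[0] / $_->[1]\n"
    for rs_pairs(qw(cart cast bare base foo rear sear seas));
```

The hash lookup keeps the work roughly proportional to the number of 'r' characters in the list rather than to the number of word pairs, which matters for a massive dictionary.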

Re^3: Comparing Lines within a Word List
by Eily (Monsignor) on Apr 27, 2016 at 18:51 UTC

    I always forget about using tr/// for counting, thanks for the reminder :)
