Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

Re: Re: Detecting transpositions

by sgifford (Prior)
on Aug 06, 2003 at 08:57 UTC ( [id://281316]=note: print w/replies, xml ) Need Help??


in reply to Re: Detecting transpositions
in thread Detecting transpositions

That's cool, but it just looks for two consecutive differences, not for transposed characters. It gets confused comparing the strings "ab" and "bc", for example---it thinks they're transposed at character 0. That's probably fixable, but I'll have to think for a minute before I have a suggestion.

Replies are listed 'Best First'.
Re: Detecting transpositions
by Abigail-II (Bishop) on Aug 06, 2003 at 09:10 UTC
    The examples suggested that both strings were anagrams of each other. However, here's a fix:
    sub comp { my ($f, $s) = @_; return -1 if length ($f) != length ($s); local $_ = $f ^ $s; return -1 unless tr/\x00/1/c == 2; my $r = index $_, "11"; (substr ($f, $r, 2) eq reverse substr ($s, $r, 2)) ? $r : -1; }

    Abigail

      That does it. That's just a really cool solution. I didn't know xor worked on strings like that.
      Or even shorter... Albeit about 3 times slower on long strings and 30% faster for short...
      sub comp { local $_ = $_[0] ^ $_[1]; return s/([^\x00])\1//g==1 && !tr/\x00/1/c; }


      T I M T O W T D I
        That doesn't return the index of where the transposition happens, does it?

        Abigail

      Very cool. Can you explain why it works? What makes the XOR come out to be like that?
      A massive flamewar beneath your chosen depth has not been shown here
        Well, if you do an XOR of two strings (scalars that haven't been used as numbers), perl will do a bitwise XOR on the characters of the strings, instead of the bits on the numbers. Two characters that are the same will have all the bits the same, XORing those characters will give you the character of which all the bits are 0, aka "\x00". If two characters are not the same, there is at least one bitpositions where the bits are different - resulting in a 1 bit when XORing. Hence, the resulting character will be anything but "\x00".

        Abigail

        It works because the xor of two bits is zero if and only if the two bits are the same:
        0 ^ 0 = 0
        1 ^ 1 = 0
        0 ^ 1 = 1
        1 ^ 0 = 1
        
        So the xor of two characters (which are a string of 8 bits) will be 0 (8 0-bits) if and only if the two characters are the same.

        Using xor on strings xors the characters from the first string with the characters in the second string, so in the places where the characters are identical the resulting character will be zero, and in all places where they differ it will be nonzero.

      Hey! Thats really elegant and instructive Abigail-II - but please forgive a stupid question...
      Would it be possible for the product of $f ^ $s to contain a "1" and hence pollute your $_ string?


      "To be admired must be the constant aim of ambition."
      - Johnson
        Would it be possible for the product of $f ^ $s to contain a "1" and hence pollute your $_ string?

        Yes, but no. It could certainly be possible that the XORred string contains a character "1". But this is ok, the tr/\x00/1/c turns anything that is not "\x00" into a "1" anyway - after the tr, the string will only contain "\x00" and "1" characters.

        Abigail

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://281316]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others taking refuge in the Monastery: (5)
As of 2024-04-18 00:04 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found