Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much
 
PerlMonks  

Fun with words (Golf?)

by McDarren (Abbot)
on Nov 09, 2005 at 07:48 UTC ( [id://506997]=perlmeditation: print w/replies, xml ) Need Help??

Howdy :)

Update:Approximately 30 seconds after posting this (arg!), I found this link which talks about this phenomena.

(Apologies in advance if this has been covered previously)

Some people may be aware of the phenomena whereby if you take a phrase and rearrange the letters in each word - but keep the first and last letters in place - then the phrase as a whole remains readable. (There is probably a name for this, but I don't know what it is).

As a novice Perl programmer, I thought it would be a fun exercise to see how I could take a series of phrases and "scramble" them in this way.

So this is my attempt at it:
#!/usr/bin/perl -w use strict; use List::Util qw(shuffle); while (<DATA>) { my @words = split; WORD: for my $word (@words) { my $offset = $word =~ /[\.\,]$/ ? 2 : 1; my @letters = split //, $word; my $numletters = @letters; next WORD if $numletters < 4; $word = join("", $letters[0], shuffle(@letters[1 .. $#letters-$offset]), @letters[-$offset .. -1] ); } print "@words\n"; } __DATA__ Down that path lies madness. On the other hand, the road to hell is pa +ved with melting snowballs. Although the Perl Slogan is There's More Than One Way to Do It, I hesi +tate to make 10 ways to do something. And don't tell me there isn't one bit of difference between null and s +pace, because that's exactly how much difference there is. Randal said it would be tough to do in sed. He didn't say he didn't un +derstand sed. Randal understands sed quite well. Which is why he uses + Perl. As usual, I'm overstating the case to knock a few neurons loose, but t +he truth is usually somewhere in the muddle, uh, middle. Of course, I reserve the right to make wholly stupid changes to Perl i +f I think they improve the language.
Of course, the output is different on every run - but here is an example:
Dwon that ptah leis msenads. On the oethr hand, the raod to hell is pv +aed with mlinetg sblwlonas. Agohutlh the Perl Sgalon is Treeh's More Tahn One Way to Do It, I hasi +tete to mkae 10 wyas to do snmoiehtg. And dn'ot tell me there in'st one bit of diferefcne beweetn nlul and s +cpae, becasue ta'hts elxctay how much drcinefefe tehre is. Rdanal said it would be tough to do in sed. He dd'nit say he dd'int un +datenrsd sed. Ranadl uenadnsdrts sed qitue wlel. Wchih is why he uess + Prel. As uausl, I'm osvtneatirg the case to kocnk a few nurnoes lsooe, but t +he ttruh is uualsly semwerohe in the mdudle, uh, mldide. Of cusore, I reevsre the rhgit to mkae wohlly sptiud cgeanhs to Perl i +f I thnik they ivopmre the lgagunae.

For those that are interested, I'd be really keen to see how well this can be "golfed". The rules are simple:
  1. Take a single line of text (a phrase - in this case, some Larry Wall quotes) as input.
  2. Split the phrase into words (on whitespace).
  3. Skip any words containing less than 4 letters.
  4. Shuffle the letters in each word, keeping the first and last in place.
  5. If a "word" ends with a comma/period, keep the comma/period in place as well as the preceeding letter.
  6. Reconstruct the phrase with the shuffled words and print it.

Cheers,
--Darren :)

Replies are listed 'Best First'.
Re: Fun with words (Golf?)
by BrowserUk (Patriarch) on Nov 09, 2005 at 15:37 UTC

    75

    s[\B([^\s]*)\B(?=\w)][join'',sort{.5>rand}split'',$1]ge,print while<DA +TA>;

    72

    s[\B(\S*)\B(?=\w)][join'',sort{.5>rand}split'',$1]ge,print while<DATA> +;

    66

    s[\B(\w+)\B][join'',sort{.5>rand}split'',$1]ge,print while<DATA>;
    P:\test>junk Down taht path leis madenss. On the ohter hnad, the road to hlel is pe +avd with mnilteg slwalbnos. Augltohh the Prel Sloagn is Tehre's More Than One Way to Do It, I htsi +etae to mkae 10 wyas to do setmonihg. And don't tlel me there isn't one bit of drecfefine between null and s +pace, bsauece taht's etcaxly how much dreneciffe there is. Rdnaal siad it wluod be tguoh to do in sed. He didn't say he didn't ud +nerasntd sed. Rnaadl udnertsands sed qtuie well. Wchih is why he uess + Perl. As usual, I'm oitasrvetng the case to kcnok a few nnrueos loose, but t +he truth is ullasuy sreeomwhe in the mudlde, uh, mlddie. Of crouse, I rseerve the right to mkae wohlly siuptd changes to Prel i +f I tnhik they ivrompe the lgnuagae.

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      ++ for realizing that a fair shuffle isn't necessary. You can trim a few chars by getting rid of the rand call entirely, and using non-paired delimiters, $&, and an alternative to split:
      s!\B\w+\B!join'',sort{1}$&=~/./g!eg

      Caution: Contents may have been coded under pressure.

        Come to that, why bother with the comparison block at all :)

        s!\B\w+\B!join'',sort$&=~/./g!eg P:\test>junk Down taht path leis madenss. On the oehtr hand, the raod to hell is pa +evd with meilntg sabllnows. Aghlotuh the Perl Saglon is Tehre's More Tahn One Way to Do It, I haei +stte to make 10 ways to do sehimnotg. And don't tell me tehre isn't one bit of dceeffinre beeetwn nlul and s +acpe, bacesue taht's eacltxy how mcuh dceeffinre tehre is. Raadnl said it wloud be tgouh to do in sed. He ddin't say he ddin't ua +dennrstd sed. Raadnl uaddennrsts sed qitue well. Wchih is why he uess + Perl. As uasul, I'm oaeinrsttvg the case to kcnok a few nenorus loose, but t +he trtuh is uallsuy seehmorwe in the mddlue, uh, mddile. Of corsue, I reersve the rghit to make whlloy siptud caeghns to Perl i +f I think tehy imoprve the laaggnue.

        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
      66
      s[\B(\w+)\B][join'',sort{.5>rand}split'',$1]ge,print while<DATA>;
      Some common golfing micro-"optimizations":
      s|\B\w+\B|join'',sort{.5>rand}$&=~/./g|ge,print for<DATA>
Re: Fun with words (Golf?)
by hossman (Prior) on Nov 09, 2005 at 08:16 UTC
    1. There's no reason to explicitly skip words less then 4 characters. if it's less then four, shuffling the "middle letters" will either be a shuffle of one, or zero characters.
    2. This doesn't solve the problem exactly as you described (your requirements about comma,period are pretty specific, and don't cover other punctuation), but it's pretty close...
      #!perl -p use List::Util 'shuffle'; $_=~s{(?<=\b\w)(\w+)(?=\w\b)}{join'',shuffle split//,$1}xeg;
      It's by no means a good Golf ... but it's a good starting point.
    I'm not sure why i posted it in that format, this makes more sense if we're talking about golf...
    perl -MList::Util=shuffle -pe '$_=~s{(?<=\b\w)(\w+)(?=\w\b)}{join"",sh +uffle split//,$1}xeg;'
      You have more rigor in your regex than is needed.
      perl -MList::Util=shuffle -pe 's:\B\w+\B:join"",shuffle split//,$&:ge'

      Caution: Contents may have been coded under pressure.

        Look Ma, no modules!

        perl -pe's!\B\w+\B!@s=split//,$&;my$s;$s.=splice@s,rand@s,1while@s;$s! +eg'
        Three chars for List::Util is a good trade, I think... :-)

        -sauoq
        "My two cents aren't worth a dime.";
        

        ahh, \B ... I can't remember the last time i had a reason to use that ... nice catch.

Re: Fun with words (Golf?)
by GrandFather (Saint) on Nov 09, 2005 at 09:08 UTC

    The following uses pretty much the original code, but dispenses with the naughty use of a module:

    while(<DATA>){@x=split;for(@x){$o=/[.,]$/?2:1;$m=substr$_,0,1,''; $e=substr$_,-$o,$o,'';$m.=substr$_,rand(length),1,''while$_;$_="$m$e"} print"@x\n"}

    Update: remove a few more chars, and some more (Thanks Bart), and some more (ditto)


    Perl is Huffman encoded by design.
      using s/// to detect the punctuation, and it can be shorter still:
      while(<DATA>){for(@x=split){s/([.,]?)$//;$s=substr$_,0,1,''; $e=chop;$s.=substr$_,rand(length),1,''while$_;$_="$s$e$1"} print"@x\n"}
Re: Fun with words (Golf?)
by g0n (Priest) on Nov 09, 2005 at 11:59 UTC
    Never tried golfing before, it's fun! This looked like something that called for a regular expression. I've used reverse here to perform a pseudo shuffle:

    Update: took out reverse in favour of a random shuffle

    while (<DATA>) { $_=~s/([.,\s]*?)(\w{1})(\w+)(\w{1})([,.\s]*?)(?{$j=sub{$x=shift;for(1 +..50){$n=int(rand(length($x)-1));$x=reverse(substr($x,0,$n)).substr($ +x,$n);}return $x };$x=$j->($3)})/$1$2$x$4$5/g; print $_; } __DATA__ Down that path lies madness. On the other hand, the road to hell is pa +ved with melting snowballs. Although the Perl Slogan is There's More Than One Way to Do It, I hesi +tate to make 10 ways to do something. And don't tell me there isn't one bit of difference between null and s +pace, because that's exactly how much difference there is. Randal said it would be tough to do in sed. He didn't say he didn't un +derstand sed. Randal understands sed quite well. Which is why he uses + Perl. As usual, I'm overstating the case to knock a few neurons loose, but t +he truth is usually somewhere in the muddle, uh, middle. Of course, I reserve the right to make wholly stupid changes to Perl i +f I think they improve the language.

    Update: Slightly OT for perl, but IIRC, this phenomenon is theorised to derive from 'orthographic word recognition', recognising the entire word by it's overall shape. As long as the outline of the word is roughly the same, the brain can make a pretty good stab at recognising it. It will probably also be helped by a priming effect, whereby the words most likely to follow a given word will be more 'activated'. There's some research (somewhere that is, I'm trying to track it down) suggesting that the more different the outline is from the original (especially through the placement of rising and falling letters like p and l) the more difficult it will be to recognise.

    --------------------------------------------------------------

    "If there is such a phenomenon as absolute evil, it consists in treating another human being as a thing."

    John Brunner, "The Shockwave Rider".

Re: Fun with words (Golf?)
by Roger (Parson) on Nov 09, 2005 at 08:27 UTC
    Here's my first entry...
    local $"; while (<DATA>) { s/(\w{4,})/@l=split'',$1;@m=@l[1..$#l-1];@r=map{splice@m,rand @m,1 +}@m;"$l[0]@r$l[-1]"/ge; print; } __DATA__ Down that path lies madness. On the other hand, the road to hell is pa +ved with melting snowballs. Although the Perl Slogan is There's More Than One Way to Do It, I hesi +tate to make 10 ways to do something. And don't tell me there isn't one bit of difference between null and s +pace, because that's exactly how much difference there is. Randal said it would be tough to do in sed. He didn't say he didn't un +derstand sed. Randal understands sed quite well. Which is why he uses + Perl. As usual, I'm overstating the case to knock a few neurons loose, but t +he truth is usually somewhere in the muddle, uh, middle. Of course, I reserve the right to make wholly stupid changes to Perl i +f I think they improve the language.
Re: Fun with words (Golf?)
by japhy (Canon) on Nov 09, 2005 at 14:47 UTC
    In regards to the link that itub posted, one my sisters pointed out that the email that circulated was contradictory. The text of the message is:
    Aoccdrnig to rscheearch at Cmabrigde uinervtisy, it deosn't mttaer waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteres are at the rghit pclae. The rset can be a tatol mses and you can sitll raed it wouthit a porbelm. Tihs is bcuseae we do not raed ervey lteter by it slef but the wrod as a wlohe.
    The two words in bold have no unjumbled English equivalent, and yet we read them as "research" (only one 'h') and "important" (and 'a' instead of an 'e') respectively.

    That aside, it's interesting to see why this phenomenon seems to exist. I ran the following program against my /usr/dict/words file:

    #!/usr/bin/perl use strict; use warnings; my %words; while (<>) { chomp($_ = lc); for (/\b\w+(?:'\w+)?\b/g) { next unless length > 3; my $order = join "", sort split //; $words{substr($_, 0, 1) . substr($_, -1, 1)}{$order}{$_} = 1; } } for my $k (sort keys %words) { my @multi = grep keys %{ $words{$k}{$_} } > 1, keys %{ $words{$k} } +or next; print "$k\n"; for my $o (sort @multi) { my $w = $words{$k}{$o}; print " $o (" . keys(%$w) . ") @{[ keys %$w ]}\n"; } }
    I get back that there are 1058 pairs of words which are anagrams and that have their first and last letters in common. There are only 52 triplets of such words, and only 1 quadruplet. This is out of 95226 words with 4 or more characters in my /usr/dict/words file.

    I ran this on my local copy of the King James bible (doesn't everyone have one of those?) and came up with only 60 pairs of words (and no triplets, quadruplets, etc.).


    Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
    How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart
Re: Fun with words (Golf?)
by planetscape (Chancellor) on Aug 20, 2006 at 12:56 UTC

    Actually, this has appeared on PM before:

    Txet Maglning Glof, Ayobndy?

    (And I submit it now to help future searchers... and because it's cool ;-) )

    HTH,

    planetscape
Re: Fun with words (Golf?)
by itub (Priest) on Nov 09, 2005 at 13:27 UTC
    I tried to read the scrambled versions first, and got stuck on "mlinetg sblwlonas". So it's not always 100% readable to everyone. ;-)

    Another relevant link is http://www.snopes.com/language/apocryph/cambridge.asp , which includes an example of an email that people forwarded compulsively a couple of years ago. I don't know how many times I got it...

Re: Fun with words (Golf?)
by swampyankee (Parson) on Nov 09, 2005 at 14:59 UTC

    Being almost entirely monolingual (a Bad Thing,imho) and Anglophonic, I wonder if the same thing happens in other languages written in something resembling an alphabet (e.g., not Chinese). Darren's code should work as well on data in, say, German or French.

    emc

Re: Fun with words (Golf?)
by uksza (Canon) on Nov 09, 2005 at 08:15 UTC
    Hi,

    hehe, and this is my attemp.

    Uksza

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlmeditation [id://506997]
Approved by Roger
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others having an uproarious good time at the Monastery: (5)
As of 2024-04-25 10:31 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found