http://qs321.pair.com?node_id=761053

"To be competitive at Perl golf, you have to be a Perl expert, with years of language experience" -- Yorkey

"Rubbish! In TPR 3, Mickut of Finland was on the leaderboard two hours after he learnt Perl! And Perl expert Petdance, despite many years of Perl experience, finished more than 80 strokes behind the leaders in the Fonality golf challenge" -- me

-- Yorkey and me arguing during a lunchtime walk to Kirribilli

Were it not for this chance argument during a lunchtime walk with my workmate and good friend, Yorkey, I probably would have given up the Roman to Decimal golf game after a few weeks. After all, I was completely stuck at that time with my Perl solution and had no fresh ideas to try out. However, Yorkey's stubborn refusal to change his point of view provoked me into proving my point to him by playing this golf in languages I hardly knew, namely Ruby, Python and PHP.

That's not my earring! ... All right, what's going on?

-- My (non-hacker) wife after finding audreyt's earring sitting on our bedside table

I decided to start with Ruby because I at least had a passing familiarity with that language after audreyt stayed for a few days at my home, during the heady days of the Pugs project, while my wife was out of town. During her stay, we went through the library section of the Pickaxe book together because Audrey felt it would make a great model for documenting the Perl 6 libraries. Unfortunately, the absent-minded Audrey left one of her earrings behind on our bedside table and I assumed it belonged to my wife, so just left it there. On her return, my wife saw the earring sitting on her bedside table and freaked. I had previously told my wife that Perl hacker "autrijus" was coming to stay for a few days while she was away. Luckily, when I hastily explained that autrijus had become audrey, my wife judged it unlikely that I would invent such a story and quickly calmed down. :-)

After a full month of play, the Perl leader was robin on 60 strokes, with Ruby languishing far behind on 73. So I naturally thought it "impossible" for Ruby to overtake Perl in this game -- and ludicrous to suggest that I might be able to beat my Perl solution in Ruby. My expectations were much lower than that; I simply wanted to be competitive in Ruby (anywhere in the 70s would be fine) so as to shut Yorkey up.

Taking the Lead in Ruby

Since I'd already worked out some basic magic formulae by that time, I naturally started converting these to work with Ruby. Unfortunately, in addition to mapping M -> 1000, D -> 500, C -> 100, L -> 50, X -> 10, V -> 5, I -> 1, I needed to further map the trailing newline to zero because I could find no short way of removing it in Ruby. In Perl, the trailing newline was easily removed via /./g. This extra newline mapping invalidated most of the Perl magic formulae I had previously found, so I had to adjust my magic formula searcher and start searching all over again.

This program might contain the fastest known existing implementation of full forward crypt

-- Ton Hospel sensationally wins a Perl golf tournament by writing the fastest C program

I won't bore you here with the gory details of my magic formula searcher, written in C, for speed. To get a feel for what these searching programs look like, take a look at my simple one, described in Golf: Magic Formula for Roman Numerals, or Ton's much more complex one.

My new and improved search program found a Ruby-friendly magic formula easily enough, and I was flabbergasted when my first Ruby approach, despite using the nine character each_byte method, was equal leader on 73 strokes!

p=t=0
$<.each_byte{|c|n=10**(238%c%19%4)>>-c%29%2;p<n&&t-=2*p;t+=p=n}
p t
As you can see, this is just a straightforward translation of the original algorithm I was using in Perl, albeit with a magic formula replacing the Perl regex-based lookup table. As I was quick to point out to Yorkey, I didn't need to be a Ruby expert to do this, just needed to know the core parts of the language and, more importantly, find a good algorithm.
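
For readers who don't speak golf, the algorithm can be spelled out in ungolfed form. What follows is a minimal Python 3 sketch, not the submitted code; the function names `value` and `roman_to_decimal` are made up for illustration. It applies the magic formula to every byte, including the trailing newline, and subtracts the previous value twice whenever it is smaller than the current one.

```python
def value(c):
    """Magic formula from the Ruby solution: byte code -> Roman digit value."""
    # 238%c%19%4 picks the power of ten; -c%29%2 halves it for D, L and V.
    # The trailing newline (byte 10) maps to 0.
    return 10 ** (238 % c % 19 % 4) >> (-c % 29 % 2)


def roman_to_decimal(line):
    """Sum the digit values, undoing any digit that precedes a larger one."""
    p = t = 0  # p: previous value, t: running total
    for c in line.encode("ascii"):
        n = value(c)
        if p < n:
            t -= 2 * p  # p was already added once, so subtract it twice
        t += n
        p = n
    return t


print(roman_to_decimal("MCMXCIV\n"))  # 1994
```

Because the newline maps to zero, the last step adds nothing and never triggers the subtraction.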

I started with each_byte only because I couldn't get the shorter getc function to work. For example, this attempt:

n=t=0
t+=n-n*(n<(n=10**(238%c%19%4)>>-c%29%2)?2:0)while c=getc
p t
failed to compile with "undefined local variable or method `c'". Following Eugene's advice of "Can't possibly work, try it anyway", I changed c to C (uppercase variables are constants in Ruby):
n=t=0
t+=n-n*(n<(n=10**(238%C%19%4)>>-C%29%2)?2:0)while C=getc
p t
and it worked! The screen was littered with "warning: already initialized constant C" messages (written to stderr), but these don't matter to codegolf, which only cares about what is written to stdout. Combining this with the well-known Ruby golfing trick of replacing the three-char 238 with the two-char ?ascii-char-with-ord-of-238 shortened my solution to 65. As you might expect, I felt elated at leading the Ruby experts by eight strokes! And, more importantly, at forcing Yorkey to eat his words.

Choking on my breakfast cereal

Complacency is dangerous in golf and I had become complacent. If anyone had told me at this time that I could reduce my Ruby solution from 65 strokes all the way down to 53, I would have declared them insane.

After basking for months in my newly acquired Ruby fame, I almost choked on my breakfast cereal when I checked the codegolf leaderboard one morning and noticed that Python golfing god, Mark Byers, had posted a 59 stroke Ruby solution. This was intolerable! Back to work.

After experimenting some more with Ruby's evaluation order, I came up with a weird spaceship operator 60 stroke solution:

n=t=0;t+=n.*1+n<=>n=10**(238%C%19%4)>>-C%29%2while C=getc;p t
I've left the 238 above for readability, but my submitted solution naturally used the ?ascii-char-with-ord-of-238 dirty trick mentioned earlier. This solution introduces another dirty Ruby golfing trick, namely using a .* "method call" for "multiplies" rather than *(...), thus saving a stroke by eliminating the parens. You can try this trick routinely when golfing in Ruby whenever you need to change operator precedence -- though it doesn't always work, Ruby's parsing being pretty quirky, in my experience. By the way, it was this Ruby solution that inspired my weird 62 stroke Perl spaceship operator solution, mentioned in the previous article, an example of transferring ideas from one language to another. Often the hard part in golf is generating new ideas to try, and using multiple languages is a fertile source of fresh ideas.
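
One way to read the spaceship solution is sketched below in Python 3. This is an interpretation of the one-liner, assuming Ruby evaluates the receiver `n` and the `1+n` operand before the assignment to `n` on the right; the helper `cmp` is a hypothetical stand-in for `<=>`. Under that reading, each step adds the previous value, or subtracts it if the new value is larger, so the trailing newline (which maps to zero) is what finally flushes the last digit.

```python
def cmp(a, b):
    """Stand-in for Ruby's <=>: returns -1, 0 or 1."""
    return (a > b) - (a < b)


def value(c):
    # Same magic formula as the Ruby code; the newline (byte 10) maps to 0.
    return 10 ** (238 % c % 19 % 4) >> (-c % 29 % 2)


def roman_to_decimal(line):
    n = t = 0
    for c in line.encode("ascii"):
        old, n = n, value(c)
        # Add the previous value, or subtract it if the new one is larger.
        # The "1 +" makes equal neighbours (e.g. II) compare as "greater".
        t += old * cmp(1 + old, n)
    return t


print(roman_to_decimal("MCMXCIV\n"))  # 1994
```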

Alas, I couldn't improve this solution further, so switched to Python, hoping to take revenge on the "Python golfing god" there.

Python Baby Steps

As you might expect by now, my first Python attempt was the same ol' same ol':

t=p=0
for c in map(ord,raw_input()):n=10**(84169%c%5)>>-c%29%2;t+=n-2*p*(p<n);p=n
print t
89 strokes! This solution bears a close resemblance to the earlier Ruby ones. Notice that Python, like Perl, but unlike Ruby, does not need to map the trailing newline because the Python raw_input function removes it.

Two further strokes were whittled easily enough with:

t=p=0
for c in raw_input():x=84169%ord(c);n=10**(x%5)>>x/4%2;t+=n-2*p*(p<n);p=n
print t
Notice too that in Python, alone among the four languages, assignment is not an operator. This proved a chronic nuisance in this game because I couldn't see any opportunity to exploit evaluation order to eliminate the "previous value" variable (p in the Python solution above).

Another generally applicable golfing tip is to study every single built-in function the language has to offer, especially the short ones. When I did that, the Python hash function caught my eye. I wonder if it could be used in a magic formula? Well, it seems to have better properties for this purpose than ord and is only one stroke longer. Definitely worth a try. It did indeed improve things:

t=p=0
for c in map(hash,raw_input()):n=10**(c/619%4)>>c%8/5;t+=n-2*p*(p<n);p=n
print t
... but only by one stroke. 86 strokes now, but still a gaping eight strokes behind the Python golfing god.

Going for the Outright Lead

Necessity is the mother of invention

The Python solutions are different to the Ruby and Perl ones in that you have to either map the hash/ord functions, or assign them to a variable, as in x=84169%ord(c), because all the magic formulae seen so far use the character twice. It therefore occurred to me that a magic formula using each character of the input string only once would be a big saving in Python. How to find such a formula? I have no idea, but I played around one afternoon, just trying stuff, and stumbled on a gem:

t=p=0
for r in raw_input():n=10**(205558%ord(r)%7)%9995;t+=n-2*p%n;p=n
print t

By way of explanation, notice that the magic formula 205558%ord(r)%7 maps M -> 3, D -> 6, C -> 2, L -> 5, X -> 1, V -> 4, I -> 0 as shown in the following table:

Roman  m  10**m    10**m%9995
M      3  1000     1000
D      6  1000000  500
C      2  100      100
L      5  100000   50
X      1  10       10
V      4  10000    5
I      0  1        1

Generally, formulae that map M -> 3, C -> 2, X -> 1 and I -> 0 are highly effective because applying "%NNNN", where NNNN > 1000, does not mangle the already matching 10**m. Instead of requiring seven lucky hits, you now need only three (D, L and V).
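
The mapping is easy to check mechanically. The following Python 3 sketch recomputes the exponent and the reduced value for each Roman digit; the function names `exponent` and `value` are illustrative only.

```python
def exponent(r):
    """Exponent m chosen by the magic formula for Roman digit r."""
    return 205558 % ord(r) % 7


def value(r):
    """10**m reduced modulo 9995, which fixes up D, L and V."""
    return 10 ** exponent(r) % 9995


for r in "MDCLXVI":
    print(r, exponent(r), 10 ** exponent(r), value(r))
```

M, C, X and I pass through the final modulo untouched because their powers of ten are below 9995; only D, L and V rely on the reduction.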

Combining this new formula with the same modulo trick I used to move my Perl solution from 62 to 60 strokes reduced my Python solution to 78 strokes and tied for the lead with Mark Byers!
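
Assuming the modulo trick here means replacing 2*p*(p<n) with 2*p%n, as a comparison of the Python solutions suggests, a short Python 3 check shows when the two expressions agree. It is an illustrative sanity check rather than part of any submission; the names `VALUES` and `mismatches` are made up.

```python
VALUES = [1, 5, 10, 50, 100, 500, 1000]

# Compare the explicit test with the modulo form for every
# (previous, current) pair, including the initial previous value of 0.
mismatches = [
    (p, n)
    for p in [0] + VALUES
    for n in VALUES
    if 2 * p * (p < n) != 2 * p % n
]
print(mismatches)  # [(5, 10), (50, 100), (500, 1000)]
```

The only disagreements are V before X, L before C and D before M, none of which occurs in a well-formed Roman numeral, so the shorter form is safe for valid input.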

Code Golf is 10% strategy, 90% tactics

Actually, I've found many different 78 stroke Python solutions, but none shorter. Here are some more variations in the middle line:

n=10**(hash(r+"*N_")%9)%2857;t+=n-2*p%n;p=n
n=10**(hash(r+"@M4")%7)%9995;t+=n-2*p%n;p=n
n=10**(hash(r*37509)%7)%9995;t+=n-2*p%n;p=n
n=10**"IXCMVLD".find(r)%9995;t+=n-2*p%n;p=n
n=10**(494254%ord(r)/9)%4999;t+=n/2-p%n;p=n
The last one is noteworthy in that it uses a different mapping, namely M -> 2000, D -> 1000, C -> 200, L -> 100, X -> 20, V -> 10, I -> 2. Also noteworthy is that, because it divides by two (n/2), it also works with a:
t=p=1
initialization. This observation will allow us later to exploit a Ruby built-in variable ($.), which is initialized to one. Note that this second alternative mapping is available, without penalty, in Ruby and Python, but not Perl and PHP, for various complicated tactical reasons. These are the sort of tactical tricks that are crucial when fighting for the lead in golf.
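
The doubled mapping and the two initializations can be tried out with a small Python 3 sketch. The names `doubled` and `roman_to_decimal` and the `init` parameter are hypothetical, and `//` replaces Python 2's integer `/`.

```python
def doubled(r):
    """Alternative mapping: each Roman digit maps to twice its value."""
    return 10 ** (494254 % ord(r) // 9) % 4999


def roman_to_decimal(s, init=0):
    t = p = init
    for r in s:
        n = doubled(r)
        # n // 2 is the real digit value; p % n is p (twice the previous
        # value) when p < n, and 0 otherwise.
        t += n // 2 - p % n
        p = n
    return t


print([doubled(r) for r in "MDCLXVI"])  # [2000, 1000, 200, 100, 20, 10, 2]
print(roman_to_decimal("MCMXCIV", init=0))  # 1994
print(roman_to_decimal("MCMXCIV", init=1))  # 1994
```

In this sketch, starting from one is harmless because every doubled value is at least two, so the first step subtracts 1 % n = 1 and cancels the extra one in the total.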

Incredibly, applying what I learnt in my Python diversion to Ruby, plus yet another dirty Ruby trick (using the Perl-inspired Ruby built-in variable $. to eliminate the t=0), enabled me to reduce my Ruby solution from 60 strokes all the way down to 53 and so steal the outright lead from "primo":

n=1;$.+=n/2-n%n=10**(494254%C/9)%4999while C=getc;p$.
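
Unrolled into Python 3, the 53 stroke solution might look like the sketch below. It rests on two assumptions: that `$.` starts at one, as stated above, and that Ruby evaluates `n/2` and the left operand of `%` with the old `n` before the assignment takes effect. The variable `t` stands in for `$.`.

```python
def roman_to_decimal(line):
    t = 1  # stands in for Ruby's $., assumed to start at one
    n = 1
    for c in line.encode("ascii"):
        new = 10 ** (494254 % c // 9) % 4999  # doubled value; newline maps to 1
        # Both n // 2 and n % new use the previous n: add the previous
        # digit, or subtract it when the new digit is larger.
        t += n // 2 - n % new
        n = new
    return t


print(roman_to_decimal("MCMXCIV\n"))  # 1994
```

Under these assumptions the first step contributes -1, which cancels the initial one, and the trailing newline flushes the final digit.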

Success is never final -- Winston Churchill

Of course, I can't prove that I've found the optimal magic formula. It's also likely that further language or algorithmic golfing improvements will be found, especially given my relative inexperience in Ruby and Python.

In the next installment of this series, I'll show off my PHP solutions.

Leaderboards, end of April 2009

All languages (281 entries):

1st 53 eyepopslikeamosquito Ruby
2nd 55 primo Ruby
3rd 56 flagitious Ruby
4th 58 ySas Perl
5th 58 leonid Ruby
6th 59 bearstearns Perl
7th 59 Mark Byers Ruby
8th 59 kounoike Perl
9th 60 robin Perl
10th 61 arpad Perl

Perl (69 entries):

1st 55 eyepopslikeamosquito
2nd 58 ySas
3rd 59 bearstearns
4th 59 kounoike
5th 60 robin
6th 61 arpad
7th 61 shinh
8th 62 0xF
9th 66 ersagun
10th 66 redneval
11th 66 o0lit3
12th 66 Aidy
13th 67 szeryf
14th 70 ott
15th 73 jojo
16th 73 yvl
17th 73 acura
18th 77 Jasper
19th 78 agenticarus
20th 79 tybalt89
21st 82 yojeb
22nd 82 grizzley
23rd 85 twice11
24th 86 yanick-walloper
25th 87 Ciaran
26th 87 olivier
27th 88 justin
28th 91 SubStack
29th 93 wendelscardua
30th 94 Trinary
31st 95 tripa
32nd 95 dropbear
33rd 95 sprimmer
34th 98 yanick
35th 99 chargrill
36th 99 kjan
37th 99 Jocelyn
38th 100 k12u
39th 102 duranain
40th 104 sphx95

Ruby (86 entries):

1st 53 eyepopslikeamosquito
2nd 55 primo
3rd 56 flagitious
4th 58 leonid
5th 59 Mark Byers
6th 71 shinh
7th 71 tryeng
8th 73 yvl
9th 73 bitsweat
10th 76 ozy4dm

Python (87 entries):

1st 78 Mark Byers
2nd 78 eyepopslikeamosquito
3rd 79 tryeng
4th 79 hallvabo
5th 80 tha
6th 80 primo
7th 81 BjarkeEbert
8th 85 hiro.suzuki
9th 91 mick
10th 91 gtalpo

PHP (62 entries):

1st 70 eyepopslikeamosquito
2nd 89 hiro.suzuki
3rd 89 Methedrine
4th 89 W
5th 89 Trinary
6th 90 arpad
7th 90 angpoo
8th 91 rollercoaster375
9th 92 El Hombre Gris
10th 92 morten

Leaderboard Update

One month later, the leaderboard changes are:

hallvabo Python 79 -> 72
hendrik Python 94 -> 90
d3m3vilurr Python 108 -> 101
yvl Perl 73 -> 61
falsetru Python 216 -> 95
Leinad Perl -> 152
eyepopslikeamosquito Python 78 -> 72
eyepopslikeamosquito PHP 70 -> 68
Kalindor PHP -> 93
tomatoring Python -> 295

Update: shortly before the codegolf web site shut down in 2014, final leaderboard improvements were:

eyepopslikeamosquito Perl 55 -> 53 (see [id://853502])
eyepopslikeamosquito Python 72 -> 71 (see [id://1083046])
eyepopslikeamosquito PHP 68 -> 63 (see [id://836741])
