perlmeditation
eyepopslikeamosquito
<P>
<blockquote>
<P>
<I>
"To be competitive at Perl golf, you have to be a Perl expert, with years of language experience" -- Yorkey
</I>
</P>
<P>
<I>
"Rubbish! In TPR 3, Mickut of Finland was on the leaderboard <B>two hours</B> after he learnt Perl! And Perl expert Petdance, despite many years of Perl experience, finished more than 80 strokes behind the leaders in the Fonality golf challenge" -- me
</I>
</P>
<P align="right">
<small>
-- Yorkey and me arguing during a lunchtime walk to Kirribilli
</small>
</P>
</blockquote>
</P>
<P>
Were it not for this chance argument during a lunchtime walk with my workmate
and good friend, Yorkey, I probably would have given up the
[id://759963|Roman to Decimal] golf game after a few weeks.
After all, I was completely stuck at that time with my Perl solution
and had no fresh ideas to try out.
However, Yorkey's stubborn refusal to change his point of view provoked me into
proving my point to him by playing this golf in languages I hardly knew,
namely Ruby, Python and PHP.
</P>
<P>
<blockquote>
<P>
<I>
That's <B>not</B> my earring! ... All right, what's going on?
</I>
</P>
<P align="right">
<small>
-- My (non-hacker) wife after finding [audreyt]'s earring sitting on our bedside table
</small>
</P>
</blockquote>
</P>
<P>
I decided to start with Ruby because I at least had a passing familiarity with that
language after [audreyt] stayed for a few days at my home,
during the heady days of the <a href="https://en.wikipedia.org/wiki/Pugs_(programming)">Pugs project</a>,
while my wife was out of town.
During her stay, we went through the library section of
<a href="http://www.pragprog.com/titles/ruby/programming-ruby">the Pickaxe book</a>
together because Audrey felt it would make a great model for documenting the Perl 6 libraries.
Unfortunately, the absent-minded Audrey left one of her earrings behind on our bedside
table and I assumed it belonged to my wife, so just left it there. On her return, my wife
saw the earring sitting on her bedside table and freaked. I had previously told my wife
that Perl hacker "autrijus" was coming to stay for a few days while she was away.
Luckily, when I hastily explained that autrijus had become audrey, my wife judged it
unlikely that I would invent such a story and quickly calmed down. :-)
</P>
<P>
After a full month of play, the Perl leader was [robin] on 60 strokes, with Ruby languishing
far behind on 73. So I naturally thought it "impossible" for Ruby to overtake Perl
in this game -- and ludicrous to suggest that I might be able to beat my Perl
solution in Ruby.
My expectations were much lower than that; I simply wanted to be competitive
in Ruby (anywhere in the 70s would be fine) so as to shut Yorkey up.
</P>
<readmore>
<P><B>Taking the Lead in Ruby</B></P>
<P>
Since I'd already worked out some basic [id://600665|magic formulae] by that time,
I naturally started converting these to work with Ruby. Unfortunately,
in addition to mapping M -> 1000, D -> 500, C -> 100, L -> 50, X -> 10, V -> 5, I -> 1,
I needed to further map the trailing newline to zero because I could
find no short way of removing it in Ruby. In Perl, the trailing newline was
easily skipped via <CODE>/./g</CODE>, because <CODE>.</CODE> does not match a newline. This extra newline mapping invalidated
most of the Perl magic formulae I had previously found, so I had to adjust my
magic formula searcher and start searching all over again.
</P>
<P>
<blockquote>
<P>
<I>
This program might contain the fastest known existing implementation of full forward crypt
</I>
</P>
<P align="right">
<small>
-- <a href="http://www.mail-archive.com/golf@perl.org/msg01568.html">Ton Hospel</a> sensationally wins a <B>Perl</B> golf tournament by writing the fastest <B>C</B> program
</small>
</P>
</blockquote>
</P>
<P>
I won't bore you here with the gory details of my magic formula searcher, written
in C, for speed.
To get a feel for what these searching programs look like, take a look at my
simple one, described in [id://600665], or
<a href="http://www.xs4all.nl/~thospel/ASIS/zlet5.tar.gz">Ton's much more complex one</a>.
</P>
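<P>
For readers who want something even smaller to play with, here is a minimal Python 3 sketch of the brute-force idea. It is not the C searcher mentioned above: the function name <CODE>search</CODE>, the search bounds and the formula shape <CODE>k % c % a % b</CODE> are all assumptions chosen for illustration. It looks for constants that map each Roman letter to its power-of-ten exponent, with the trailing newline mapped to exponent 0.
</P>

```python
# Hypothetical sketch of a brute-force magic formula search (not the author's C program).
# TARGET gives, for each input character, the exponent e wanted in 10**e.
# The newline is included because the Ruby solutions must map it too.
TARGET = {"M": 3, "D": 3, "C": 2, "L": 2, "X": 1, "V": 1, "I": 0, "\n": 0}

def search(max_k=255, max_a=99, max_b=9):
    """Return every (k, a, b) for which k % ord(c) % a % b matches TARGET."""
    hits = []
    for k in range(1, max_k + 1):
        for a in range(2, max_a + 1):
            for b in range(2, max_b + 1):
                if all(k % ord(c) % a % b == e for c, e in TARGET.items()):
                    hits.append((k, a, b))
    return hits

hits = search()
# The constants used in the first Ruby solution should be among the hits.
print((238, 19, 4) in hits)
```

<P>
A real searcher would also score each hit by the length of the resulting source code, which this sketch does not attempt.
</P>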
<P>
My new and improved search program found a Ruby-friendly magic formula easily enough,
and I was flabbergasted when my first Ruby approach, despite using the <I>nine</I> character
<CODE>each_byte</CODE> method, was equal leader on 73 strokes!
<CODE>
p=t=0
$<.each_byte{|c|n=10**(238%c%19%4)>>-c%29%2;p<n&&t-=2*p;t+=p=n}
p t
</CODE>
As you can see, this is just a straightforward translation of the original
algorithm I was using in Perl, albeit with a magic formula replacing the
Perl regex-based lookup table. As I was quick to point out to Yorkey, I didn't
need to be a Ruby expert to do this, just needed to know the core parts
of the language and, more importantly, <I>find a good algorithm</I>.
</P>
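<P>
To make the algorithm easier to follow, here is an ungolfed Python 3 rendering of the Ruby solution above. The function names <CODE>value</CODE> and <CODE>roman_to_decimal</CODE> are made up for this sketch; the formula itself is copied unchanged, which works because Python and Ruby agree on the sign of <CODE>%</CODE> for negative operands.
</P>

```python
def value(c):
    """Magic formula from the Ruby solution; c is a byte value (an int)."""
    return 10 ** (238 % c % 19 % 4) >> -c % 29 % 2

def roman_to_decimal(line):
    """Add each letter's value, first undoing twice the previous one if it was smaller."""
    p = t = 0
    for c in line.encode("ascii"):
        n = value(c)
        if p < n:          # subtractive pair such as IV or CM
            t -= 2 * p
        t += n
        p = n
    return t

for ch in "MDCLXVI\n":
    print(repr(ch), value(ord(ch)))
print(roman_to_decimal("MCMXCIV\n"))
```

<P>
Note that the newline maps to zero, so it adds nothing to the total.
</P>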
<P>
I started with <CODE>each_byte</CODE> only because I couldn't get the shorter
<CODE>getc</CODE> function to work. For example, this attempt:
<CODE>
n=t=0
t+=n-n*(n<(n=10**(238%c%19%4)>>-c%29%2)?2:0)while c=getc
p t
</CODE>
failed with "undefined local variable or method `c'".
Following Eugene's advice of "Can't possibly work, try it anyway", I
changed <CODE>c</CODE> to <CODE>C</CODE> (uppercase variables are
constants in Ruby):
<CODE>
n=t=0
t+=n-n*(n<(n=10**(238%C%19%4)>>-C%29%2)?2:0)while C=getc
p t
</CODE>
and it worked! The screen was littered with "warning: already initialized constant C"
messages (written to stderr), but these don't matter to codegolf, which only cares
about what is written to stdout. Combining this with the well-known Ruby golfing
trick of replacing the three-char <CODE>238</CODE> with the two-char
<CODE>?ascii-char-with-ord-of-238</CODE> shortened my solution to 65.
As you might expect, I felt elated at leading the Ruby experts by eight strokes!
And, more importantly, forcing Yorkey to eat his words.
</P>
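<P>
The <CODE>getc</CODE> version leans on evaluation order: as far as I can tell, the <CODE>n</CODE> values to the left of the inner assignment still hold the <I>previous</I> letter's value, so each step adds the previous letter rather than the current one. The Python 3 sketch below spells that reading out with an explicit <CODE>old</CODE> variable (a name invented here); treat it as an interpretation, not a statement of how Ruby is specified.
</P>

```python
def value(c):
    """Magic formula from the article; c is a byte value, newline maps to 0."""
    return 10 ** (238 % c % 19 % 4) >> -c % 29 % 2

def roman_lagged(line):
    """Assumed reading of t+=n-n*(n<(n=...)?2:0): every step adds the previous value."""
    n = t = 0
    for c in line.encode("ascii"):
        old = n              # value of n before the inner assignment
        n = value(c)         # the inner n=... assignment
        t += old - old * (2 if old < n else 0)
    return t

print(roman_lagged("MCMXCIV\n"))  # trailing newline flushes the final V
print(roman_lagged("MCMXCIV"))    # without it the last letter is never added
```

<P>
Under this reading the newline-to-zero mapping does double duty: it is harmless as a value and it pushes the last letter into the total.
</P>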
<P><B>Choking on my breakfast cereal</B></P>
<P>
Complacency is dangerous in golf and I had become complacent.
If anyone had told me at this time that I could reduce my Ruby solution
from 65 strokes all the way down to 53, I would have declared them insane.
</P>
<P>
After basking for months in my newly acquired Ruby fame, I almost choked
on my breakfast cereal when I checked the codegolf leaderboard one morning and noticed that
<a href="http://codegolf.com/boards/conversation/view/201">Python golfing god, Mark Byers</a>,
had posted a 59 stroke Ruby solution.
This was intolerable! Back to work.
</P>
<P>
After experimenting some more with Ruby's evaluation order, I came up with
a weird spaceship operator 60 stroke solution:
<CODE>
n=t=0;t+=n.*1+n<=>n=10**(238%C%19%4)>>-C%29%2while C=getc;p t
</CODE>
I've left the <CODE>238</CODE> above for readability, but my submitted solution
naturally used the <CODE>?ascii-char-with-ord-of-238</CODE> dirty trick mentioned earlier.
This solution introduces another dirty Ruby golfing trick, namely using
a <CODE>.*</CODE> "method call" for "multiplies" rather than <CODE>*(...)</CODE>, thus
saving a stroke by eliminating the parens. You can try this trick routinely
when golfing in Ruby whenever you need to change operator precedence -- though
it doesn't always work, Ruby's parsing being pretty quirky, in my experience.
By the way, it was this Ruby solution that inspired my weird 62 stroke Perl spaceship operator
solution, mentioned in the previous article, an example of transferring ideas from
one language to another. Often the hard part in golf is generating new ideas to try,
and using multiple languages is a fertile source of fresh ideas.
</P>
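<P>
My best guess at how the spaceship solution parses is <CODE>n.*((1+n)<=>(n=...))</CODE>, with both leading <CODE>n</CODE>s holding the previous value. The Python 3 sketch below follows that assumed parse; <CODE>cmp</CODE> and <CODE>roman_spaceship</CODE> are helper names invented for the illustration, since Python 3 has no spaceship operator.
</P>

```python
def value(c):
    """Magic formula from the article; c is a byte value, newline maps to 0."""
    return 10 ** (238 % c % 19 % 4) >> -c % 29 % 2

def cmp(a, b):
    """Stand-in for Ruby's <=>: -1, 0 or 1."""
    return (a > b) - (a < b)

def roman_spaceship(line):
    """Assumed parse of t+=n.*1+n<=>n=...: add or subtract the previous value."""
    n = t = 0
    for c in line.encode("ascii"):
        old = n
        n = value(c)
        # 1+old avoids a zero result when a letter repeats (II, XX, ...).
        t += old * cmp(1 + old, n)
    return t

print(roman_spaceship("MCMXCIV\n"))
```

<P>
The <CODE>1+</CODE> matters: comparing <CODE>old</CODE> with <CODE>n</CODE> directly would give zero for repeated letters and silently drop them.
</P>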
<P>
Alas, I couldn't improve this solution further, so switched to Python, hoping to
take revenge on the "Python golfing god" there.
</P>
<P><B>Python Baby Steps</B></P>
<P>
As you might expect by now, my first Python attempt was the same ol' same ol':
<CODE>
t=p=0
for c in map(ord,raw_input()):n=10**(84169%c%5)>>-c%29%2;t+=n-2*p*(p<n);p=n
print t
</CODE>
89 strokes! This solution bears a close resemblance to the earlier Ruby ones.
Notice that Python, like Perl, but unlike Ruby, does not need to map the
trailing newline because the Python <CODE>raw_input</CODE> function removes it.
</P>
<P>
Two further strokes were whittled easily enough with:
<CODE>
t=p=0
for c in raw_input():x=84169%ord(c);n=10**(x%5)>>x/4%2;t+=n-2*p*(p<n);p=n
print t
</CODE>
Notice too that in Python, alone among the four languages, assignment is not an operator. This proved a chronic nuisance in this game because I couldn't see any opportunity
to exploit evaluation order to eliminate the "previous value" variable (<CODE>p</CODE>
in the Python solution above).
</P>
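<P>
The two Python versions above differ only in how they halve D, L and V. A quick Python 3 check, with <CODE>decode</CODE> as a made-up helper name and <CODE>//</CODE> standing in for Python 2's integer <CODE>/</CODE>, suggests the two shift expressions agree on all seven letters:
</P>

```python
ROMAN = {"M": 1000, "D": 500, "C": 100, "L": 50, "X": 10, "V": 5, "I": 1}

def decode(ch):
    """Return the letter's value computed by both shift expressions."""
    c = ord(ch)
    x = 84169 % c
    by_ord = 10 ** (x % 5) >> -c % 29 % 2   # form used in the 89-stroke version
    by_x = 10 ** (x % 5) >> x // 4 % 2      # form used in the 87-stroke version
    return by_ord, by_x

for ch, want in ROMAN.items():
    assert decode(ch) == (want, want), ch
print("both forms agree for", "".join(ROMAN))
```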
<P>
Another generally applicable golfing tip is to study every single built-in function the
language has to offer, especially the short ones. When I did that, the Python
<CODE>hash</CODE> function caught my eye. I wonder if it could be used in
a magic formula? Well, it seems to have better properties for this
purpose than <CODE>ord</CODE> and is only one stroke longer.
Definitely worth a try. It did indeed improve things:
<CODE>
t=p=0
for c in map(hash,raw_input()):n=10**(c/619%4)>>c%8/5;t+=n-2*p*(p<n);p=n
print t
</CODE>
... but only by one stroke.
86 strokes now, but still a gaping eight strokes behind the Python golfing god.
</P>
<P><B>Going for the Outright Lead</B></P>
<P>
<blockquote>
<I>
Necessity is the mother of invention
</I>
</blockquote>
</P>
<P>
The Python solutions are different to the Ruby and Perl ones in that
you have to either <CODE>map</CODE> the <CODE>hash/ord</CODE> functions,
or assign them to a variable, as in <CODE>x=84169%ord(c)</CODE>, because
all the magic formulae seen so far use the character <I>twice</I>.
It occurred to me therefore, that if I could find a magic formula that
used each character in the input string <I>once only</I> that would be a big
saving in Python. How to find such a formula? I have no idea, but I
played around one afternoon, just trying stuff, and stumbled on a gem:
<CODE>
t=p=0
for r in raw_input():n=10**(205558%ord(r)%7)%9995;t+=n-2*p%n;p=n
print t
</CODE>
</P>
<P>
By way of explanation, notice that the magic
formula <CODE>205558%ord(r)%7</CODE> maps
M -> 3, D -> 6, C -> 2, L -> 5, X -> 1, V -> 4, I -> 0
as shown in the following table:
</P>
<P>
<table border="1">
<tr><th>Roman</th><th>m</th><th>10**m</th><th>10**m%9995</th></tr>
<tr><td>M</td><td>3</td><td>1000</td><td>1000</td></tr>
<tr><td>D</td><td>6</td><td>1000000</td><td>500</td></tr>
<tr><td>C</td><td>2</td><td>100</td><td>100</td></tr>
<tr><td>L</td><td>5</td><td>100000</td><td>50</td></tr>
<tr><td>X</td><td>1</td><td>10</td><td>10</td></tr>
<tr><td>V</td><td>4</td><td>10000</td><td>5</td></tr>
<tr><td>I</td><td>0</td><td>1</td><td>1</td></tr>
</table>
</P>
<P>
Generally, formulae that map M -> 3, C -> 2, X -> 1 and I -> 0 are
highly effective because applying "%NNNN", where NNNN > 1000, does
not mangle the already matching 10**m, so instead of requiring seven
lucky hits, you now need only three (D, L and V).
</P>
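<P>
The table can be reproduced mechanically. The Python 3 sketch below rebuilds its rows and then, as a small extra experiment, scans four-digit moduli for ones that turn 10**4, 10**5 and 10**6 into 5, 50 and 500. The variable names and the scan range are assumptions made for this illustration.
</P>

```python
ROMAN = {"M": 1000, "D": 500, "C": 100, "L": 50, "X": 10, "V": 5, "I": 1}

# Rebuild the table: letter, exponent m, 10**m, 10**m % 9995.
rows = []
for ch in "MDCLXVI":
    m = 205558 % ord(ch) % 7
    rows.append((ch, m, 10 ** m, 10 ** m % 9995))

for row in rows:
    print(*row)

# Moduli above 1000 leave 1, 10, 100 and 1000 alone, so only the
# three "lucky hits" for V, L and D need checking.
good_moduli = [
    N for N in range(1001, 10000)
    if (10 ** 4 % N, 10 ** 5 % N, 10 ** 6 % N) == (5, 50, 500)
]
print(9995 in good_moduli)
```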
<P>
Combining this new formula with the same modulo trick I used to move my Perl
solution from 62 to 60 strokes reduced my Python
solution to 78 strokes and tied for the lead with Mark Byers!
</P>
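<P>
The modulo trick replaces <CODE>2*p*(p<n)</CODE> with <CODE>2*p%n</CODE>. One way to gain confidence in it is to test every well-formed numeral. The Python 3 sketch below does that for 1 to 3999; <CODE>to_roman</CODE> is an ordinary textbook converter written for this check and <CODE>golf</CODE> is just the 78-stroke loop wrapped in a function, so both names are mine rather than part of the golf entry.
</P>

```python
PAIRS = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
         (90, "XC"), (50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
         (5, "V"), (4, "IV"), (1, "I")]

def to_roman(num):
    """Standard greedy decimal-to-Roman conversion, used only as a test oracle."""
    out = []
    for v, s in PAIRS:
        while num >= v:
            out.append(s)
            num -= v
    return "".join(out)

def golf(s):
    """The 78-stroke Python loop, unchanged apart from being a function."""
    t = p = 0
    for r in s:
        n = 10 ** (205558 % ord(r) % 7) % 9995
        t += n - 2 * p % n
        p = n
    return t

mismatches = [i for i in range(1, 4000) if golf(to_roman(i)) != i]
print(len(mismatches))
```

<P>
The check only covers well-formed numerals. A malformed pair such as VX would behave differently under <CODE>2*p%n</CODE> than under <CODE>2*p*(p<n)</CODE>, which does not matter if the game only feeds valid input.
</P>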
<P>
<blockquote>
<I>
Code Golf is 10% strategy, 90% tactics
</I>
</blockquote>
</P>
<P>
Actually, I've found many different 78 stroke Python solutions, but none shorter.
Here are some more variations in the middle line:
<CODE>
n=10**(hash(r+"*N_")%9)%2857;t+=n-2*p%n;p=n
n=10**(hash(r+"@M4")%7)%9995;t+=n-2*p%n;p=n
n=10**(hash(r*37509)%7)%9995;t+=n-2*p%n;p=n
n=10**"IXCMVLD".find(r)%9995;t+=n-2*p%n;p=n
n=10**(494254%ord(r)/9)%4999;t+=n/2-p%n;p=n
</CODE>
The last one is noteworthy in that it uses a different mapping, namely
M -> 2000, D -> 1000, C -> 200, L -> 100, X -> 20, V -> 10, I -> 2.
Also noteworthy is that, because the doubled values make every <CODE>n</CODE> at least 2,
the first <CODE>p%n</CODE> is 1 and cancels a starting total of 1, so it also works with a:
<CODE>
t=p=1
</CODE>
initialization. This observation will allow us later to exploit a Ruby built-in variable (<CODE>$.</CODE>),
which is initialized to one.
Note that this second alternative mapping is available, without penalty, in Ruby and Python,
but not Perl and PHP, for various complicated tactical reasons.
These are the sort of tactical tricks that are crucial when fighting for the lead in golf.
</P>
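<P>
The doubled mapping and its tolerance for either starting value are easy to check. This Python 3 sketch uses invented names (<CODE>doubled</CODE>, <CODE>roman</CODE>, <CODE>start</CODE>) and writes <CODE>//</CODE> where the Python 2 original has <CODE>/</CODE>:
</P>

```python
def doubled(ch):
    """Last variation above: maps each letter to twice its Roman value."""
    return 10 ** (494254 % ord(ch) // 9) % 4999

def roman(s, start):
    """Run the n/2 variation with t and p both initialized to `start`."""
    t = p = start
    for r in s:
        n = doubled(r)
        t += n // 2 - p % n
        p = n
    return t

print([doubled(ch) for ch in "MDCLXVI"])
print(roman("MCMXCIV", 0), roman("MCMXCIV", 1))
```

<P>
With <CODE>start=1</CODE> the first step computes <CODE>1 + n//2 - 1%n</CODE>, and since every <CODE>n</CODE> is at least 2 the two ones cancel.
</P>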
<P>
Incredibly, applying what I learnt in my Python diversion to Ruby, plus yet another
dirty Ruby trick (using the Perl-inspired Ruby built-in variable <CODE>$.</CODE> to
eliminate the <CODE>t=0</CODE>), enabled me to reduce my Ruby solution from 60 strokes
all the way down to 53 and so steal the outright lead from "primo":
<CODE>
n=1;$.+=n/2-n%n=10**(494254%C/9)%4999while C=getc;p$.
</CODE>
</P>
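<P>
For anyone puzzling over the 53-stroke line, here is how I would emulate it in Python 3. It assumes that <CODE>$.</CODE> holds 1 when the loop starts, as stated above, and that both <CODE>n</CODE>s to the left of the inner assignment see the previous value. The names <CODE>roman_ruby53</CODE> and <CODE>old</CODE> are invented for the sketch.
</P>

```python
def roman_ruby53(line):
    """Emulate n=1;$.+=n/2-n%n=10**(494254%C/9)%4999while C=getc;p$. (assumed semantics)."""
    total = 1                # stands in for Ruby's $. (assumed to start at 1)
    n = 1
    for c in line.encode("ascii"):
        old = n
        n = 10 ** (494254 % c // 9) % 4999   # doubled value; newline gives 1
        total += old // 2 - old % n          # adds the previous letter, lagging by one
    return total

print(roman_ruby53("MCMXCIV\n"))
```

<P>
On this reading the first step contributes <CODE>0 - 1</CODE>, which wipes out the initial 1, and the newline's value of 1 makes the final <CODE>old % n</CODE> zero so the last letter is added in full.
</P>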
<P>
<blockquote>
<I>
Success is never final -- Winston Churchill
</I>
</blockquote>
</P>
<P>
Of course, I can't prove that I've found the optimal magic formula.
It's also likely that further language or algorithmic golfing improvements will be found,
especially given my relative inexperience in Ruby and Python.
</P>
<P>
In the next installment of this series, I'll show off my PHP solutions.
</P>
<P><B>Leaderboards, end of April 2009</B></P>
<P>
<B>All languages</B> (281 entries):
<CODE>
1st 53 eyepopslikeamosquito Ruby
2nd 55 primo Ruby
3rd 56 flagitious Ruby
4th 58 ySas Perl
5th 58 leonid Ruby
6th 59 bearstearns Perl
7th 59 Mark Byers Ruby
8th 59 kounoike Perl
9th 60 robin Perl
10th 61 arpad Perl
</CODE>
</P>
<P>
<B>Perl</B> (69 entries):
<CODE>
1st 55 eyepopslikeamosquito
2nd 58 ySas
3rd 59 bearstearns
4th 59 kounoike
5th 60 robin
6th 61 arpad
7th 61 shinh
8th 62 0xF
9th 66 ersagun
10th 66 redneval
11th 66 o0lit3
12th 66 Aidy
13th 67 szeryf
14th 70 ott
15th 73 jojo
16th 73 yvl
17th 73 acura
18th 77 Jasper
19th 78 agenticarus
20th 79 tybalt89
21st 82 yojeb
22nd 82 grizzley
23rd 85 twice11
24th 86 yanick-walloper
25th 87 Ciaran
26th 87 olivier
27th 88 justin
28th 91 SubStack
29th 93 wendelscardua
30th 94 Trinary
31st 95 tripa
32nd 95 dropbear
33rd 95 sprimmer
34th 98 yanick
35th 99 chargrill
36th 99 kjan
37th 99 Jocelyn
38th 100 k12u
39th 102 duranain
40th 104 sphx95
</CODE>
</P>
<P>
<B>Ruby</B> (86 entries):
<CODE>
1st 53 eyepopslikeamosquito
2nd 55 primo
3rd 56 flagitious
4th 58 leonid
5th 59 Mark Byers
6th 71 shinh
7th 71 tryeng
8th 73 yvl
9th 73 bitsweat
10th 76 ozy4dm
</CODE>
</P>
<P>
<B>Python</B> (87 entries):
<CODE>
1st 78 Mark Byers
2nd 78 eyepopslikeamosquito
3rd 79 tryeng
4th 79 hallvabo
5th 80 tha
6th 80 primo
7th 81 BjarkeEbert
8th 85 hiro.suzuki
9th 91 mick
10th 91 gtalpo
</CODE>
</P>
<P>
<B>PHP</B> (62 entries):
<CODE>
1st 70 eyepopslikeamosquito
2nd 89 hiro.suzuki
3rd 89 Methedrine
4th 89 W
5th 89 Trinary
6th 90 arpad
7th 90 angpoo
8th 91 rollercoaster375
9th 92 El Hombre Gris
10th 92 morten
</CODE>
</P>
<P><B>Leaderboard Update</B></P>
<P>
One month later, the leaderboard changes are:
<CODE>
hallvabo Python 79 -> 72
hendrik Python 94 -> 90
d3m3vilurr Python 108 -> 101
yvl Perl 73 -> 61
falsetru Python 216 -> 95
Leinad Perl -> 152
eyepopslikeamosquito Python 78 -> 72
eyepopslikeamosquito PHP 70 -> 68
Kalindor PHP -> 93
tomatoring Python -> 295
</CODE>
</P>
<P>
Update: shortly before the codegolf web site shut down in 2014, final leaderboard improvements were:
<CODE>
eyepopslikeamosquito Perl 55 -> 53 (see [id://853502])
eyepopslikeamosquito Python 72 -> 71 (see [id://1083046])
eyepopslikeamosquito PHP 68 -> 63 (see [id://836741])
</CODE>
</P>
<P><B>References</B></P>
<P>
<ul>
<li> [id://759963]
<li> [id://762180]
<li> [id://763105]
<li> [id://811919]
<li> [id://814900]
<li> <a href="http://codegolf.com/">Golf competitions in Perl, Ruby, Python or PHP</a>
<li> [id://600665]
<li> <a href="http://www.therubygame.com/challenges/5/submissions?order=shortest">therubygame: Roman numerals. What are they good IV?</a>
<li> <a href="https://gist.github.com/1712932">therubygame deconstruct by czetter</a>
<li> [id://1083046]
</ul>
</P>
</readmore>