After my recent "discovery" of the PGA-TRAM algorithm, I couldn't resist coding it up in a variety of languages. I'm doing this as a rosetta code node because I find it fun; hopefully, it'll provide some interesting insights into a variety of programming languages. Please note that this meditation is not about golf, but about implementing an arbitrary, specific algorithm in the most "natural" way in a variety of languages.
Here's the spec:
Write a function, roman_to_dec, to convert a "modern" Roman Numeral to its decimal equivalent. The function takes a single string argument and returns an integer. The string argument may be assumed to always contain a valid Roman Numeral in the range 1-3999. Error handling is not required. For example, roman_to_dec("XLII") should return 42.
Perl
Let's get started with a Perl version:
use strict; use warnings; use List::Util qw(reduce); { my %rtoa = ( M=>1000, D=>500, C=>100, L=>50, X=>10, V=>5, I=>1 ); sub roman_to_dec { reduce { $a+$b-$a%$b*2 } map { $rtoa{$_} } split//, uc(shift) } } my @testdata = ( "XLII", "LXIX", "mi" ); for my $r (@testdata) { print "$r: ", roman_to_dec($r), "\n"; }
Running this program produces:
XLII: 42 LXIX: 69 mi: 1001
To me, this is the most natural way to express the PGA-TRAM algorithm in Perl. It uses a variety of common Perl idioms:
- A hash (rtoa) for the lookup table.
- Lexical scope to "data hide" the rtoa hash (update: Persistent Private Variables using state are a better way to achieve this for Perl 5.10+, as demonstrated in the replies from moritz and BrowserUk).
- A Perlish functional "pipeline": uc -> split -> map -> reduce.
For those folks unfamiliar or uncomfortable with reduce, note that it can be eliminated like this:
Though I don't mind that at all, I do prefer the reduce version.sub roman_to_dec { my $t = 0; for my $n ( map { $rtoa{$_} } split//, uc(shift) ) { $t += $n-$t%$n*2; } return $t; }
Python
As you might expect, my "most natural" Python 2 solution looks very similar to the Perl one above:
That this solution is essentially identical to the Perl one shows how similar these two languages are. Indeed, I'm fond of claiming that Perl, Python and Ruby are "essentially equivalent". :) Since Python lacks Perl's simple lexical scoping mechanism, I chose to "data hide" the rtoa hash by making it a default function argument. Note that Python default function arguments are initialized once only. Though I could have employed the Python map function (with a lambda), I chose instead to use a lazily evaluated Python generator list comprehension, (__rtoa[c] for c in r.upper()), because that seemed more "naturally Pythonic" to me.def roman_to_dec(r, __rtoa = dict(M=1000, D=500, C=100, L=50, X=10, V= +5, I=1) ): return reduce( lambda t,n: t+n-t%n*2, (__rtoa[c] for c in r.upper()) + ) testdata = [ "XLII", "LXIX", "mi" ] for r in testdata: print r, ":", roman_to_dec(r)
Update: a Python 3 version (just had to import reduce and use parens with print):
from functools import reduce def roman_to_dec(r, __rtoa = dict(M=1000, D=500, C=100, L=50, X=10, V= +5, I=1) ): return reduce( lambda t,n: t+n-t%n*2, (__rtoa[c] for c in r.upper()) + ) testdata = [ "XLII", "LXIX", "mi" ] for r in testdata: print( r, ":", roman_to_dec(r) )
Plus some much later (2020) discussion of Perl/Python solutions: Re^5: A short whishlist of Perl5 improvements leaping to Perl7
Haskell
Here's my Haskell solution, using GHC:
Instead of using a rtoa hash, natural in Perl and Python, a rtoa function using simple pattern matching seemed the most Haskelly way of expressing this algorithm.{-# OPTIONS_GHC -fglasgow-exts -Wall #-} import Data.Char (toUpper) import Data.List (concat, intersperse) rtoa :: Char -> Int rtoa 'M' = 1000 rtoa 'D' = 500 rtoa 'C' = 100 rtoa 'L' = 50 rtoa 'X' = 10 rtoa 'V' = 5 rtoa 'I' = 1 rtoa r = error $ "Invalid rtoa char:" ++ show r urtoa :: Char -> Int urtoa = rtoa . toUpper roman_to_dec :: String -> Int roman_to_dec = foldl1 (\t n -> t+n-t`mod`n*2) . map urtoa myshow :: (String -> Int) -> String -> String myshow fn val = val ++ ": " ++ (show (fn val)) testdata :: [String] testdata = [ "XLII", "LXIX", "mi" ] main :: IO () main = do putStrLn $ (concat $ intersperse "\n" (map (\c -> (myshow roman_to_d +ec c)) testdata))
C++
Finally, my ANSI C++ solution:
The Perl reduce function is called "accumulate" in ANSI C++. Implementing the lookup with the admittedly long-winded romtab[] table felt like natural C++ to me because it seeks high performance. Indeed, if you peek inside <ctype.h> you will likely see toupper() implemented in a similar vein. This makes the rtoa(toupper(c)) above ridiculously fast, costing only two (constant) array lookups, not requiring even a single C function call! Also of note in this solution is that I didn't bother with map, instead simply applying the ultra fast rtoa(toupper(c)) combo twice in the accumulator function.#include <cctype> #include <iostream> #include <string> #include <vector> #include <numeric> using namespace std; const int romtab[] = { 0,0,0,0,0,0, 0, 0, 0,0, // 00-09 0,0,0,0,0,0, 0, 0, 0,0, // 10-19 0,0,0,0,0,0, 0, 0, 0,0, // 20-29 0,0,0,0,0,0, 0, 0, 0,0, // 30-39 0,0,0,0,0,0, 0, 0, 0,0, // 40-49 0,0,0,0,0,0, 0, 0, 0,0, // 50-59 0,0,0,0,0,0, 0, 100,500,0, // 60-69 (C:67,D:68) 0,0,0,1,0,0,50,1000, 0,0, // 70-79 (I:73,L:76,M:77) 0,0,0,0,0,0, 5, 0, 10,0 // 80-89 (V:86,X:88) }; inline int rtoa(int c) { return romtab[c]; } inline int accfn(int t, char c) { return t+rtoa(toupper(c))-t%rtoa(toupper(c))*2; } int roman_to_dec(const string& s) { return accumulate(s.begin(), s.end(), 0, accfn); } int main(int argc, char* argv[]) { vector<string> testdata; testdata.push_back("XLII"); testdata.push_back("LXIX"); testdata.push_back("mi"); for (vector<string>::const_iterator iter = testdata.begin(); iter != testdata.end(); ++iter) { cout << *iter << ": " << roman_to_dec(*iter) << '\n'; } return 0; }
Update: As indicated below, we can eliminate the call toupper(), while eliminating any memory faults on invalid input, by adjusting romtab as shown below:
// Note: there are less than 256 initializers in this table; // the uninitialized ones are guaranteed by ANSI C to be // initialized to zero. const int romtab[256] = { 0,0,0,0,0,0, 0, 0, 0, 0, // 00- 09 0,0,0,0,0,0, 0, 0, 0, 0, // 10- 19 0,0,0,0,0,0, 0, 0, 0, 0, // 20- 29 0,0,0,0,0,0, 0, 0, 0, 0, // 30- 39 0,0,0,0,0,0, 0, 0, 0, 0, // 40- 49 0,0,0,0,0,0, 0, 0, 0, 0, // 50- 59 0,0,0,0,0,0, 0, 100, 500, 0, // 60- 69 0,0,0,1,0,0, 50,1000, 0, 0, // 70- 79 0,0,0,0,0,0, 5, 0, 10, 0, // 80- 89 0,0,0,0,0,0, 0, 0, 0, 100, // 90- 99 500,0,0,0,0,1, 0, 0, 50,1000, // 100-109 0,0,0,0,0,0, 0, 0, 5, 0, // 110-119 10,0,0,0,0,0, 0, 0, 0, 0 // 120-129 }; // Return the arabic number for a roman letter c. // Return zero if the roman letter c is invalid. inline int urtoa(int c) { return romtab[c]; } inline int accfn(int t, char c) { return t+urtoa(c)-t%urtoa(c)*2; }
I'd love to see some sample implementations in other languages, especially Perl 6. So please feel free to respond away. :)
Other Rosetta Code Nodes
- Five Ways to Reverse a String of Words (C#, Perl 5, Perl 6, Ruby, Haskell)
- Yet Another Rosetta Code Problem (Perl, Ruby, Python, Haskell, ...)
See also