Every now and then, someone asked about Perl and math. Sometimes some Perl
users asked how they could use Perl to do more math stuff, or folks from other
fields asked how Perl might complement their math work. Here we'll represent
a few handy references, comparisons and illustrations on a few math modules and
software useful to Perl. It serves more as a brief overview of sample of
math modules and software, nothing comprehensive or exhaustive. One glaring
omission is bioperl and bioinformatics, which is too diverse and too huge a
field to be considered as just "math."
Normally, if you search CPAN for Perl math
modules. You probably should look under Math::* or Statistics::*, along with AI::*,
Algorithm::*, Crypt::*, Date::*,
Graph::*, GraphViz::*, Inline::*, etc. The GNU
Project is also a good place to look for external software useable by Perl.
Math Modules & Software
Here is a sample of modules and software useful to do math with Perl.
Module/Software |
Type |
Availability |
Remarks |
Inline::C |
General Programming. Gateway to other C stuff |
Any OS |
Probably not your first choice to do math in Perl |
Inline::Octave & Octave |
Matrix algebra and numerical analysis |
Inline::Octave not avail. from ActiveState. Octave avail. for any OS |
Open source version of Matlab. Fast. Simple syntax.
Octave can be used interactively or as a scripting language. Other commercial
counterparts: Gauss, APL, etc. |
Math::Cephes |
General engineering math library |
Any OS |
Eigensystem only for real symmetric matrices |
Math::LP |
Linear programming |
Not avail. for ActiveState. DOS version LP Solve avail. (somewhere) |
Free. Not sure if LP Solve itself is still actively
maintained. Commercial counterparts: CPLEX, Lindo, Minos, AMPL, etc. |
Math::Pari |
Number theory |
Any OS |
Good for cryptographic analysis |
Math::MatrixReal |
Matrix algebra (for real values only) |
Any OS |
Pure Perl implementation. Eigensystem only for real symmetric matrices |
PDL |
Perl Data Language |
2.4.0 latest. 2.3.1 binary for Windows |
Gives Perl more math-oriented syntax |
R |
Statistical software |
Any OS |
Used interactively or as a scripting language. Very versatile for
plotting (you can see sample plots in this report). Latest version 1.7.1. Open source version of Splus. Bi-directional interface for
*nix & COM Server interface for Windows (for R v.1.7.0) avail. from Omegahat.
Other commerical counterparts: SPSS, SAS, etc. |
Syntax Comparison
It may be informative to compare the mathematical syntax between modules and
software. The following compares the syntax to multiple a 2-by-2 matrix with a
vector.
Module/Software |
Code |
Math::Cephes |
use Math::Cephes::Matrix qw(mat);
$a = mat [[1, 2], [3, 4]];
$b = [1, 1];
print "@{$a->mul($b)}";
# print "3 7"
|
Math::RealMatrix |
use Math::MatrixReal;
$a = Math::MatrixReal->new_from_rows( [[1, 2], [3, 4]] );
$b = Math::MatrixReal->new_from_cols( [ [1, 1] ] );
print $a*$b;
# print "[ 3.000000000000E+000 ]
# [ 7.000000000000E+000 ]"
|
PDL |
use PDL;
$a = pdl [[1, 2],[3, 4]];
$b = pdl [[1], [1]];
print $a x $b;
# print "[
# [3]
# [7]
# ]"
|
Octave |
>> a = [1, 2; 3, 4];
>> b = [1; 1];
>> a*b
ans =
3
7
|
R |
> a = matrix(c(1,2,3,4), ncol=2, byrow=T)
> b = matrix(c(1,1), ncol=1)
> a%*%b
[,1]
[1,] 3
[2,] 7
|
On the math alone, the syntax of all the modules and software above seems
fairly clean. Octave and R have much easier to read overall language syntax
compared to Perl, however.
A distinguished feature of many math-oriented languages are their "vectorized
operation" and "subscripting" capability. Take extracting a sub
matrix as an example:
Module/Software |
Code |
Math::Cephes |
use Math::Cephes::Matrix qw(mat);
$mat = mat [[1, 2, 3], [4, 5, 6], [7, 8, 9]];
$mat = $mat->coef;
for my $i (1..2) { # 0 first index
print "@{$mat->[$i]}[1..2]\n";
}
# print "5 6
# 8 9"
|
Math::RealMatrix |
use Math::MatrixReal;
$mat = Math::MatrixReal->new_from_rows(
[[1, 2, 3], [4, 5, 6], [7, 8, 9]] );
for my $i (2..3) { # 1 first index
for my $j (2..3) {
print $mat->element($i, $j), " ";
}
print "\n";
}
# print "5 6
# 8 9"
|
PDL |
use PDL;
$mat = pdl [[1, 2, 3], [4, 5, 6], [7, 8, 9]];
print $mat->slice("1:2,1:2"); # 0 first index
# print "[
# [5 6]
# [8 9]
# ]"
|
Octave |
mat = [1,2,3; 4,5,6; 7,8,9];
mat(2:3, 2:3)
ans =
5 6
8 9
|
R |
> mat = matrix(1:9, ncol=3, byrow=T)
> mat[2:3, 2:3]
[,1] [,2]
[1,] 5 6
[2,] 8 9
|
Vectorized operation is a very convenient feature. Consider the following R's code:
> vec = 1:10
> vec
[1] 1 2 3 4 5 6 7 8 9 10
> vec %% 2
[1] 1 0 1 0 1 0 1 0 1 0
> vec[vec %% 2 == 1]
[1] 1 3 5 7 9
> vec[vec %% 2 == 1] + 1
[1] 2 4 6 8 10
Since Pari was mentioned and it's somewhat a class of its own, here's an example of Pari in its own context:
#! /usr/local/bin/perl -w
use strict;
# -----------------------------------------------------------
# RSA algorithm -- assymetrical\public-key cryptography
# -----------------------------------------------------------
use Math::Pari qw(gcd PARI) ;
# -----------------------------------------------------------
# m -- message
my $m = 'Perl' ;
print "original: $m\n" ;
my $tmpl = 'C*' ;
my @m = unpack($tmpl, $m) ; # string -> unsigned char values
print "coded: @m\n" ;
# n = pq -- in RSA, p & q = prime, each 1024 bits/308 digits long
my $p = PARI("prime(".int(rand 50).")") ;
my $q = PARI("prime(".int(rand 50).")") ;
my $n = $p*$q ; # $n = Pari's obj
# choose a random number r, s.t.
# 1 < r < (p-1)(q-1) = b
# gcd(r, b) = 1 -- relative prime
my $b = ($p-1)*($q-1) ;
my $r ;
do {$r = int rand $b ; } until (gcd($r,$b) == 1) ;
$r = PARI $r ;
# rk = 1 mod (p-1)(q-1) -- k = private key; (n, r) public
my $k = (1/$r)%$b ; # Pari's math operators, since vars = Pari
# encrypt -- c = (m ^ r) mod n
my @c ;
map { $c[$_] = ($m[$_]**$r)%$n } 0..$#m ; # Perl: ** for power
print "ciphered: @c\n" ;
# decrypt -- m = (c ^ k) mod n
my @d ;
map { $d[$_] = PARI("($c[$_]^$k)%$n") } 0..$#c ; # Pari: ^ for power
print "deciphered: @d\n" ;
print "decoded: " . pack($tmpl, @d) . "\n" ;
__END__
original: Perl
coded: 80 101 114 108
ciphered: 18431 6512 5843 7236
deciphered: 80 101 114 108
decoded: Perl
Sometimes if somehow you can't or don't have direct mechanism to interface
between Perl and your external math software but your software can be operated
through command line, you can always try interfacing through command line.
Here's an example for Perl and R.
#! /usr/local/bin/perl -w
use strict ;
R("getwd()");
sub R {
my $Rpath = "C:\\R\\rw\\bin\\" ;
my $Rcmd = $Rpath . "rterm --vanilla --quiet --slave" ;
my $Rscript = shift ;
$Rscript =~ s/(\r|;\r)/ ;/gm ;
$Rscript =~ s/<-/=/gm ; # \r or <- will break "echo"
return `echo $Rscript | $Rcmd` ;
}
Or if you only have the DOS LP Solve executable, you can do:
my $dir = 'D:\\tmp\\prog\\math\\lp32';
my $lp_solve = "d:\\mydir\\lp_solve.exe";
my $lpfile = "d:\\mydir\\model.lp";
(my $lp = << " EOF") =~ s/^\s+//gm;
min: 8 x1 + 15 x2 ;
c1: 10 x1 + 21 x2 > 156 ;
c2: 2 x1 + x2 > 22 ;
EOF
open OUTFILE, "+>$lpfile";
print OUTFILE $lp;
close OUTFILE;
$output = qx($lp_solve < $lpfile);
$output =~ s/\r//gm;
print $lp;
print $output;
system("del $lpfile");
That's why Perl is called glue language.
Benchmarks
When comes to math, people often care about speed. So, let's benchmark.
use strict;
use warnings;
my ($a1,$a2,$a3,$ref);
for my $x (0..20) {
for my $y (0..20) {
$ref->[$x][$y] = rand 100;
}
}
use Math::Cephes::Matrix qw(mat);
$a1 = mat $ref;
use Math::MatrixReal;
$a2 = Math::MatrixReal->new_from_rows( $ref );
use PDL;
use PDL::Slatec;
$a3 = pdl $ref;
use Benchmark qw(cmpthese);
cmpthese(100,
{
cephes=>sub{$a1->inv()},
matrixreal=>sub{$a2->inverse()},
pdl=>sub{matinv($a3)}
}
);
__END__
Rate matrixreal cephes pdl
matrixreal 5.28/s -- -94% -99%
cephes 83.3/s 1479% -- -87%
pdl 625/s 11744% 650% --
Notice for any math modules using compiled code, your mileage can be greatly
affected by how you compile your module.
Here you can find benchmark
for Octave and R along with other math packages.
Sample Usage Scenarios
In case you're not sure how Perl and other math modules and software are
supposed to be used together in a project, here are a couple of generic
illustrations (not rules).
|
Perl |
Math Module / Software |
Remarks |
Data Storage |
N/A |
N/A |
Many people store their data in
spreadsheet. Better way will be to store data in database, such as Oracle,
MySQL, MS SQL, etc. This part doesn't have much to do with Perl and the
math module and software per se. |
Data Access |
YES |
Maybe |
This part means "insert,"
"update," and "delete" data. You will probably be
using DBI for that. If your math software can do it directly and
you don't need to do much Data Manipulation stuff below, you can possibly
leave Perl out of the picture here. |
Data Manipulation |
YES |
Maybe |
Data manipulation means getting the
raw data into the right format for Data Analysis, such as extracting a
series of dates from a log file, turning some poorly formatted text file
into a CSV file, etc. When the task gets tricky, Perl is often a better
choice than doing it in a spreadsheet or a math software. That also means
you will probably be using a lot of Regular Expressions. |
Data Analysis |
Maybe |
YES |
Perl doesn't have the speed compared
to C or clean syntax compared to Octave or extensive math modules to do
heavy duty math. Data analysis is often best handled by external
specialized software. But of course, if you use Perl for everything else
and the math stuff is relatively simple, "outsourcing" the math
work could be overkill and counterproductive. |
Reporting |
YES |
Maybe |
Some database vendor (such as Oracle)
or math/statistical software (SAS, for instance) has "report
generator." If such a report generator can readily provide what you
need, no reasons why you shouldn't use it. But when heavy twisting is
needed to get what you want out of a report generator, Perl may be a
viable alternative.
Often you can use Perl to create a report in HTML (which
is usually easier than, say, PDF) and convert the HTML to whatever other
format you want (such as Word or PDF).
Some people may ask what about XML. Well, XML is for
data representation, not visual representation. In terms of development
and design process, XML is probably a consideration closer to Data Storage
than Reporting, i.e. something you should consider and design much earlier
on before you think about reporting.
|
An increasing common scenario is where people fetch their data (in form of
XML, let's say) over the Internet or some network and then analyze the data upon
download (stock market or DNA data, for example).
|
Perl |
Math Module / Software |
What may be used (merely
listing possibilities, not recommendation) |
XML Fetch |
YES |
Maybe |
LWP, XML::libXML |
Data Manipulation |
YES |
Maybe |
Regular expressions |
Data Analysis |
Maybe |
YES |
Modules/software aforementioned |
Reporting |
YES |
Maybe |
Text::Template, HTML::Template, XML::libXSLT |
That's all for now. Hope it helps.
__________________
Update: Additional example added to Syntax Comparison.
-
Are you posting in the right place? Check out Where do I post X? to know for sure.
-
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big>
<blockquote> <br /> <dd>
<dl> <dt> <em> <font>
<h1> <h2> <h3> <h4>
<h5> <h6> <hr /> <i>
<li> <nbsp> <ol> <p>
<small> <strike> <strong>
<sub> <sup> <table>
<td> <th> <tr> <tt>
<u> <ul>
-
Snippets of code should be wrapped in
<code> tags not
<pre> tags. In fact, <pre>
tags should generally be avoided. If they must
be used, extreme care should be
taken to ensure that their contents do not
have long lines (<70 chars), in order to prevent
horizontal scrolling (and possible janitor
intervention).
-
Want more info? How to link
or How to display code and escape characters
are good places to start.
|
|