Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??

Every now and then, someone asked about Perl and math. Sometimes some Perl users asked how they could use Perl to do more math stuff, or folks from other fields asked how Perl might complement their math work. Here we'll represent a few handy references, comparisons and illustrations on a few math modules and software useful to Perl. It serves more as a brief overview of sample of math  modules and software, nothing comprehensive or exhaustive. One glaring omission is bioperl and bioinformatics, which is too diverse and too huge a field to be considered as just "math."

Normally, if you search CPAN for Perl math modules. You probably should look under Math::* or Statistics::*, along with AI::*, Algorithm::*, Crypt::*, Date::*, Graph::*, GraphViz::*, Inline::*, etc. The GNU Project is also a good place to look for external software useable by Perl.

Math Modules & Software

Here is a sample of modules and software useful to do math with Perl.

Module/Software Type Availability Remarks
Inline::C General Programming. Gateway to other C stuff Any OS Probably not your first choice to do math in Perl
Inline::Octave & Octave Matrix algebra and numerical analysis Inline::Octave not avail. from ActiveState. Octave avail. for any OS Open source version of Matlab. Fast. Simple syntax. Octave can be used interactively or as a scripting language. Other commercial counterparts: Gauss, APL, etc.
Math::Cephes General engineering math library Any OS Eigensystem only for real symmetric matrices
Math::LP Linear programming Not avail. for ActiveState. DOS version LP Solve avail. (somewhere) Free. Not sure if LP Solve itself is still actively maintained. Commercial counterparts: CPLEX, Lindo, Minos, AMPL, etc.
Math::Pari Number theory Any OS Good for cryptographic analysis
Math::MatrixReal Matrix algebra (for real values only) Any OS Pure Perl implementation. Eigensystem only for real symmetric matrices
PDL Perl Data Language 2.4.0 latest. 2.3.1 binary for Windows Gives Perl more math-oriented syntax
R Statistical software Any OS Used interactively or as a scripting language. Very versatile for plotting (you can see sample plots in this report). Latest version 1.7.1. Open source version of Splus. Bi-directional interface for *nix & COM Server interface for Windows (for R v.1.7.0) avail. from Omegahat. Other commerical counterparts: SPSS, SAS, etc.

Syntax Comparison

It may be informative to compare the mathematical syntax between modules and software. The following compares the syntax to multiple a 2-by-2 matrix with a vector.

Module/Software Code
Math::Cephes
use Math::Cephes::Matrix qw(mat); $a = mat [[1, 2], [3, 4]]; $b = [1, 1]; print "@{$a->mul($b)}"; # print "3 7"
Math::RealMatrix
use Math::MatrixReal; $a = Math::MatrixReal->new_from_rows( [[1, 2], [3, 4]] ); $b = Math::MatrixReal->new_from_cols( [ [1, 1] ] ); print $a*$b; # print "[ 3.000000000000E+000 ] # [ 7.000000000000E+000 ]"
PDL
use PDL; $a = pdl [[1, 2],[3, 4]]; $b = pdl [[1], [1]]; print $a x $b; # print "[ # [3] # [7] # ]"
Octave
>> a = [1, 2; 3, 4]; >> b = [1; 1]; >> a*b ans = 3 7
R
> a = matrix(c(1,2,3,4), ncol=2, byrow=T) > b = matrix(c(1,1), ncol=1) > a%*%b [,1] [1,] 3 [2,] 7

On the math alone, the syntax of all the modules and software above seems fairly clean. Octave and R have much easier to read overall language syntax compared to Perl, however.

A distinguished feature of many math-oriented languages are their "vectorized operation" and "subscripting" capability. Take extracting a sub matrix as an example:

Module/Software Code
Math::Cephes
use Math::Cephes::Matrix qw(mat); $mat = mat [[1, 2, 3], [4, 5, 6], [7, 8, 9]]; $mat = $mat->coef; for my $i (1..2) { # 0 first index print "@{$mat->[$i]}[1..2]\n"; } # print "5 6 # 8 9"
Math::RealMatrix
use Math::MatrixReal; $mat = Math::MatrixReal->new_from_rows( [[1, 2, 3], [4, 5, 6], [7, 8, 9]] ); for my $i (2..3) { # 1 first index for my $j (2..3) { print $mat->element($i, $j), " "; } print "\n"; } # print "5 6 # 8 9"
PDL
use PDL; $mat = pdl [[1, 2, 3], [4, 5, 6], [7, 8, 9]]; print $mat->slice("1:2,1:2"); # 0 first index # print "[ # [5 6] # [8 9] # ]"
Octave
mat = [1,2,3; 4,5,6; 7,8,9]; mat(2:3, 2:3) ans = 5 6 8 9
R
> mat = matrix(1:9, ncol=3, byrow=T) > mat[2:3, 2:3] [,1] [,2] [1,] 5 6 [2,] 8 9

Vectorized operation is a very convenient feature. Consider the following R's code:

> vec = 1:10 > vec [1] 1 2 3 4 5 6 7 8 9 10 > vec %% 2 [1] 1 0 1 0 1 0 1 0 1 0 > vec[vec %% 2 == 1] [1] 1 3 5 7 9 > vec[vec %% 2 == 1] + 1 [1] 2 4 6 8 10

Since Pari was mentioned and it's somewhat a class of its own, here's an example of Pari in its own context:

#! /usr/local/bin/perl -w use strict; # ----------------------------------------------------------- # RSA algorithm -- assymetrical\public-key cryptography # ----------------------------------------------------------- use Math::Pari qw(gcd PARI) ; # ----------------------------------------------------------- # m -- message my $m = 'Perl' ; print "original: $m\n" ; my $tmpl = 'C*' ; my @m = unpack($tmpl, $m) ; # string -> unsigned char values print "coded: @m\n" ; # n = pq -- in RSA, p & q = prime, each 1024 bits/308 digits long my $p = PARI("prime(".int(rand 50).")") ; my $q = PARI("prime(".int(rand 50).")") ; my $n = $p*$q ; # $n = Pari's obj # choose a random number r, s.t. # 1 < r < (p-1)(q-1) = b # gcd(r, b) = 1 -- relative prime my $b = ($p-1)*($q-1) ; my $r ; do {$r = int rand $b ; } until (gcd($r,$b) == 1) ; $r = PARI $r ; # rk = 1 mod (p-1)(q-1) -- k = private key; (n, r) public my $k = (1/$r)%$b ; # Pari's math operators, since vars = Pari # encrypt -- c = (m ^ r) mod n my @c ; map { $c[$_] = ($m[$_]**$r)%$n } 0..$#m ; # Perl: ** for power print "ciphered: @c\n" ; # decrypt -- m = (c ^ k) mod n my @d ; map { $d[$_] = PARI("($c[$_]^$k)%$n") } 0..$#c ; # Pari: ^ for power print "deciphered: @d\n" ; print "decoded: " . pack($tmpl, @d) . "\n" ; __END__ original: Perl coded: 80 101 114 108 ciphered: 18431 6512 5843 7236 deciphered: 80 101 114 108 decoded: Perl

Sometimes if somehow you can't or don't have direct mechanism to interface between Perl and your external math software but your software can be operated through command line, you can always try interfacing through command line. Here's an example for Perl and R.

#! /usr/local/bin/perl -w use strict ; R("getwd()"); sub R { my $Rpath = "C:\\R\\rw\\bin\\" ; my $Rcmd = $Rpath . "rterm --vanilla --quiet --slave" ; my $Rscript = shift ; $Rscript =~ s/(\r|;\r)/ ;/gm ; $Rscript =~ s/<-/=/gm ; # \r or <- will break "echo" return `echo $Rscript | $Rcmd` ; }

Or if you only have the DOS LP Solve executable, you can do:

my $dir = 'D:\\tmp\\prog\\math\\lp32'; my $lp_solve = "d:\\mydir\\lp_solve.exe"; my $lpfile = "d:\\mydir\\model.lp"; (my $lp = << " EOF") =~ s/^\s+//gm; min: 8 x1 + 15 x2 ; c1: 10 x1 + 21 x2 > 156 ; c2: 2 x1 + x2 > 22 ; EOF open OUTFILE, "+>$lpfile"; print OUTFILE $lp; close OUTFILE; $output = qx($lp_solve < $lpfile); $output =~ s/\r//gm; print $lp; print $output; system("del $lpfile");

That's why Perl is called glue language.

Benchmarks

When comes to math, people often care about speed. So, let's benchmark.

use strict; use warnings; my ($a1,$a2,$a3,$ref); for my $x (0..20) { for my $y (0..20) { $ref->[$x][$y] = rand 100; } } use Math::Cephes::Matrix qw(mat); $a1 = mat $ref; use Math::MatrixReal; $a2 = Math::MatrixReal->new_from_rows( $ref ); use PDL; use PDL::Slatec; $a3 = pdl $ref; use Benchmark qw(cmpthese); cmpthese(100, { cephes=>sub{$a1->inv()}, matrixreal=>sub{$a2->inverse()}, pdl=>sub{matinv($a3)} } ); __END__ Rate matrixreal cephes pdl matrixreal 5.28/s -- -94% -99% cephes 83.3/s 1479% -- -87% pdl 625/s 11744% 650% --

Notice for any math modules using compiled code, your mileage can be greatly affected by how you compile your module.

Here you can find benchmark for Octave and R along with other math packages.

Sample Usage Scenarios

In case you're not sure how Perl and other math modules and software are supposed to be used together in a project, here are a couple of generic illustrations (not rules).

  Perl Math Module / Software Remarks
Data Storage

N/A

N/A

Many people store their data in spreadsheet. Better way will be to store data in database, such as Oracle, MySQL, MS SQL, etc. This part doesn't have much to do with Perl and the math module and software per se.

Data Access YES Maybe  This part means "insert," "update," and "delete" data. You will probably be using DBI for that. If your math software can do it directly and you don't need to do much Data Manipulation stuff below, you can possibly leave Perl out of the picture here.
Data Manipulation YES Maybe  Data manipulation means getting the raw data into the right format for Data Analysis, such as extracting a series of dates from a log file, turning some poorly formatted text file into a CSV file, etc. When the task gets tricky, Perl is often a better choice than doing it in a spreadsheet or a math software. That also means you will probably be using a lot of Regular Expressions.
Data Analysis Maybe YES  Perl doesn't have the speed compared to C or clean syntax compared to Octave or extensive math modules to do heavy duty math. Data analysis is often best handled by external specialized software. But of course, if you use Perl for everything else and the math stuff is relatively simple, "outsourcing" the math work could be overkill and counterproductive.
Reporting YES Maybe  Some database vendor (such as Oracle) or math/statistical software (SAS, for instance) has "report generator." If such a report generator can readily provide what you need, no reasons why you shouldn't use it. But when heavy twisting is needed to get what you want out of a report generator, Perl may be a viable alternative. 

Often you can use Perl to create a report in HTML (which is usually easier than, say, PDF) and convert the HTML to whatever other format you want (such as Word or PDF).

Some people may ask what about XML. Well, XML is for data representation, not visual representation. In terms of development and design process, XML is probably a consideration closer to Data Storage than Reporting, i.e. something you should consider and design much earlier on before you think about reporting.

An increasing common scenario is where people fetch their data (in form of XML, let's say) over the Internet or some network and then analyze the data upon download (stock market or DNA data, for example).

  Perl Math Module / Software What may be used
(merely listing possibilities, not recommendation)
XML Fetch YES Maybe

 LWP, XML::libXML

Data Manipulation YES Maybe  Regular expressions
Data Analysis Maybe YES  Modules/software aforementioned
Reporting YES Maybe &nbsp;Text::Template, HTML::Template, XML::libXSLT

That's all for now. Hope it helps.


__________________
Update: Additional example added to Syntax Comparison.


In reply to Perl & Math: A Quick Reference by chunlou

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (5)
As of 2024-03-29 13:58 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found