PerlMonks  

Re: Any perlish hints about Kaplan-Meier Estimator?

by bliako (Monsignor)
on Nov 04, 2021 at 10:03 UTC ( [id://11138415]=note )


in reply to Any perlish hints about Kaplan-Meier Estimator?

I agree with Fletch:

This (math heavy statistics) is another of those heavy math areas where like the PDE question recently aren't really in perl's wheelhouse as far as off the shelf solutions go. 

If this is usually done in R (and there is indeed an R package for this sort of thing), leave it there and create a Perl wrapper around it. In the past I tried to re-implement in Perl algorithms that already existed in R, and soon hit a dead end. My experience was that one can implement a single algorithm fairly easily in Perl, but then what? You need to combine it with other metrics and algorithms. You will need significance tests to verify your results. And, very importantly, you will need to present the results graphically. R is very good at all of these, and most of them are implemented in C or Fortran for speed. Its graphics package, ggplot2, is one of the best available today. But it is like a lotus flower in a snake pit, the snake pit being the R programming environment.

What I ended up doing was to create bash wrappers: unix-like utilities that enclosed the procedures I needed at the time. (I can't remember why I did not write Perl wrappers using Statistics::R. I would definitely write Perl wrappers today, though I have not tried Statistics::R extensively.) Each wrapper provided a standard CLI and created an R script on the fly, possibly from a template, to do what I wanted and produce some results: plots, CSV files, HTML tables. The generated script could be quite complex, in that it combined several of these algorithms in R and exchanged R data structures between them; the latter is very important.
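A minimal sketch of what such a wrapper might look like. This is not the original script: the function names, the -n dry-run flag, the input columns time and status, and the choice of survfit from R's survival package are all assumptions made for illustration.

```shell
#!/bin/sh
# km-wrapper.sh (hypothetical name): build an R script from a template and run it.

# Print an R script to stdout; the three arguments fill the template's placeholders.
# Note: file names are pasted in as-is, so names containing quotes would break it.
generate_r_script() {
    cat <<EOF
library(survival)              # assumption: the survival package is installed
options(mc.cores = $3)         # default core count for parallel::mclapply
d   <- read.csv("$1")          # assumption: columns named time and status
fit <- survfit(Surv(time, status) ~ 1, data = d)
s   <- summary(fit)
write.csv(data.frame(time = s\$time, n_risk = s\$n.risk,
                     n_event = s\$n.event, survival = s\$surv),
          file = "$2", row.names = FALSE)
EOF
}

main() {
    local input="" output="km.csv" threads=1 dry_run=0 opt script status=0
    OPTIND=1   # reset so that main can be called more than once
    while getopts "i:o:p:n" opt; do
        case $opt in
            i) input=$OPTARG ;;
            o) output=$OPTARG ;;
            p) threads=$OPTARG ;;
            n) dry_run=1 ;;
            *) echo "usage: $0 -i in.csv [-o out.csv] [-p num-threads] [-n]" >&2
               return 2 ;;
        esac
    done
    if [ -z "$input" ]; then
        echo "missing -i in.csv" >&2
        return 2
    fi

    script=$(mktemp) || return 1
    generate_r_script "$input" "$output" "$threads" > "$script"
    if [ "$dry_run" -eq 1 ]; then
        cat "$script"                  # only show the generated R script
    else
        Rscript "$script" || status=$? # hand the real work over to R
    fi
    rm -f "$script"
    return "$status"
}

# Run only when arguments are given, so the file can also be sourced for its functions.
if [ "$#" -gt 0 ]; then main "$@"; fi
```

Something like `sh km-wrapper.sh -i data.csv -o km.csv -p 4` would then run the analysis, while adding -n only prints the generated R script, which is handy for checking the template before R gets involved.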

In doing the above I found the best of both worlds. Although I would love to have the wealth of CRAN in Perl, it's not "shameful" to use the right tool for the job. And always think big: the project may grow well beyond the language's limits.

15min Edit: an added bonus of working in R is that it has a quite good and easy framework for parallelising things. All my scripts have a -p num-threads CLI option.

bw, bliako
