Re^3: [OT] Python vs R
by sundialsvc4 (Abbot)
on May 31, 2017 at 17:03 UTC
“No, it isn’t,” he said, with a very-patient smile.
Python and R are two entirely different tools, meant for entirely different purposes. Python is a general-purpose programming language. R is a special-purpose language designed expressly for statistics work. R is most commonly found being invoked within the context of traditional statistics packages such as SAS or SPSS, but it is sometimes also used to create entire statistical workflows.
On the one hand, R allows the statistical researcher to construct programs, instead of being limited [only ...] to the predefined, options-laden “bricks” of a traditional stats engine. (This is something that statisticians have ordinarily not been able to do.) On the other hand, the capacity of R is limited. Most of its processing is done in-memory, whereas most “bricks” use files (“data sets”) for input and output. So, you have to use other means to “boil down” the mass of data before feeding a slice of it to R. But the programmability of R can then save you many, many otherwise-cumbersome steps. (As one colleague quipped, “This is a helluva lot better than Another Brick in the Wall.”)
One common way to integrate the two types of tools, as I have noted, is to use R to “create a brick.” The R source-code appears in-line with the rest of it, or in an included file.
Python might take the place of the “bricks” of a traditional package, because it works much more easily with files and has the raw capacity that R (by design) does not. Python also has a powerful implementation of LISP-style lists. The two tools are therefore complementary, and are often used together.
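To make the “boil it down first” idea concrete, here is a minimal sketch of the kind of thing I mean. It is an illustration only, not the code we actually ran: the function name `boil_down`, the CSV layout, and the column names are all assumptions made up for the example. It streams a large file one record at a time and writes a small per-group summary that would fit comfortably in memory.

```python
import csv
from collections import defaultdict


def boil_down(src_path, dst_path, group_col, value_col):
    """Stream a large CSV row by row and write one summary row per group.

    Hypothetical helper: the name, arguments, and file layout are
    assumptions for illustration, not part of any real workflow.
    """
    counts = defaultdict(int)
    totals = defaultdict(float)

    # csv.DictReader yields one row at a time, so memory use depends on
    # the number of distinct groups, not on the number of input records.
    with open(src_path, newline="") as src:
        for row in csv.DictReader(src):
            key = row[group_col]
            counts[key] += 1
            totals[key] += float(row[value_col])

    # Write the reduced data set: group, record count, mean value.
    with open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow([group_col, "n", "mean_" + value_col])
        for key in sorted(counts):
            writer.writerow([key, counts[key], totals[key] / counts[key]])

    # Return the number of groups written.
    return len(counts)
```

The small output file is the “slice” that gets handed to R, where something like `read.csv()` can load it whole. The design choice is simply that the heavy, file-oriented pass happens outside R and only the reduced result goes in.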
If R is not yet among the languages you have experience with ... and it may well not be ... then you ought to be able to find a researcher in your company for whom the opposite is true. The two of you should then team up so that each of you is working in your own field of expertise.
- - -
Now, may I gently remind you all ... you can go ahead and pile on a hundred thousand down-votes if that’s how you get your jollies, but ... in this case, I do absolutely know what I am talking about, because I have done this. I served a group of statistical researchers, for the better part of a year, who were doing precisely this task, in fact using precisely these tools: SPSS, R, and Python. They were the R experts (but I learn fast ...), and I filled in the gaps to build out and then document the total workflow process. Everything that I have said here, therefore, comes from personal professional experience. The data volumes that we were dealing with were 21 major datasets with about 13 million records each.