Re^2: RFC: 100 PDL Exercises (ported from numpy) by vr (Curate)
on May 08, 2018 at 16:58 UTC
Hi, bliako, thank you so much for the detailed answer; my statistics skills were (hopefully) auto-vivified :). After following your links and code in earnest, I felt brave enough to run some experiments and write a comment, but in the process I discovered something strange ;).
First, my impression is that solutions to the exercises were supposed to be simple (as in KISS). So, translating the Python solution to PDL almost verbatim, the answer to #100 can be:
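A minimal sketch of what such a near-verbatim translation might look like follows. The names ($size, $reps, $x, $idx) and the sizes are assumptions chosen for illustration, and the 2.5% / 97.5% bounds follow the exercise's "95% confidence interval" wording; this is not necessarily the code as originally posted.

```perl
use strict;
use warnings;
use PDL;

my $size = 100;     # number of data points (assumed)
my $reps = 1000;    # number of re-samplings (assumed)

my $x   = random($size);                  # data: uniform Doubles in [0, 1)
my $idx = random($size, $reps) * $size;   # $size x $reps fractional indices in [0, $size)

# index() threads over the extra dimension of $idx, giving a
# $size x $reps matrix of re-sampled values; the indices are
# passed as Doubles, without an explicit floor.
my $means = $x->index($idx)->avgover;     # one mean per re-sampling

# 2.5th and 97.5th percentiles of the bootstrap means
my $ci = $means->pctover(pdl(0.025, 0.975));
print $ci, "\n";
```

Note that both $idx and the result of index() hold $size * $reps elements at once.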
Interestingly, PDL DWIMs for me here -- no need to floor an index to thread over a piddle (just as with Perl's array indices). I also stand corrected on "floor converts to Long in-place" -- it rounds in-place, but the piddle stays Double.
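The floor behaviour can be checked with a small sketch like the one below. The values are made up for illustration; the only claims it relies on are that floor works in place and that type() reports the piddle's datatype.

```perl
use strict;
use warnings;
use PDL;

my $i = pdl(1.7, 2.2, 3.9);    # illustrative fractional values
$i->inplace->floor;            # rounds down in place
print $i, " is of type ", $i->type, "\n";    # type is still double

my $j = $i->long;              # an explicit conversion returns a new piddle
print $j, " is of type ", $j->type, "\n";    # type is long

# Fractional indices passed straight to index(), no floor() beforehand
my $x      = sequence(5) * 10;
my $picked = $x->index(pdl(1.7, 3.2));
print $picked, "\n";
```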
This 'never explicitly loop in a vectorized language' answer unfortunately hides the ugly truth that for very large data we can end up with huge R x M matrices of random indices and equally huge (and equally unnecessary) matrices of all re-samplings, and thus die with 'Out of memory!'. For example, R = 10,000 re-samplings of M = 1,000,000 Doubles would need about 80 GB for the matrix of re-samplings alone.
I was experimenting with this and that (PDL's automatic parallelization, in particular), which I'm skipping for now, because what comes next is something weird.
Consider this version of the above, which avoids the 2-dimensional index matrix and the matrix of re-sampling results, but is still un-parallel:
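One way such a version could be written is sketched below, assuming a plain Perl loop over the re-samplings that keeps only the means. The names ($size, $reps, $means) and sizes are illustrative assumptions, not the code as originally posted.

```perl
use strict;
use warnings;
use PDL;

my $size = 1000;      # number of data points (assumed)
my $reps = 2000;      # number of re-samplings (assumed)

my $x     = random($size);
my $means = zeroes($reps);    # the only piddle whose size depends on $reps

for my $k (0 .. $reps - 1) {
    # One 1-dimensional index piddle per pass, discarded afterwards
    my $idx = random($size) * $size;
    $means->slice("($k)") .= $x->index($idx)->avg;
}

# 2.5th and 97.5th percentiles of the bootstrap means
my $ci = $means->pctover(pdl(0.025, 0.975));
print $ci, "\n";
```

Peak memory here grows with $size + $reps rather than $size * $reps, at the cost of $reps trips through the Perl-level loop.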
Next is a solution where I start trying to parallelize, but because of the selected parameters (a single thread) I'm not only expecting no gain: due to overhead it should be slower. And yet:
Why is that? :) I tried to insert
into the no-threads solution (does retrieve_pdls set any flags that speed things up? Nope.)