Re^4: RFC: 100 PDL Exercises (ported from numpy)

by marioroy (Vicar)
on Sep 03, 2019 at 05:25 UTC

in reply to Re^3: RFC: 100 PDL Exercises (ported from numpy)
in thread RFC: 100 PDL Exercises (ported from numpy)

Here is the same thing using MCE. Workers obtain the next sequence number without involving the manager process, which is why it runs faster. I had to think about it when I saw the run time.

use strict;
use warnings;
use feature 'say';

use PDL;  # must load PDL before MCE::Shared

use MCE 1.847;
use MCE::Shared 1.847;
use Time::HiRes 'time';

srand( 123 );

my $time = time;

my $n = 30000;      # input sample size
my $m = 10000;      # number of bootstrap repeats
my $r = $n;         # re-sample size

# On Windows, the non-shared piddle ($x) is unblessed in threads.
# Therefore, constructing the piddle inside the worker. UNIX
# platforms benefit from copy-on-write. Thus, one copy.

my $x   = ( $^O eq 'MSWin32' ) ? undef : random( $n );
my $avg = MCE::Shared->pdl_zeroes( $m );

MCE->new(
   max_workers => 4,
   sequence    => [ 0, $m - 1 ],
   chunk_size  => 1,
   user_begin  => sub {
      $x = random( $n ) unless ( defined $x );
   },
   user_func   => sub {
      my $idx = random $r;
      $idx *= $n;
      # $avg is a shared piddle which resides inside the shared-
      # manager process or thread. The piddle is accessible via the
      # OO interface only.
      $avg->set( $_, $x->index( $idx )->avg );
   }
)->run;

# MCE sets the seed of the base generator uniquely between workers.
# Unfortunately, it requires running with one worker for predictable
# results (i.e. no guarantee in the order which worker computes the
# next input chunk).

say $avg->pctover( pdl 0.05, 0.95 );
say time - $time, ' seconds';

__END__

# Output

[0.49387106 0.4993768]
1.09556317329407 seconds
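For readers without PDL or MCE at hand, here is a minimal single-process sketch of the computation the script above parallelizes: a bootstrap of the mean, reported at the 5th and 95th percentiles. It uses only core Perl. The sizes are deliberately small, and `percentile` is a hypothetical nearest-rank helper, so the numbers will not match `pctover`, which may compute percentiles differently.

```perl
use strict;
use warnings;
use List::Util 'sum';

srand(123);

my $n = 200;    # input sample size (small, for illustration only)
my $m = 500;    # number of bootstrap repeats

# Input sample: $n uniform draws from [0, 1).
my @x = map { rand() } 1 .. $n;

# Each repeat resamples $n values with replacement and records the mean.
my @avg;
for ( 1 .. $m ) {
    my @resample = map { $x[ int rand $n ] } 1 .. $n;
    push @avg, sum(@resample) / $n;
}

# Hypothetical helper: nearest-rank percentile of an ascending-sorted list.
sub percentile {
    my ( $sorted, $p ) = @_;
    my $i = int( $p * $#$sorted + 0.5 );
    return $sorted->[$i];
}

my @sorted = sort { $a <=> $b } @avg;
my $lo = percentile( \@sorted, 0.05 );
my $hi = percentile( \@sorted, 0.95 );
printf "[%.6f %.6f]\n", $lo, $hi;
```

In the MCE version, the body of the `for` loop is what each worker runs per sequence number, and `@avg` corresponds to the shared piddle `$avg`.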

Thank you, vr. I had no idea that PDL random is not unique between threads. MCE already sets the seed of the base generator, but did not do so for workers spawned as threads. This is resolved in MCE 1.847.
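The general problem can be reproduced without PDL or MCE. The sketch below is an illustration only, not MCE's internals: it uses Perl's built-in `rand` and `fork` as stand-ins, assumes a Unix-like platform, and `first_draws` is a hypothetical helper. Children forked after a single `srand` in the parent inherit the same generator state and so produce the same first draw, unless each worker reseeds itself.

```perl
use strict;
use warnings;

# Hypothetical helper: fork $n children one at a time; each child sends
# its first rand() draw back through a pipe. When $reseed is true, the
# child reseeds itself with a value derived from its worker id.
sub first_draws {
    my ( $n, $reseed ) = @_;
    my @draws;
    for my $wid ( 1 .. $n ) {
        pipe( my $rd, my $wr ) or die "pipe: $!";
        my $pid = fork();
        die "fork: $!" unless defined $pid;
        if ( $pid == 0 ) {                      # child (worker)
            close $rd;
            srand( 123 + $wid ) if $reseed;     # per-worker seed
            print {$wr} rand(), "\n";
            close $wr;
            exit 0;
        }
        close $wr;                              # parent
        chomp( my $line = <$rd> );
        close $rd;
        waitpid( $pid, 0 );
        push @draws, $line;
    }
    return @draws;
}

srand(123);                           # seeded once, before any fork
my @same = first_draws( 3, 0 );       # workers inherit identical state
my @diff = first_draws( 3, 1 );       # workers reseed themselves

print "inherited: @same\n";
print "reseeded:  @diff\n";
```

The first line printed repeats one value three times; the second shows three different values. Note that reseeding per worker is also why results depend on which worker picks up which chunk.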

Regards, Mario

Node Type: note [id://11105498]
