Re: Optimizing with Caching vs. Parallelizing (MCE::Map)

Replies are listed 'Best First'.
Re^2: Optimizing with Caching vs. Parallelizing (MCE::Map) by marioroy (Prior) on Apr 06, 2020 at 15:52 UTC
Hi choroba, I tried using multiple cores for your demonstration. Perl has this capability. So little effort and practically transparent. :) `#!/usr/bin/perl use warnings; use strict; use feature qw{ say }; use MCE::Util; use MCE::Map max_workers => MCE::Util::get_ncpu(); sub collatz { my ($start) = @_; my @seq = $start; push @seq, ($seq[-1] / 2, 3 * $seq[-1] + 1)[$seq[-1] % 2] while $seq[-1] != 1; return @seq } my @sizes = mce_map_s { [$_, scalar collatz($_)] } 1, 1e6; say "@$_" for reverse +(sort { $b->[1] <=> $a->[1] } @sizes)[0 .. 19];` [download] Before and after on a 6 core machine with hyper-threading - 12 logical cores. `Serial 48.170 seconds Parallel 9.361 seconds` [download] Regards, Mario	[reply] [d/l] [select]
Re^3: Optimizing with Caching vs. Parallelizing (MCE::Map) by choroba (Cardinal) on Apr 06, 2020 at 16:23 UTC
Interesting! Does mce_map_s return the elements in the corresponding order? As you can see, it's not needed here, so if there's a faster method that doesn't care about the order, we can also switch to that. `map{substr$_->[0],$_->[1]\|\|0,1}[\\|\|{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^ARGV,3]`	[reply] [d/l]
Re^4: Optimizing with Caching vs. Parallelizing (MCE::Map) by marioroy (Prior) on Apr 06, 2020 at 18:19 UTC
Yes, MCE::Map returns the elements in the corresponding order, similarly to map. Actually, there is not much going on for the manager process - very efficient and not to worry about MCE::Map having to preserve order. I compared to MCE::Loop (not ordered) and sometimes MCE::Map is faster other times MCE::Loop. No clear winner. Here is the version using MCE::Loop. `#!/usr/bin/perl use warnings; use strict; use feature qw{ say }; use MCE::Util; use MCE::Loop max_workers => MCE::Util::get_ncpu(); sub collatz { my ($start) = @_; my @seq = $start; push @seq, ($seq[-1] / 2, 3 * $seq[-1] + 1)[$seq[-1] % 2] while $seq[-1] != 1; return @seq } my @sizes = mce_loop_s { my @results; push @results, [$_, scalar collatz($_)] for @{ $_[1] }; MCE->gather(@results); } 1, 1e6; say "@$_" for reverse +(sort { $b->[1] <=> $a->[1] } @sizes)[0 .. 19];` [download] `MCE::Map 9.392 seconds MCE::Loop 9.396 seconds` [download] Regards, Mario	[reply] [d/l] [select]
Re^4: Optimizing with Caching vs. Parallelizing (MCE::Map) (Updated) by 1nickt (Canon) on Apr 06, 2020 at 18:49 UTC
Hi choroba, The three MCE::Map exported functions, `mce_map`, `mce_map_f` and `mce_map_s` all return orderly output (since they mimic, well, `map`). (You can also get ordered output from a non-`map` function using an orderly `gather()` function such as the ones provided in MCE::Candy.) You could implement a parallelized version of your code without orderly output just using MCE::Flow: use warnings; use strict; use feature 'say'; use MCE::Util; use MCE::Flow; sub collatz { my ($start) = @_; my @seq = $start; push @seq, ($seq[-1] / 2, 3 * $seq[-1] + 1)[$seq[-1] % 2] while $seq[-1] != 1; return @seq; } my @sizes; mce_flow_s { max_workers => MCE::Util::get_ncpu(), bounds_only => 1, gather => \@sizes, }, sub { my ($mce, $chunk, $chunk_id ) = @_; my ($start, $end) = @$chunk; my @chunk_sizes; push @chunk_sizes, [$_, scalar collatz($_)] for $start .. $end; MCE->gather( @chunk_sizes ); }, 1, 1e6; say "@$_" for reverse +(sort { $b->[1] <=> $a->[1] } @sizes)[0..19]; [download] This runs a tiny bit faster on my system, as you might expect. But it's more code than using MCE::Map, as you can see. Update: But with a non-ordered input sequence, such as `keys %hash`, or where processing time for each element is likely to vary, this may be a more significant factor. Also note that you can call `gather()` multiple times from within the user sub being processed by `MCE`, without returning, so you can view or handle the output as it is produced (by specifying a callback for your gatherer). And then there's MCE::Stream ... `;-)` `$ time perl ch-map.pl 922525 445 922524 445 906175 445 886953 445 615017 447 410011 449 820023 450 820022 450 818943 450 546681 452 970599 458 796095 468 767903 468 511935 470 927003 476 910107 476 704623 504 939497 507 626331 509 837799 525 real 0m4.843s user 0m47.423s sys 0m0.265s` [download] `time perl ch-flow.pl 922525 445 922524 445 906175 445 886953 445 615017 447 410011 449 820023 450 820022 450 818943 450 546681 452 970599 458 796095 468 767903 468 511935 470 927003 476 910107 476 704623 504 939497 507 626331 509 837799 525 real 0m4.661s user 0m47.057s sys 0m0.255s` [download] Hope this is of interest! The way forward always starts with a minimal test.	[reply] [d/l] [select]
Re^5: Optimizing with Caching vs. Parallelizing (MCE::Map) (Updated) by marioroy (Prior) on Apr 07, 2020 at 00:48 UTC


Think about Loose Coupling
	PerlMonks