http://qs321.pair.com?node_id=11115238


in reply to Re^2: Optimizing with Caching vs. Parallelizing (MCE::Map) (Better caching?)
in thread Optimizing with Caching vs. Parallelizing (MCE::Map)

Hi marioroy,

I think I'm having an issue with MCE on Windows here, timing your code (collatz_vr):

use Time::HiRes 'time';
my $t = time;

# the whole script here

say time - $t;
MCE::Flow->finish;
say time - $t;

__END__

1 worker:
6.01482510566711
7.76495599746704

2 workers:
4.12953305244446
7.07751798629761

4 workers:
3.33010196685791
8.4802930355072

The 1st measurement approximately matches your output, but 1.5 - 2 seconds per worker to shut down doesn't look OK to me.

For vr's demo, every worker starts with an empty cache. Meaning that workers do not have cached results from prior chunks. This is the reason not scaling as well versus the non-cache demonstrations

In other words, lots and lots of work is needlessly duplicated.

_cache_collatz( $_ ) for 1 .. 1e6;
say scalar %cache;

%cache = ();

_cache_collatz( $_ ) for 1 + 4e5 .. 1e6;
say scalar %cache;

__END__

2168611
2168611

So, effectively, in a situation with e.g. 10 workers, the 4 junior workers, filling the cache for ranges up to 400_000, are free to slack off and not gather their results at all, and there's a large amount of overlap in the work of the 6 senior workers, too. For now, I have no solid idea how to parallelize this algorithm efficiently.
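For anyone who wants to reproduce the overlap without the full script, here is a minimal self-contained sketch at a smaller scale. The sub cache_collatz_sketch is a hypothetical stand-in, not the _cache_collatz from the thread, and the limits (100_000 and 40_000) are assumptions chosen only to keep the run short.

```perl
use strict;
use warnings;
use feature 'say';

# Shared cache: number => Collatz sequence length. Seeded with the base case.
my %cache = ( 1 => 1 );

# Hypothetical stand-in for _cache_collatz: follow the sequence until a
# cached value is reached, then back-fill every value visited on the way.
sub cache_collatz_sketch {
    my ($n) = @_;
    my @path;
    while ( !exists $cache{$n} ) {
        push @path, $n;
        $n = $n % 2 ? 3 * $n + 1 : $n >> 1;
    }
    my $len = $cache{$n};
    $cache{ pop @path } = ++$len while @path;
    return $len;
}

my $limit = 100_000;    # assumed, smaller than the 1e6 used in the thread
my $split = 40_000;     # assumed, same 40% split as 4e5 out of 1e6

# Pass 1: the whole range.
cache_collatz_sketch($_) for 1 .. $limit;
my $full = keys %cache;

# Pass 2: empty cache, upper part of the range only.
%cache = ( 1 => 1 );
cache_collatz_sketch($_) for $split + 1 .. $limit;
my $upper_only = keys %cache;

say "full range:  $full";
say "upper range: $upper_only";
```

Both counts come out equal, because every n up to the split has some doubling n * 2**k inside the upper range, and the sequence of that doubling passes through n. So the lower range contributes nothing to the cache that the upper range doesn't rebuild on its own.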