Re^2: Create parallel database handles... (MCE::Loop)

Re^3: Create parallel database handles... (MCE::Loop) by 1nickt (Canon) on Apr 14, 2020 at 03:13 UTC
Hi again perlygapes,

The MCE::Loop code just abstracts away all your Parallel::ForkManager logic and improves on it, just as Parallel::ForkManager abstracts away and improves on some of the tedious manual work of using `fork()` directly. Note how the logic is encapsulated in a sub just like in your code, only with less concurrency boilerplate.

"using a separate DB connection instead for each child feels intuitively right"

I agree. The code I shared keeps a connection open for each child, and each child stays alive and handles multiple jobs from the job list as managed by MCE.

Here's a simpler example I've shared recently showing how to parallelize existing code that makes a series of HTTP requests. How would you do the same using P::FM?

single process

    use strict; use warnings; use 5.010;
    use Data::Dumper;
    use HTTP::Tiny;
    use Time::HiRes 'gettimeofday', 'tv_interval';

    my $ua = HTTP::Tiny->new( timeout => 10 );

    my @urls = qw<
        gap.com amazon.com ebay.com lego.com wunderground.com
        imdb.com underarmour.com disney.com espn.com dailymail.com
    >;

    my %report;

    foreach ( @urls ) {
        my $start = [gettimeofday];
        $ua->get('https://' . $_);
        # elapsed seconds for this request
        $report{$_} = tv_interval($start, [gettimeofday]);
    }

    say Dumper \%report;

six processes (workers stay alive, looping through the list, writing to a shared hash) (one added line, two slightly changed lines)

    use strict; use warnings; use 5.010;
    use Data::Dumper;
    use HTTP::Tiny;
    use Time::HiRes 'gettimeofday', 'tv_interval';
    use MCE;
    use MCE::Shared;

    my $ua = HTTP::Tiny->new( timeout => 10 );

    my @urls = qw<
        gap.com amazon.com ebay.com lego.com wunderground.com
        imdb.com underarmour.com disney.com espn.com dailymail.com
    >;

    # hash shared between the manager process and all workers
    my $report = MCE::Shared->hash;

    MCE->new( max_workers => 6 )->foreach( \@urls, sub {
        my $start = [gettimeofday];
        $ua->get('https://' . $_);
        $report->set( $_, tv_interval($start, [gettimeofday]) );
    });

    say Dumper $report->export;

Update: fixed error in first demo code, ++choroba

Hope this helps!

The way forward always starts with a minimal test.
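To make the "one connection per child, many jobs per child" idea concrete, here is a minimal sketch using MCE::Loop's `user_begin` hook. It is not the code from the earlier post. The `$conn` hash is a hypothetical stand-in for a real database handle (in practice something like a `DBI->connect(...)` call would go there), and the job list `1 .. 12` is made up for illustration.

```perl
use strict;
use warnings;
use MCE::Loop;

# Per-worker "connection". Hypothetical stand-in for a DBI handle:
# each worker gets its own copy, created once in user_begin.
my $conn;

MCE::Loop->init(
    max_workers => 3,
    chunk_size  => 1,
    user_begin  => sub {
        # Runs once in each worker, before it processes any job.
        # Real code would open the database connection here.
        $conn = { worker => MCE->wid, jobs => 0 };
    },
);

# Each job reports which worker (and so which "connection") handled it.
my %handled_by = mce_loop {
    my ( $mce, $chunk_ref, $chunk_id ) = @_;
    my $job = $chunk_ref->[0];
    $conn->{jobs}++;                        # the same $conn is reused across jobs
    MCE->gather( $job, $conn->{worker} );
} 1 .. 12;

MCE::Loop->finish;

print "job $_ handled by worker $handled_by{$_}\n"
    for sort { $a <=> $b } keys %handled_by;
```

Twelve jobs are spread over at most three workers, so each worker's `$conn` is created once and reused for several jobs. Which worker picks up which job varies from run to run.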
Re^4: Create parallel database handles... (MCE::Loop) by perlygapes (Sexton) on May 08, 2020 at 06:10 UTC
Something I just realised I neglected to mention in my example is that I need to apply CPU affinity in the script. That is, I need to be able to specify that 'worker 1' MUST use CPU0, 'worker 2' MUST use CPU1, etc.

This is because I need to have another parallel code block where each worker launches an external single-threaded executable that will be accessing another DB and writing results to a third DB, but these MUST NOT access and write to the same table at the same time. This affinity is in essence to avoid access conflicts/violations.

How can this be done in MCE? Thanks again.
Re^5: Create parallel database handles... (MCE::Loop) by 1nickt (Canon) on May 08, 2020 at 16:44 UTC
Hi again,

Now that's a classic XY problem statement! One usually gets better help by asking how to achieve the goal, not how to implement the technique one has already decided is the way to achieve it ;-)

I can think of no reason why one should ever have to concern oneself with which CPU core is used by a given worker. You should be able to write a program where you don't even have to concern yourself with workers.

From your problem description, it sounds like you might need some kind of job queue. You can achieve this in many ways, but if you are already using MCE for parallelization, you can use MCE::Flow and MCE::Queue to handle enqueuing jobs based on the output of the first task handled by multiple workers. Look at the demo shown in the MCE::Flow doc.

Hope this helps!

The way forward always starts with a minimal test.
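A rough sketch of the job-queue idea, assuming MCE 1.818 or later (for `MCE::Queue`'s `end` method). This is not the demo from the MCE::Flow doc. The table names are hypothetical, and the consumer body is only a placeholder for launching the external executable. Each table is enqueued exactly once and each queue item is dequeued by exactly one worker, so no two workers handle the same table at the same time, without any CPU pinning.

```perl
use strict;
use warnings;
use MCE::Flow;
use MCE::Queue;

# Hypothetical table names; each one is queued exactly once.
my @tables = qw( orders customers invoices shipments );

my $queue = MCE::Queue->new;

my %handled_by = mce_flow {
    task_name   => [ 'producer', 'consumer' ],
    max_workers => [ 1, 3 ],
    task_end    => sub {
        my ( $mce, $task_id, $task_name ) = @_;
        # Once the producer is done, close the queue so that consumers
        # blocked in dequeue() are released and can leave their loops.
        $queue->end if $task_name eq 'producer';
    },
},
sub {
    # Task 1 (producer): decide which tables need work and enqueue them.
    $queue->enqueue($_) for @tables;
},
sub {
    # Task 2 (consumers): each dequeued table belongs to this worker alone.
    while ( defined( my $table = $queue->dequeue ) ) {
        # Placeholder: real code would run the external program for $table.
        MCE->gather( $table, MCE->wid );
    }
};

MCE::Flow->finish;

print "$_ handled by worker $handled_by{$_}\n" for sort keys %handled_by;
```

Which worker ends up with which table is not fixed, and that is the point: exclusivity comes from the queue, not from tying a worker to a core.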
Re^6: Create parallel database handles... (MCE::Loop) by perlygapes (Sexton) on Aug 03, 2020 at 23:49 UTC
Re^7: Create parallel database handles... (MCE::Loop) by marioroy (Prior) on Aug 07, 2020 at 17:07 UTC

