http://qs321.pair.com?node_id=11116585


in reply to Re^4: Create parallel database handles... (MCE::Loop)
in thread Create parallel database handles or SQL statements for multi-threaded/process access to Postgres DB using DBI, DBD::Pg and Parallel::ForkManager

Hi again,

Now that's a classic XY problem statement! One usually gets better help by asking about how to achieve the goal, not how to implement the technique one has already decided is the way to achieve it ;-)

I can think of no reason why one should ever have to concern oneself with which CPU core was used by a given worker. You should be able to write a program where you don't even have to concern yourself with workers.

From your problem description, it sounds like you might need some kind of job queue. You can achieve this in many ways, but if you are already using MCE for parallelization, you can use MCE::Flow and MCE::Queue to enqueue jobs based on the output of a first task that is itself handled by multiple workers. Look at the demo shown in the MCE::Flow doc.
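As a rough illustration of that idea (not the demo from the MCE::Flow doc), here is a minimal sketch of a producer task feeding consumer workers through an MCE::Queue. The job names, the number of consumers, and the variable names are all made up for the example; the undef-per-consumer trick is just one way to tell the consumers to stop.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use MCE::Flow;
use MCE::Queue;

# The queue must exist before the workers are spawned.
my $queue       = MCE::Queue->new;
my $n_consumers = 3;                         # arbitrary choice for the sketch
my @jobs        = map { "job$_" } 1 .. 8;    # hypothetical work items
my @done;                                    # filled via MCE->gather

mce_flow {
    task_name   => [ 'producer', 'consumer' ],
    max_workers => [ 1, $n_consumers ],
    gather      => \@done,
    task_end    => sub {
        my ($mce, $task_id, $task_name) = @_;
        # Once the producer task has finished, send one undef per consumer
        # so that each consumer's dequeue loop terminates.
        $queue->enqueue( (undef) x $n_consumers ) if $task_name eq 'producer';
    },
},
sub {   # producer: decides what work exists and enqueues it
    $queue->enqueue($_) for @jobs;
},
sub {   # consumer: each dequeued job goes to exactly one worker
    while ( defined( my $job = $queue->dequeue ) ) {
        MCE->gather( "$job handled by worker " . MCE->wid );
    }
};

MCE::Flow::finish();

say for sort @done;
```

Because every item is dequeued exactly once, no two consumers ever receive the same job, and you never have to reason about which worker (or core) picked it up.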

Hope this helps!


The way forward always starts with a minimal test.

Re^6: Create parallel database handles... (MCE::Loop)
by perlygapes (Sexton) on Aug 03, 2020 at 23:49 UTC
    I don't know why I missed your answer, but thanks again very much. Sorry it took me so long to respond.
    Yes, there is a very specific reason why I want to create CPU affinity: I am using this script to launch multiple instances of an old single-threaded application. Each instance is going to be working on the same overall dataset, but the dataset is a collection of files, and I do not want any file clobbering. I am specifically and purposefully trying to eliminate any chance of one of these processes interfering with (or even just reading) a file that another process is currently working on.

    Sorry if I seem to you to be asking basic questions - I am a plumber by trade...teaching oneself to code is very difficult.

    I tried amending your example slightly to this - and it does not give the result I expect with regard to the $process values:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use 5.010;
    use Data::Dumper;
    use HTTP::Tiny;
    use Time::HiRes 'gettimeofday', 'tv_interval';
    use MCE;
    use MCE::Shared;

    my $ua = HTTP::Tiny->new( timeout => 10 );

    my @urls = qw<
        gap.com amazon.com ebay.com lego.com wunderground.com
        imdb.com underarmour.com disney.com espn.com dailymail.com
    >;

    my $report  = MCE::Shared->hash;
    my $process = MCE::Shared->scalar;
    $process = 0;

    MCE->new( max_workers => 6 )->foreach( \@urls, sub {
        my $start = [gettimeofday];
        $process++;
        say $process."->GETting https://".$_;
        $ua->get('https://' . $_);
        $report->set( $_, tv_interval($start, [gettimeofday]) );
    });

    say Dumper $report->export;

    The output reads:
    1->GETting https://gap.com
    1->GETting https://amazon.com
    1->GETting https://ebay.com
    1->GETting https://lego.com
    1->GETting https://wunderground.com
    1->GETting https://imdb.com
    2->GETting https://underarmour.com
    2->GETting https://disney.com
    2->GETting https://espn.com
    3->GETting https://dailymail.com
    $VAR1 = bless( {
        'disney.com' => '1.15682',
        'amazon.com' => '4.607657',
        'wunderground.com' => '0.46855',
        'dailymail.com' => '2.355818',
        'espn.com' => '1.170818',
        'gap.com' => '3.819699',
        'ebay.com' => '1.479624',
        'underarmour.com' => '2.919818',
        'imdb.com' => '2.540127',
        'lego.com' => '0.919592'
    }, 'MCE::Shared::Hash' );
    whereas I had expected:
    1->GETting https://gap.com
    2->GETting https://amazon.com
    3->GETting https://ebay.com
    4->GETting https://lego.com
    5->GETting https://wunderground.com
    6->GETting https://imdb.com
    7->GETting https://underarmour.com
    8->GETting https://disney.com
    9->GETting https://espn.com
    10->GETting https://dailymail.com
    $VAR1 = bless( {
        'disney.com' => '1.15682',
        'amazon.com' => '4.607657',
        'wunderground.com' => '0.46855',
        'dailymail.com' => '2.355818',
        'espn.com' => '1.170818',
        'gap.com' => '3.819699',
        'ebay.com' => '1.479624',
        'underarmour.com' => '2.919818',
        'imdb.com' => '2.540127',
        'lego.com' => '0.919592'
    }, 'MCE::Shared::Hash' );

    Can you explain this?

    Thanks.

      Greetings, perlygapes

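      The shared variable is constructed using the OO interface, not the TIE interface, so $process holds an object reference. Assigning 0 to it therefore replaces the shared object with a plain number, and each worker then increments its own private copy. To increment the shared value, call $process->incr. Another way is MCE->chunk_id (2nd example). MCE::Mutex is helpful when only one worker at a time should access a resource, blocking the others (3rd example).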
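To see the overwrite in isolation, here is a small sketch without any workers. It only inspects what the variable holds before and after a plain assignment; the variable names ($shared, $lost and so on) are illustrative and not taken from the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use MCE::Shared;

# Constructed via the OO interface: $shared holds an object reference.
my $shared = MCE::Shared->scalar(0);
my $before = ref $shared;      # a class name, i.e. a blessed reference

$shared->incr;                 # method calls reach the shared value
$shared->incr;
my $count = $shared->get;      # value after two increments

# Same pattern as in the question: construct, then assign a number.
my $lost = MCE::Shared->scalar;
$lost = 0;                     # the object reference is replaced by a plain 0
my $after = ref $lost;         # empty string: $lost is no longer an object

say "before: $before";
say "count:  $count";
say "after:  '$after'";
```

Once the variable is a plain number, `$process++` in each forked worker only changes that worker's copy, which would explain the repeated 1, 2 and 3 in the output.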

      $process->incr()

      #!/usr/bin/perl
      use strict;
      use warnings;
      use 5.010;
      use Data::Dumper;
      use HTTP::Tiny;
      use Time::HiRes 'gettimeofday', 'tv_interval';
      use MCE;
      use MCE::Shared;

      my $ua = HTTP::Tiny->new( timeout => 10 );

      my @urls = qw<
          gap.com amazon.com ebay.com lego.com wunderground.com
          imdb.com underarmour.com disney.com espn.com dailymail.com
      >;

      my $report  = MCE::Shared->hash;
      my $process = MCE::Shared->scalar(0);

      MCE->new( max_workers => 6 )->foreach( \@urls, sub {
          my $start = [gettimeofday];
          say $process->incr()."->GETting https://".$_;
          $ua->get('https://' . $_);
          $report->set( $_, tv_interval($start, [gettimeofday]) );
      });

      say Dumper $report->export;

      MCE->chunk_id()

      #!/usr/bin/perl
      use strict;
      use warnings;
      use 5.010;
      use Data::Dumper;
      use HTTP::Tiny;
      use Time::HiRes 'gettimeofday', 'tv_interval';
      use MCE;
      use MCE::Shared;

      my $ua = HTTP::Tiny->new( timeout => 10 );

      my @urls = qw<
          gap.com amazon.com ebay.com lego.com wunderground.com
          imdb.com underarmour.com disney.com espn.com dailymail.com
      >;

      my $report = MCE::Shared->hash;

      MCE->new( max_workers => 6 )->foreach( \@urls, sub {
          my $start = [gettimeofday];
          say MCE->chunk_id()."->GETting https://".$_;
          $ua->get('https://' . $_);
          $report->set( $_, tv_interval($start, [gettimeofday]) );
      });

      say Dumper $report->export;

      Accessing 3rd DB -- one worker

      use strict;
      use warnings;
      use 5.010;
      use Data::Dumper;
      use HTTP::Tiny;
      use Time::HiRes 'gettimeofday', 'tv_interval';
      use MCE;
      use MCE::Shared;
      use MCE::Mutex;

      my $ua = HTTP::Tiny->new( timeout => 10 );

      my @urls = qw<
          gap.com amazon.com ebay.com lego.com wunderground.com
          imdb.com underarmour.com disney.com espn.com dailymail.com
      >;

      my $report = MCE::Shared->hash;
      my $mutex  = MCE::Mutex->new();

      MCE->new( max_workers => 6 )->foreach( \@urls, sub {
          my $start = [gettimeofday];
          say MCE->chunk_id()."->GETting https://".$_;
          $ua->get('https://' . $_);
          $report->set( $_, tv_interval($start, [gettimeofday]) );

          # access 3rd DB, one worker
          $mutex->lock;
          # update ...
          $mutex->unlock;

          # ditto
          $mutex->enter( sub {
              # update ...
          });
      });

      say Dumper $report->export;

      Regards, Mario