

in reply to Re^5: Create parallel database handles... (MCE::Loop)
in thread Create parallel database handles or SQL statements for multi-threaded/process access to Postgres DB using DBI, DBD::Pg and Parallel::ForkManager

I don't know why I missed your answer, but thanks again very much. Sorry it took me so long to respond.
Yes, there is a very specific reason why I want to create CPU affinity: I am using this script to launch multiple instances of an old single-threaded application. Each instance will work on the same overall dataset, but the dataset is a collection of files, and I do not want any file clobbering. I am specifically and purposefully trying to eliminate any chance of one of these processes touching (even just reading) the same file another process is currently working on. A cut-down sketch of what I mean follows.
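For concreteness, here is a minimal sketch of the partitioning I have in mind (the dataset path and file layout are made up for illustration). Since MCE hands each element of the input list to exactly one worker, listing every file exactly once means no two workers can ever hold the same file:

#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use MCE;

# hypothetical layout: one input file per unit of work
my @files = glob 'dataset/*.dat';

MCE->new( max_workers => 4 )->foreach( \@files, sub {
    my $file = $_;
    say 'worker ', MCE->wid, " owns $file";
    # launch the old single-threaded application against $file here
});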

Sorry if I seem to be asking basic questions - I am a plumber by trade, and teaching oneself to code is very difficult.

I tried amending your example slightly to this - and it does not give the result I expect with regard to the $process values:

#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use Data::Dumper;
use HTTP::Tiny;
use Time::HiRes 'gettimeofday', 'tv_interval';
use MCE;
use MCE::Shared;

my $ua = HTTP::Tiny->new( timeout => 10 );

my @urls = qw<
    gap.com amazon.com ebay.com lego.com wunderground.com
    imdb.com underarmour.com disney.com espn.com dailymail.com
>;

my $report  = MCE::Shared->hash;
my $process = MCE::Shared->scalar;
$process = 0;

MCE->new( max_workers => 6 )->foreach( \@urls, sub {
    my $start = [gettimeofday];
    $process++;
    say $process."->GETting https://".$_;
    $ua->get('https://' . $_);
    $report->set( $_, tv_interval($start, [gettimeofday]) );
});

say Dumper $report->export;

The output reads:
1->GETting https://gap.com
1->GETting https://amazon.com
1->GETting https://ebay.com
1->GETting https://lego.com
1->GETting https://wunderground.com
1->GETting https://imdb.com
2->GETting https://underarmour.com
2->GETting https://disney.com
2->GETting https://espn.com
3->GETting https://dailymail.com
$VAR1 = bless( {
          'disney.com' => '1.15682',
          'amazon.com' => '4.607657',
          'wunderground.com' => '0.46855',
          'dailymail.com' => '2.355818',
          'espn.com' => '1.170818',
          'gap.com' => '3.819699',
          'ebay.com' => '1.479624',
          'underarmour.com' => '2.919818',
          'imdb.com' => '2.540127',
          'lego.com' => '0.919592'
        }, 'MCE::Shared::Hash' );
whereas I had expected:
1->GETting https://gap.com
2->GETting https://amazon.com
3->GETting https://ebay.com
4->GETting https://lego.com
5->GETting https://wunderground.com
6->GETting https://imdb.com
7->GETting https://underarmour.com
8->GETting https://disney.com
9->GETting https://espn.com
10->GETting https://dailymail.com
$VAR1 = bless( {
          'disney.com' => '1.15682',
          'amazon.com' => '4.607657',
          'wunderground.com' => '0.46855',
          'dailymail.com' => '2.355818',
          'espn.com' => '1.170818',
          'gap.com' => '3.819699',
          'ebay.com' => '1.479624',
          'underarmour.com' => '2.919818',
          'imdb.com' => '2.540127',
          'lego.com' => '0.919592'
        }, 'MCE::Shared::Hash' );

Can you explain this?

Thanks.

Re^7: Create parallel database handles... (MCE::Loop)
by marioroy (Prior) on Aug 07, 2020 at 17:07 UTC

    Greetings, perlygapes

    The shared variable is constructed using the OO interface, not the TIE interface. Therefore, assigning 0 to it replaces the shared object with a plain, non-shared scalar, and after MCE spawns, each worker increments its own private copy - which is why the counts repeat per worker. Increment through the object via $process->incr instead. Another way is MCE->chunk_id (2nd example). MCE::Mutex is helpful when one worker at a time must access a resource, blocking the others (3rd example).
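    As a quick illustration of the difference between the two interfaces, here is a minimal sketch (the tie form follows the MCE::Shared documentation):

    use strict;
    use warnings;
    use MCE::Shared;

    # OO interface: the variable holds a shared object. Method calls are
    # shared, but a plain assignment replaces the object itself.
    my $count = MCE::Shared->scalar(0);
    $count->incr;       # shared increment
    # $count = 0;       # would clobber the object with an ordinary 0

    # TIE interface: the variable stays tied, so ordinary operators
    # like ++ are routed to the shared store.
    tie my $tied, 'MCE::Shared', 0;
    $tied++;            # shared increment, works across workers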

    $process->incr()

    #!/usr/bin/perl
    use strict;
    use warnings;
    use 5.010;
    use Data::Dumper;
    use HTTP::Tiny;
    use Time::HiRes 'gettimeofday', 'tv_interval';
    use MCE;
    use MCE::Shared;

    my $ua = HTTP::Tiny->new( timeout => 10 );

    my @urls = qw<
        gap.com amazon.com ebay.com lego.com wunderground.com
        imdb.com underarmour.com disney.com espn.com dailymail.com
    >;

    my $report  = MCE::Shared->hash;
    my $process = MCE::Shared->scalar(0);

    MCE->new( max_workers => 6 )->foreach( \@urls, sub {
        my $start = [gettimeofday];
        say $process->incr()."->GETting https://".$_;
        $ua->get('https://' . $_);
        $report->set( $_, tv_interval($start, [gettimeofday]) );
    });

    say Dumper $report->export;

    MCE->chunk_id()

    #!/usr/bin/perl
    use strict;
    use warnings;
    use 5.010;
    use Data::Dumper;
    use HTTP::Tiny;
    use Time::HiRes 'gettimeofday', 'tv_interval';
    use MCE;
    use MCE::Shared;

    my $ua = HTTP::Tiny->new( timeout => 10 );

    my @urls = qw<
        gap.com amazon.com ebay.com lego.com wunderground.com
        imdb.com underarmour.com disney.com espn.com dailymail.com
    >;

    my $report = MCE::Shared->hash;

    MCE->new( max_workers => 6 )->foreach( \@urls, sub {
        my $start = [gettimeofday];
        say MCE->chunk_id()."->GETting https://".$_;
        $ua->get('https://' . $_);
        $report->set( $_, tv_interval($start, [gettimeofday]) );
    });

    say Dumper $report->export;
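    Note: foreach processes one element at a time (chunk_size => 1), so MCE->chunk_id yields the 1 through 10 sequence you expected, although the lines may not print in that order since workers run concurrently.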

    Accessing 3rd DB -- one worker

    use strict;
    use warnings;
    use 5.010;
    use Data::Dumper;
    use HTTP::Tiny;
    use Time::HiRes 'gettimeofday', 'tv_interval';
    use MCE;
    use MCE::Shared;
    use MCE::Mutex;

    my $ua = HTTP::Tiny->new( timeout => 10 );

    my @urls = qw<
        gap.com amazon.com ebay.com lego.com wunderground.com
        imdb.com underarmour.com disney.com espn.com dailymail.com
    >;

    my $report = MCE::Shared->hash;
    my $mutex  = MCE::Mutex->new();

    MCE->new( max_workers => 6 )->foreach( \@urls, sub {
        my $start = [gettimeofday];
        say MCE->chunk_id()."->GETting https://".$_;
        $ua->get('https://' . $_);
        $report->set( $_, tv_interval($start, [gettimeofday]) );

        # access 3rd DB, one worker at a time
        $mutex->lock;
        # update ...
        $mutex->unlock;

        # ditto, via the callback form
        $mutex->enter( sub {
            # update ...
        });
    });

    say Dumper $report->export;
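    The enter form runs the given code between the lock and unlock calls for you, so the pair cannot get out of balance.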

    Regards, Mario