PerlMonks  

Re^7: Interrupt multi-process program while using MCE::Shared hash: END block code does not (all) run

by marioroy (Prior)
on Apr 09, 2017 at 06:45 UTC ( [id://1187509] )


in reply to Re^6: Interrupt multi-process program while using MCE::Shared hash: END block code does not (all) run
in thread Interrupt multi-process program while using MCE::Shared hash: END block code does not (all) run

Hi 1nickt,

To get the default die behavior (the whole program exits when a worker dies), one can specify the on_post_exit option. The status code for __DIE__ is typically 255.

    MCE::Loop->init(
        max_workers => 2,
        chunk_size  => 1,
        user_begin  => sub {
            $SIG{'INT'} = sub {
                my $signal = shift;
                say "Hello from $signal: $$";
                MCE->exit(0);
            };
        },
        on_post_exit => sub {
            my ($mce, $e) = @_;
            if ($e->{status} == 255) {
                MCE::Signal::stop_and_exit('__DIE__');
            }
        }
    );

More info on on_post_exit is found here. The die handler for MCE workers is found inside MCE::Core::Worker ( ~ line 649 ). I cannot change the MCE->exit(...) line to MCE::Signal::stop_and_exit('__DIE__'). That will break scripts where MCE is called from inside an eval block.

    local $SIG{__DIE__} = sub {
        ...
        local $SIG{__DIE__}; local $\ = undef;
        my $_die_msg = (defined $_[0]) ? $_[0] : '';
        print {*STDERR} $_die_msg;

        $self->exit(255, $_die_msg, $self->{_chunk_id});
    };

TODO: When on_post_exit is not specified, have MCE workers abort input on an uncaught exception. Revisit eval: I was unable to get $@ to stick at the manager level. To make this work, I need to call die at the manager level with the error obtained from the worker.

    eval {
        mce_loop { ... } @input
    };

    # TODO: Today, $@ is not set at the manager level.
    # Thus, the eval block succeeds. Will fix this.

    if ( $@ ) {
        ...
    }
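
Until $@ is set at the manager level, one possible workaround is to record the worker's error in on_post_exit and re-raise it in the manager once mce_loop returns. The following is only a sketch under assumptions: it relies on the die handler shown above passing status 255 and the die message to exit, and on that message arriving in $e->{msg}. The @worker_errors array, the sample input and the failing chunk are made up for illustration.

```perl
use strict;
use warnings;
use MCE::Loop;

my @worker_errors;   # hypothetical: filled in by the manager process

MCE::Loop->init(
    max_workers  => 2,
    chunk_size   => 1,
    on_post_exit => sub {
        my ($mce, $e) = @_;
        # Assumption: 255 marks a worker that died, msg holds its error.
        push @worker_errors, $e->{msg} if $e->{status} == 255;
    },
);

my @input = (1 .. 6);   # illustrative input

eval {
    mce_loop {
        my ($mce, $chunk_ref, $chunk_id) = @_;
        die "chunk $chunk_id failed\n" if $chunk_id == 3;
    } @input;

    MCE::Loop->finish;

    # Re-raise in the manager so the enclosing eval sees the failure.
    die $worker_errors[0] if @worker_errors;
};

my $err = $@;
print "manager caught: $err" if $err;
```

Because on_post_exit is specified here, the remaining workers keep processing input after one of them dies; the error only surfaces once the loop has finished.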

Fortunately, one has control with the on_post_exit handler on what to do: e.g. restart_worker, stop_and_exit.
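
As a rough sketch of that control, the handler below chooses between the two actions by exit status. The action_for_status helper and its thresholds are hypothetical choices for illustration, not part of MCE; only on_post_exit, restart_worker and stop_and_exit come from the module.

```perl
use strict;
use warnings;
use MCE::Loop;

# Hypothetical helper: map a worker exit status to an action name.
sub action_for_status {
    my ($status) = @_;
    return 'stop'    if $status == 255;   # uncaught die in a worker
    return 'restart' if $status != 0;     # any other abnormal exit
    return 'ignore';                      # clean exit, e.g. MCE->exit(0)
}

MCE::Loop->init(
    max_workers  => 2,
    chunk_size   => 1,
    on_post_exit => sub {
        my ($mce, $e) = @_;
        my $action = action_for_status($e->{status});

        if ($action eq 'stop') {
            MCE::Signal::stop_and_exit('__DIE__');
        }
        elsif ($action eq 'restart') {
            warn "worker $e->{wid} (pid $e->{pid}) exited with $e->{status}\n";
            $mce->restart_worker;
        }
    },
);
```

Keeping the decision in a small function makes the policy easy to change without touching the MCE wiring.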


Replies are listed 'Best First'.
Re^8: Interrupt multi-process program while using MCE::Shared hash: END block code does not (all) run
by 1nickt (Canon) on Apr 09, 2017 at 19:56 UTC

    Hi ++marioroy,

    That's very cool. Now when one of the workers encounters a fatal exception the whole program ends.

    But, of course, one has no guarantee of which tasks might have been completed (especially in a real-world scenario where task execution time varies and there are more than just two workers). So even with the program exiting via __DIE__, one can easily wind up with not only a partially populated hash/cache (as expected on early exit), but a hash partially populated *out of sequence* compared to the array being processed by mce_loop. E.g.:

    Parent PID 21987
    worker 2 (21990) processing chunk 1
    worker 1 (21989) processing chunk 2
    worker 1 (21989) processing chunk 4
    worker 2 (21990) processing chunk 3
    worker 1 (21989) processing chunk 6
    worker 2 (21990) processing chunk 5
    Illegal division by zero at mce12.pl line 35, <__ANONIO__> line 6.
    Hello from END block: 21990
    ## mce12.pl: caught signal (__DIE__), exiting
    Hello from INT: 21989
    Hello from END block: 21989
    Hello from END block: 21987
    Parent in END: $VAR1 = {
      '00 21990' => '1491767068',
      '01 21989' => '1491767068',
      '02 21990' => '1491767070',
      '03 21989' => '1491767070',
      '05 21989' => '1491767072'
    };

    That was of course completely foreseeable, but I hadn't thought about it when asking for default DIE behaviour. Now, since it makes no difference in terms of the issue I first thought of (patchy-incomplete results), I think I may be more likely to favour the previous behaviour; in other words, continue processing even when one worker dies unexpectedly. Haha, sorry! Well, I think both choices are valuable and most needed, actually.
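
Whichever behaviour is chosen, the gaps can at least be detected afterwards. A minimal sketch in core Perl, assuming the keys have the form "<zero-based index> <worker pid>" as in the dump above; the missing_indices helper is hypothetical and only looks at indices up to the highest one seen, so items that were never started are not reported.

```perl
use strict;
use warnings;

# Keys copied from the dump above: "<zero-based index> <worker pid>".
my %cache = (
    '00 21990' => '1491767068',
    '01 21989' => '1491767068',
    '02 21990' => '1491767070',
    '03 21989' => '1491767070',
    '05 21989' => '1491767072',
);

# Hypothetical helper: list indices between 0 and the highest index
# present that have no entry in the cache.
sub missing_indices {
    my ($cache) = @_;

    my %seen;
    for my $key (keys %$cache) {
        my ($index) = split ' ', $key;
        $seen{ $index + 0 } = 1;    # "04" and 4 count as the same index
    }

    my ($max) = sort { $b <=> $a } keys %seen;
    return [] unless defined $max;
    return [ grep { !$seen{$_} } 0 .. $max ];
}

my $gaps = missing_indices(\%cache);
print "missing indices: @$gaps\n";   # prints "missing indices: 4"
```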

    Thanks again.


    The way forward always starts with a minimal test.
