Re: parallelising processes
by BrowserUk (Patriarch) on Feb 05, 2011 at 13:44 UTC
#! perl -slw
use strict;
use threads;
use threads::shared;

my $running :shared = 0;

sub thread {
    { lock $running; ++$running }
    my $subdir = shift;
    chdir $subdir;
    system q[some command ];
    { lock $running; --$running }
}

my @subdirs = ...;

for my $subdir ( @subdirs ) {
    async( \&thread, $subdir );
    sleep 1 while $running >= 8;
    $_->join for threads->list( threads::joinable );
}

## Don't forget the workers still running when the loop ends
sleep 1 while $running;
$_->join for threads->list( threads::joinable );
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Very sorry for the late thanks. Another project got priority, but I got back to the old one, and your idea works great; it is exactly what I was looking for.
Thanks!
Re: parallelising processes
by Anonymous Monk (Hermit) on Feb 05, 2011 at 13:18 UTC
Re: parallelising processes
by chrestomanci (Priest) on Feb 05, 2011 at 19:53 UTC
#!/usr/bin/perl
use strict;
use warnings;
use Parallel::ForkManager;

my $max_procs = 8;
my $fork_mgr  = Parallel::ForkManager->new($max_procs);

$fork_mgr->run_on_finish(
    sub {
        my ($child_pid, $exit_code, $child_id, $exit_signal, $core_dump) = @_;
        # Code to store the results from each child goes here.
    }
);

ITEM: foreach my $item (@big_list)
{
    # Fork off child processes. The 'and next' clause only runs
    # in the parent, where start() returns the child pid (non-zero).
    $fork_mgr->start($item) and next ITEM;

    # Code in this block runs in the child, in parallel
    my $result = do_stuff($item);

    # Report the final result. The value you pass to finish() is the
    # exit code received by the sub you defined in run_on_finish
    $fork_mgr->finish($result);
}
$fork_mgr->wait_all_children();
# Now all child processes have finished,
# your results should be available.
A few extra tips:
If you are running this code in the Perl debugger, you might want to debug the child processes. If so, run your program in a unix/linux xterm window. The debugger will create a new xterm window for each child process, so you can separately step through the parent and the children.
Conversely, if you don't want to step through the children, and your screen is filling up with windows from child processes, you can use the debugger option o inhibit_exit=0 to suppress the windows for child processes that finish without hitting a breakpoint.
Thanks for your answer, but unfortunately I am not allowed to install Parallel::ForkManager in that environment. Hence the question of how to imitate it.
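For what it's worth, the throttling part of Parallel::ForkManager can be imitated with nothing beyond core fork and waitpid. A minimal sketch, where the work list and the per-item processing are placeholders to be replaced with your own:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $max_procs = 8;
my %kids;            # pid => item: minimal bookkeeping, like ForkManager's
my $reaped = 0;

for my $item (1 .. 40) {                 # placeholder work list
    # Throttle: once the limit is reached, block until a slot frees up
    if (keys %kids >= $max_procs) {
        my $pid = waitpid(-1, 0);
        delete $kids{$pid};
        $reaped++;
    }
    defined(my $pid = fork) or die "fork failed: $!";
    if ($pid == 0) {
        # Child: chdir/exec or any per-item processing would go here
        exit 0;
    }
    $kids{$pid} = $item;                 # parent: remember the child
}

# Reap the stragglers
while ((my $pid = waitpid(-1, 0)) > 0) {
    delete $kids{$pid};
    $reaped++;
}
print "$reaped children reaped\n";
```

This gives you the "at most 8 at a time" behaviour; what it lacks, compared to the module, is the run_on_finish callback plumbing for collecting results.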
Re: parallelising processes
by zwon (Abbot) on Feb 05, 2011 at 17:14 UTC
use strict;
use warnings;
use autodie;

my $command   = 'some_command';  # placeholder: the program to run in each dir
my $processes = 0;
my @ids       = 1 .. 40;

for my $id (@ids) {
    if ( $processes >= 8 ) {
        wait;                    # block until one child exits
        $processes--;
    }
    if (fork) {
        $processes++;            # parent: one more child in flight
    }
    else {
        chdir $id;               # child: move into its directory...
        exec $command, $id;      # ...and replace itself with the command
    }
}
wait while $processes--;         # reap the remaining children
Yep, also a good solution. Thank you!
Re: parallelising processes
by tomfahle (Priest) on Feb 06, 2011 at 17:17 UTC
Have a look at Parallel::Iterator, too.
You can set the number of parallel workers with the option workers:
use strict;
use warnings;
use Parallel::Iterator qw(iterate_as_array);

# Number of parallel tasks
my %options = ();
$options{workers} = 8;

# Execute tasks in parallel ($worker is a code ref, @tasks the inputs)
my @results = iterate_as_array( \%options,
                                $worker, \@tasks );
Hth, Thomas
Also have a look at Parallel::Forker. I use it all the time; it works great, exactly as you expect, and has advantages over Parallel::ForkManager, such as child process signalling.
use strict;
use warnings;
use Parallel::Forker;

my $forker = Parallel::Forker->new(use_sig_child => 1, max_proc => 8);
$SIG{CHLD} = sub { Parallel::Forker::sig_child($forker); };
$SIG{TERM} = sub { $forker->kill_tree_all('TERM') if $forker && $forker->in_parent; };

for (1..10) {
    $forker->schedule(run_on_start => sub {
        # do child process code here
    })->ready();
}

# wait for all child processes to finish
$forker->wait_all();
hth
Thanks for your answer; unfortunately Parallel::Iterator is also not available in that environment.
Re: parallelising processes
by sundialsvc4 (Abbot) on Feb 07, 2011 at 18:08 UTC
/me nods...
This sounds like just the ticket for one of the fork-managers now being discussed. A “pool” of, say, 8 different threads would thus be set up, and each one of them would do the same thing:
- Retrieve the next unit of work from a queue (which, of course, can be arbitrarily large).
- Process the unit of work and place the completion notification on another queue (or otherwise let the world know that this unit-of-work is done).
- Rinse and repeat. (Eventually, the work peters out and all of the threads go dormant ... or, if you prefer, they graciously depart from the land of the living.)
The actual number of threads to be launched would, of course, be an adjustable parameter. If you know that you have 8 processors or cores that probably don’t have anything better to do with their time, “8” would be a good starting point. You could then do some careful experimenting and measuring to see what the “sweet spot” for your particular setup turns out to be.
This is such a common requirement that you don’t need to look forward to “writing” anything ... you will just “choose one.”
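The pool described above can be sketched with the core modules threads and Thread::Queue; the doubling worker below is just a stand-in for the real per-item processing:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $WORKERS = 8;                       # the adjustable pool size
my $work    = Thread::Queue->new;      # units of work
my $done    = Thread::Queue->new;      # completion notifications

# Spawn the pool: each worker pulls items until the queue is ended.
my @pool = map {
    threads->create(sub {
        while (defined(my $item = $work->dequeue)) {
            my $result = $item * 2;    # stand-in for the real processing
            $done->enqueue($result);
        }
    });
} 1 .. $WORKERS;

$work->enqueue($_) for 1 .. 40;        # the (arbitrarily large) work queue
$work->end;                            # no more work: dequeue returns undef
$_->join for @pool;                    # the threads depart the land of the living

my @results;
push @results, $done->dequeue_nb while $done->pending;
print scalar(@results), " items processed\n";
```

Tuning $WORKERS up or down is then a one-line change, which makes the "careful experimenting" straightforward.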
Re: parallelising processes
by salva (Canon) on Apr 07, 2011 at 08:46 UTC
use Proc::Queue size => 8;
for my $dir (@dirs) {
run_back {
chdir $dir and exec "foo_matrix";
};
}
1 while wait > 0;
Proc::Queue is a single-file module that can be installed by just dropping it somewhere under your home directory (for instance, as /home/robert/lib/perl/Proc/Queue.pm, and then in your script: use lib '/home/robert/lib/perl';).