parallelising processes

by RobertCraven (Sexton)
on Feb 05, 2011 at 13:00 UTC ( [id://886390] )

RobertCraven has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I seek enlightenment about how to run a subroutine several times in parallel. The idea is to imitate Parallel::ForkManager, which I can't use in that setup. I have a set of matrices to analyse and would like to run one analysis on each processor.

The machine I am working on at the moment has 8 processors and I have 40 matrices. So I'd like to start analysing 8 of them, and once one is finished the next run should start.

I started looking into forking, but failed at the IPC.

The only thing that happens is that a subroutine takes an ID as its argument, changes into a directory of that name and executes a binary. Once that program is finished, the next run should be initiated, but never more than 8 at a time.
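
For concreteness, the per-matrix work is roughly the following sketch (the subroutine and binary names here are only placeholders):

    sub analyse_matrix {
        my ($id) = @_;
        chdir $id or die "cannot chdir to $id: $!";
        # run the analysis binary in that directory (placeholder name)
        system './analyse' and die "analysis failed in $id";
    }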

Many thanks for any help / suggestions

Replies are listed 'Best First'.
Re: parallelising processes
by BrowserUk (Patriarch) on Feb 05, 2011 at 13:44 UTC

    Something like this (untested) would do the job:

    #! perl -slw
    use strict;
    use threads;
    use threads::shared;

    my $running :shared = 0;

    sub thread {
        { lock $running; ++$running }
        my $subdir = shift;
        chdir $subdir;
        system q[some command ];
        { lock $running; --$running }
    }

    my @subdirs = ...;

    for my $subdir ( @subdirs ) {
        async( \&thread, $subdir );
        sleep 1 while $running >= 8;
        $_->join for threads->list(threads::joinable);
    }

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Very sorry for the late thanks. Another project got priority, but I got back to the old one and your idea works great; it is exactly what I was looking for.

      Thanks!

Re: parallelising processes
by Anonyrnous Monk (Hermit) on Feb 05, 2011 at 13:18 UTC
Re: parallelising processes
by chrestomanci (Priest) on Feb 05, 2011 at 19:53 UTC

    There was a lengthy thread on this back in November.

    In my reply I posted a code example of how to use Parallel::ForkManager. Here it is again:

    #!/usr/bin/perl
    use Parallel::ForkManager;

    my $max_threads = 8;
    my $fork_mgr = new Parallel::ForkManager($max_threads);

    $fork_mgr->run_on_finish(
        sub {
            my ($child_pid, $exit_code, $child_id, $exit_signal, $core_dump) = @_;
            # Code to store the results from each child goes here.
        }
    );

    ITEM: foreach my $item (@big_list) {
        # Fork off child processes. The 'and next' clause will only run
        # in the parent, where start returns the child pid (non zero).
        $fork_mgr->start($item) and next ITEM;

        # Code in this block will run in parallel.
        my $result = do_stuff($item);

        # Store the final result. The value you pass to finish will be
        # received by the sub you defined in run_on_finish.
        $fork_mgr->finish($result);
    }

    $fork_mgr->wait_all_children();
    # Now all children have finished; your results should be available.

    A few extra tips:

    If you are running this code in the Perl debugger, you might want to debug the child processes. If so, run your program in a unix/linux xterm window. The debugger will create a new xterm window for each child, so you can step through the parent and the children separately.

    Conversely, if you don't want to step through the children and your screen is filling up with windows from child processes, you can use the debugger option o inhibit_exit=0 to suppress the display of windows for children that finish without hitting a breakpoint.

      Thanks for your answer, but unfortunately I am not allowed to install Parallel::ForkManager in that environment. Hence the question of how to imitate it.
Re: parallelising processes
by zwon (Abbot) on Feb 05, 2011 at 17:14 UTC

    What's the problem with fork? It should be quite simple:

    use strict;
    use warnings;
    use autodie;

    my $command = './analyse';   # placeholder: the analysis binary to run

    my $processes = 0;
    my @ids       = 1 .. 40;

    for my $id (@ids) {
        if ( $processes >= 8 ) {
            wait;
            $processes--;
        }
        if (fork) {
            $processes++;
        }
        else {
            chdir $id;
            exec $command, $id;
        }
    }

    wait while $processes--;
      Yep, also a good solution, Thank You!
Re: parallelising processes
by tomfahle (Priest) on Feb 06, 2011 at 17:17 UTC

    Have a look at Parallel::Iterator, too.

    You can set the number of parallel workers with the option workers:

    use strict;
    use warnings;

    use Parallel::Iterator qw( iterate_as_array );

    # Number of parallel tasks
    my %options = ();
    $options{workers} = 8;

    # Execute tasks in parallel
    my @results = iterate_as_array( \%options, $worker, \@tasks );
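
    For completeness, a fuller (untested) sketch along those lines; the task list and the worker sub here are placeholders:

    use strict;
    use warnings;

    use Parallel::Iterator qw( iterate_as_array );

    my @tasks = 1 .. 40;   # placeholder: the matrix IDs

    # The worker is called with ( $index, $value ) and returns a result for that value.
    my $worker = sub {
        my ( $index, $id ) = @_;
        # placeholder: change into the directory and run the analysis binary
        my $status = system "cd $id && ./analyse";
        return $status == 0 ? 'ok' : 'failed';
    };

    my @results = iterate_as_array( { workers => 8 }, $worker, \@tasks );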

    Hth, Thomas

      Also have a look at Parallel::Forker. I use it all the time; it works great, exactly as you expect, and it has advantages over Parallel::ForkManager such as child process signalling.
      use Parallel::Forker;

      my $forker = Parallel::Forker->new(use_sig_child => 1, max_proc => 8);

      $SIG{CHLD} = sub { Parallel::Forker::sig_child($forker); };
      $SIG{TERM} = sub { $forker->kill_tree_all('TERM') if $forker && $forker->in_parent; };

      for (1..10) {
          $forker->schedule(run_on_start => sub {
              # do child process code here
          })->ready();
      }

      # wait for all child processes to finish
      $forker->wait_all();
      hth
      Thanks for your answer; unfortunately, Parallel::Iterator is also not available in that environment.
Re: parallelising processes
by sundialsvc4 (Abbot) on Feb 07, 2011 at 18:08 UTC

    /me nods...

    This sounds like just the ticket for one of the fork-managers now being discussed.   A “pool” of, say, 8 different threads would thus be set up, and each one of them would do the same thing:

    1. Retrieve the next unit of work from a queue (which, of course, can be arbitrarily large).
    2. Process the unit of work and place the completion notification on another queue (or otherwise let the world know that this unit-of-work is done).
    3. Rinse and repeat.   (Eventually, the work peters out and all of the threads go dormant ... or, if you prefer, they graciously depart from the land of the living.)

    The actual number of threads to be launched would, of course, be an adjustable parameter.   If you know that you have 8 processors or cores that probably don’t have anything better to do with their time, “8” would be a good starting point.   You could then do some careful experimenting and measuring to see what the “sweet spot” for your particular setup turns out to be.

    This is such a common requirement that you don’t need to look forward to “writing” anything ... you will just “choose one.”
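
    A minimal sketch of such a pool, assuming the core threads and Thread::Queue modules are available (the directory IDs and the analysis command are placeholders):

    use strict;
    use warnings;
    use threads;
    use Thread::Queue;

    my $pool_size = 8;                               # tune to your machine
    my $queue     = Thread::Queue->new( 1 .. 40 );   # the units of work (matrix IDs)

    my @pool = map {
        threads->create( sub {
            # Each worker pulls IDs until the queue is empty, then exits.
            while ( defined( my $id = $queue->dequeue_nb() ) ) {
                system "cd $id && ./analyse";        # placeholder command
            }
        } );
    } 1 .. $pool_size;

    # wait for every worker to drain the queue and finish
    $_->join for @pool;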

Re: parallelising processes
by salva (Canon) on Apr 07, 2011 at 08:46 UTC
    use Proc::Queue size => 8;

    for my $dir (@dirs) {
        run_back {
            chdir $dir and exec "foo_matrix";
        };
    }

    1 while wait > 0;
    Proc::Queue is a single-file module that can be installed by just dropping it somewhere under your home directory (for instance, as /home/robert/lib/perl/Proc/Queue.pm, and then in your script: use lib '/home/robert/lib/perl';).
