parallelising processes

by RobertCraven (Sexton)
on Feb 05, 2011 at 13:00 UTC ( [id://886390] )

RobertCraven has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I seek enlightenment about how to run a subroutine several times in parallel. The idea is to imitate Parallel::ForkManager, which I can't use in that setup. I have a set of matrices to analyse and would like to run one analysis on each processor.

The machine I am working on at the moment has 8 processors and I have 40 matrices. So I'd like to start analysing 8 of them, and once one is finished the next run should start.

I started looking into forking, but failed at the IPC.

The only thing that happens is that a subroutine takes an ID as its argument, changes into a directory of that name and executes a binary. Once that program is finished, the next run should be initiated, but never more than 8 at a time.
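
For concreteness, the per-matrix work is roughly the following sketch (the subroutine and binary names here are only placeholders):

    sub analyse_matrix {
        my ($id) = @_;
        chdir $id or die "cannot chdir to $id: $!";
        # run the analysis binary in that directory (placeholder name)
        system './analyse' and die "analysis failed in $id";
    }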

Many thanks for any help / suggestions

Replies are listed 'Best First'.
Re: parallelising processes
by BrowserUk (Patriarch) on Feb 05, 2011 at 13:44 UTC

    Something like this (untested) would do the job:

    #! perl -slw
    use strict;
    use threads;
    use threads::shared;

    my $running :shared = 0;

    sub thread {
        { lock $running; ++$running }
        my $subdir = shift;
        chdir $subdir;
        system q[some command ];
        { lock $running; --$running }
    }

    my @subdirs = ...;

    for my $subdir ( @subdirs ) {
        async( \&thread, $subdir );
        sleep 1 while $running >= 8;
        $_->join for threads->list(threads::joinable);
    }

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Very sorry for the late thanks. Another project got priority, but I got back to the old one and your idea works great; it is exactly what I was looking for.

      Thanks!

Re: parallelising processes
by Anonyrnous Monk (Hermit) on Feb 05, 2011 at 13:18 UTC
Re: parallelising processes
by chrestomanci (Priest) on Feb 05, 2011 at 19:53 UTC

    There was a lengthy thread on this back in November.

    In my reply I posted a code example of how to use Parallel::ForkManager. Here it is again:

    #!/usr/bin/perl
    use Parallel::ForkManager;

    my $max_threads = 8;
    my $fork_mgr = new Parallel::ForkManager($max_threads);

    $fork_mgr->run_on_finish(
        sub {
            my ($child_pid, $exit_code, $child_id, $exit_signal, $core_dump) = @_;
            # Code to store the results from each child goes here.
        }
    );

    ITEM: foreach my $item (@big_list) {
        # Fork off child processes. The 'and next' clause will only run
        # in the parent, where start returns the child pid (non zero).
        $fork_mgr->start($item) and next ITEM;

        # Code in this block will run in parallel.
        my $result = do_stuff($item);

        # Store the final result. The value you pass to finish will be
        # received by the sub you defined in run_on_finish.
        $fork_mgr->finish($result);
    }

    $fork_mgr->wait_all_children();
    # Now all children have finished; your results should be available.

    A few extra tips:

    If you are running this code in the Perl debugger, you might want to debug the child processes. If so, run your program in a unix/linux xterm window. The debugger will create a new xterm window for each child, so you can step through the parent and the children separately.

    Conversely, if you don't want to step through the children and your screen is filling up with windows from child processes, you can use the debugger option o inhibit_exit=0 to suppress the display of windows for children that finish without hitting a breakpoint.

      Thanks for your answer, but unfortunately I am not allowed to install Parallel::ForkManager in that environment. Hence the question of how to imitate it.
Re: parallelising processes
by zwon (Abbot) on Feb 05, 2011 at 17:14 UTC

    What's the problem with fork? It should be quite simple:

    use strict;
    use warnings;
    use autodie;

    my $command = './analyse';   # placeholder: the analysis binary to run

    my $processes = 0;
    my @ids       = 1 .. 40;

    for my $id (@ids) {
        if ( $processes >= 8 ) {
            wait;
            $processes--;
        }
        if (fork) {
            $processes++;
        }
        else {
            chdir $id;
            exec $command, $id;
        }
    }

    wait while $processes--;
      Yep, also a good solution, Thank You!
Re: parallelising processes
by tomfahle (Priest) on Feb 06, 2011 at 17:17 UTC

    Have a look at Parallel::Iterator, too.

    You can set the number of parallel workers with the option workers:

    use strict;
    use warnings;

    use Parallel::Iterator qw( iterate_as_array );

    # Number of parallel tasks
    my %options = ();
    $options{workers} = 8;

    # Execute tasks in parallel
    my @results = iterate_as_array( \%options, $worker, \@tasks );
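
    For completeness, a fuller (untested) sketch along those lines; the task list and the worker sub here are placeholders:

    use strict;
    use warnings;

    use Parallel::Iterator qw( iterate_as_array );

    my @tasks = 1 .. 40;   # placeholder: the matrix IDs

    # The worker is called with ( $index, $value ) and returns a result for that value.
    my $worker = sub {
        my ( $index, $id ) = @_;
        # placeholder: change into the directory and run the analysis binary
        my $status = system "cd $id && ./analyse";
        return $status == 0 ? 'ok' : 'failed';
    };

    my @results = iterate_as_array( { workers => 8 }, $worker, \@tasks );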

    Hth, Thomas

      Also have a look at Parallel::Forker. I use it all the time; it works great, exactly as you expect, and it has advantages over Parallel::ForkManager such as child process signalling.
      use Parallel::Forker;

      my $forker = Parallel::Forker->new(use_sig_child => 1, max_proc => 8);

      $SIG{CHLD} = sub { Parallel::Forker::sig_child($forker); };
      $SIG{TERM} = sub { $forker->kill_tree_all('TERM') if $forker && $forker->in_parent; };

      for (1..10) {
          $forker->schedule(run_on_start => sub {
              # do child process code here
          })->ready();
      }

      # wait for all child processes to finish
      $forker->wait_all();
      hth
      Thanks for your answer; unfortunately, Parallel::Iterator is also not available in that environment.
Re: parallelising processes
by sundialsvc4 (Abbot) on Feb 07, 2011 at 18:08 UTC

    /me nods...

    This sounds like just the ticket for one of the fork-managers now being discussed.   A “pool” of, say, 8 different threads would thus be set up, and each one of them would do the same thing:

    1. Retrieve the next unit of work from a queue (which, of course, can be arbitrarily large).
    2. Process the unit of work and place the completion notification on another queue (or otherwise let the world know that this unit-of-work is done).
    3. Rinse and repeat.   (Eventually, the work peters out and all of the threads go dormant ... or, if you prefer, they graciously depart from the land of the living.)

    The actual number of threads to be launched would, of course, be an adjustable parameter.   If you know that you have 8 processors or cores that probably don’t have anything better to do with their time, “8” would be a good starting point.   You could then do some careful experimenting and measuring to see what the “sweet spot” for your particular setup turns out to be.

    This is such a common requirement that you don’t need to look forward to “writing” anything ... you will just “choose one.”
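
    A minimal sketch of such a pool, assuming the core threads and Thread::Queue modules are available (the directory IDs and the analysis command are placeholders):

    use strict;
    use warnings;
    use threads;
    use Thread::Queue;

    my $pool_size = 8;                               # tune to your machine
    my $queue     = Thread::Queue->new( 1 .. 40 );   # the units of work (matrix IDs)

    my @pool = map {
        threads->create( sub {
            # Each worker pulls IDs until the queue is empty, then exits.
            while ( defined( my $id = $queue->dequeue_nb() ) ) {
                system "cd $id && ./analyse";        # placeholder command
            }
        } );
    } 1 .. $pool_size;

    # wait for every worker to drain the queue and finish
    $_->join for @pool;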

Re: parallelising processes
by salva (Canon) on Apr 07, 2011 at 08:46 UTC
    use Proc::Queue size => 8;

    for my $dir (@dirs) {
        run_back {
            chdir $dir and exec "foo_matrix";
        };
    }

    1 while wait > 0;
    Proc::Queue is a single-file module that can be installed by just dropping it somewhere under your home directory (for instance, as /home/robert/lib/perl/Proc/Queue.pm, and then in your script: use lib '/home/robert/lib/perl';).
