PerlMonks  

Using flag files to monitor cluster jobs

by bwelch (Curate)
on Oct 26, 2005 at 20:45 UTC ( [id://503164] )

bwelch has asked for the wisdom of the Perl Monks concerning the following question:

Starting with a script that does around a dozen lengthy analysis tasks, I'm trying to update it to use a cluster and run the jobs in parallel. Each analysis script is submitted through a load sharing facility (LSF). Using LSF itself to monitor the analysis jobs is frowned upon, as that tends to put a load on LSF and keep it from doing more important things. So I need to use flag files that tell me when each analysis is done.

This submits a job and waits for completion:

use strict;

my $jobID = `bsub -q long -o $cl_log -J seqPipe "$jobCmd"`;

print "Waiting for analysis work.\n";
while (1) {
    last if -e "$resultsDir/analysis_completed";
    print LOG ".";    # LOG: an already-opened log filehandle
    sleep 15;
}
print "\nFound completion flag. Analysis jobs are done.\n";

Assuming each analysis job creates its own flag file to indicate completion, this will get more complicated when I'm trying to monitor a dozen jobs. Later on, there's the possibility that some jobs will have their own set of analysis jobs that they need to monitor.

I'm thinking this while statement will end up with a set of tests, one for each job. Still, it seems inefficient to keep testing for all those files until the last one has finished. Any ideas on better ways to grow this system?
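For what it's worth, the per-job tests can be collapsed into a single list of outstanding flag files that shrinks as flags appear. A minimal sketch, assuming one analysis_completed flag per job directory (the directory names and the wait_for_flags helper are illustrative, not from the thread):

```perl
use strict;
use warnings;

# Wait until every flag file in @$flags exists, checking every
# $interval seconds. Returns once no flags are outstanding.
sub wait_for_flags {
    my ($flags, $interval) = @_;
    my @pending = @$flags;
    while (@pending) {
        @pending = grep { !-e $_ } @pending;   # keep only missing flags
        sleep $interval if @pending;
    }
}

# Illustrative job directories, one completion flag each
my @dirs  = qw( blast_run hmmer_run phylip_run );
my @flags = map { "$_/analysis_completed" } @dirs;
# wait_for_flags(\@flags, 15);
```

Filtering with grep sidesteps the index-shifting pitfalls of splicing inside a loop, and the list naturally empties when the last job finishes.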

Replies are listed 'Best First'.
Re: Using flag files to monitor cluster jobs
by BrowserUk (Patriarch) on Oct 26, 2005 at 21:15 UTC

    Stick the names you are looking for in an array and splice them out as they are found. When the array is empty, all your jobs are done.

    Update: See benizi's post below for an important correction to this untested logic.

    my @dirs   = qw[ ... ];
    my @rFiles = map { "$_/analysis_completed" } @dirs;

    while( @rFiles and sleep 15 ) {
        -e $rFiles[ $_ ] and splice @rFiles, $_ for 0 .. $#rFiles;
    }

    print "All done";

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      When using splice in a loop like that, don't forget to reverse the indices. (Splicing at index N doesn't affect indices 0..N-1, but it shifts indices N+1..$#array down to N..$#array-1.) And you should pass splice a length (1, in this case) so it removes only the element that was found.

      e.g. Using your code, with @dirs = qw/A B C/;, and running "touch A/analysis_completed" from another shell.

      $ tree
      .
      |-- 503170.pl
      |-- A
      |-- B
      `-- C
      $ perl -l 503170.pl
      All done
      $ tree
      .
      |-- 503170.pl
      |-- A
      |   `-- analysis_completed
      |-- B
      `-- C

      When A/analysis_completed exists, your code splices the entire @rFiles array. The following would do the right thing:

      my @dirs   = qw[ ... ];
      my @rFiles = map { "$_/analysis_completed" } @dirs;

      while (@rFiles and sleep 15) {
          -e $rFiles[$_] and splice @rFiles, $_, 1 for reverse 0 .. $#rFiles;
      }

      print "All done";

        Very good point.

Re: Using flag files to monitor cluster jobs
by chromatic (Archbishop) on Oct 26, 2005 at 22:32 UTC

    If you're working on an operating system that has some kind of file monitoring or file notification system, you can avoid the (admittedly simple) sleep loop by registering interest in file creation or deletion. SGI::FAM is an old module that gives access to SGI's FAM library. (See The Watchful Eye of FAM for more.)
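On Linux, the same idea can be sketched with the CPAN module Linux::Inotify2 (not mentioned in the thread; the directory names are illustrative). It registers a callback for file-creation events, so the process blocks until a flag actually appears instead of polling:

```perl
use strict;
use warnings;
use Linux::Inotify2;   # CPAN module; Linux-only, assumed installed

my @dirs = qw( run1 run2 run3 );        # illustrative job directories
my %pending = map { $_ => 1 } @dirs;

my $inotify = Linux::Inotify2->new
    or die "unable to create inotify object: $!";

for my $dir (@dirs) {
    $inotify->watch($dir, IN_CREATE | IN_MOVED_TO, sub {
        my $event = shift;
        delete $pending{$dir} if $event->name eq 'analysis_completed';
    });
}

# poll() blocks until at least one event arrives, so no sleep loop is needed
$inotify->poll while %pending;
print "All done\n";
```

The kernel delivers the events, so there is no 15-second worst-case latency and no repeated stat() of every flag file.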

Re: Using flag files to monitor cluster jobs
by GrandFather (Saint) on Oct 26, 2005 at 20:58 UTC

    It may sound a little bizarre at first glance, but you could use email to manage completion processing. If the analysis tasks take a long time and the completion processing doesn't have to be particularly prompt, then using email for signaling can be quite a viable option. :)


    Perl is Huffman encoded by design.

Node Type: perlquestion [id://503164]
Approved by GrandFather
Front-paged by kwaping