Task distribution project Q

by feloniousMonk (Pilgrim)
on Sep 20, 2004 at 20:24 UTC

feloniousMonk has asked for the wisdom of the Perl Monks concerning the following question:

Hi folks

I have been given a task which involves maximizing my computing resources with minimal overhead. For this task, I have a single text file which is to be split into many (up to thousands of) equal-sized chunks, and a single command to be run on each chunk. The command is a small Perl script which analyzes the chunk's data and outputs some good stuff into another file (which is an argument to the script).
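For concreteness, the chunking step might look something like the sketch below. The input filename, chunk count, and chunk naming scheme are just placeholders, and it slurps the whole file, which is only fine if the file fits in memory.

#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(ceil);

# Split one big text file into at most $nchunks chunks with roughly
# equal numbers of lines. Filenames here are placeholders.
my ($infile, $nchunks) = ('input.txt', 1000);

open my $in, '<', $infile or die "Can't read $infile: $!";
my @lines = <$in>;    # simple but memory-hungry; stream instead for huge files
close $in;

my $per_chunk = ceil(@lines / $nchunks);

for my $i (0 .. $nchunks - 1) {
    my @slice = splice @lines, 0, $per_chunk;
    last unless @slice;
    my $chunkfile = sprintf 'chunk_%04d', $i;
    open my $out, '>', $chunkfile or die "Can't write $chunkfile: $!";
    print {$out} @slice;
    close $out;
}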

Up until now I have passed the jobs off to Sun Grid Engine as a job array, and life has been good. In this case I cannot do that, but must build my own job manager instead. Here's why:

SGE will not tell me when a job is complete, whether it worked, failed, etc. This in and of itself is not a deal-breaker because I already have handlers built into my SGE-calling code.

SGE overhead - I don't think this should really be a concern, but my boss does not want SGE either way.

The main point here is that the system must be self-contained and able to run on networks which do not have SGE or any similar system. That's the biggie.

The quick overview: I have a big file and a script to run it through. I need to write a handler script which breaks the file up, throws the individual jobs at a bunch of big servers, gets told when each one is done, then cats all the output files into one big result file.
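The final cat-it-all-together step, at least, is simple; something like this would do it (the chunk_*.out glob and results.txt name are placeholders that match the chunking sketch above):

#!/usr/bin/perl
use strict;
use warnings;

# Stitch the per-chunk output files back into one result file, in chunk
# order (sort works because the chunk names are zero-padded).
open my $result, '>', 'results.txt' or die "Can't write results.txt: $!";
for my $outfile (sort glob 'chunk_*.out') {
    open my $in, '<', $outfile or die "Can't read $outfile: $!";
    print {$result} $_ while <$in>;
    close $in;
}
close $result;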

My question is this - without reinventing the wheel, does anyone have advice to point me in the right direction? The closest I've come to a starting point is using RPC calls, but I'm not sure whether that's the best idea. Also, a little more about the system - there will be a main script which will be started on a compute server. Each compute server will be given N jobs at a time, with the size of N depending on the size of the input file chunks. It is possible, but maybe sub-optimal, to start one server program on each compute server for every job it can handle at any given time. Maybe this can even be done without breaking it into a client/server architecture, though I don't yet see how.
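If I skip RPC entirely, my rough mental model is a single master script that forks a bounded number of workers and runs each chunk on a remote host over passwordless ssh. A sketch using Parallel::ForkManager from CPAN - the host names, jobs-per-host figure, and script path are all made up, and it assumes the chunks sit on a filesystem the compute servers can all see:

#!/usr/bin/perl
use strict;
use warnings;
use Parallel::ForkManager;

# Placeholder hosts and per-host job count; assumes passwordless ssh
# and a shared filesystem.
my @hosts         = qw(server1 server2 server3);
my $jobs_per_host = 4;
my @chunks        = grep { !/\.out$/ } sort glob 'chunk_*';   # skip leftover outputs

my $pm = Parallel::ForkManager->new(@hosts * $jobs_per_host);

# Remember which chunks failed so they can be retried or reported.
my @failed;
$pm->run_on_finish(sub {
    my ($pid, $exit_code, $chunk) = @_;
    push @failed, $chunk if $exit_code;
});

my $i = 0;
for my $chunk (@chunks) {
    my $host = $hosts[$i++ % @hosts];    # round-robin over the servers
    $pm->start($chunk) and next;         # parent: queue up the next chunk

    # Child: run the analysis script remotely; its exit status comes back
    # through run_on_finish above.
    my $rc = system 'ssh', $host, "/path/to/analyze.pl $chunk $chunk.out";
    $pm->finish($rc ? 1 : 0);
}
$pm->wait_all_children;

warn "Failed chunks: @failed\n" if @failed;

The cap there limits total forks rather than per-host load, so it's cruder than what SGE was doing for me - a fuller manager would track free slots per host - but it's about the level of overhead I'm hoping to stay at.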

Sorry for the wordy description; I will be happy to clarify anything that I can.
-- Thanks, feloniousMonk

Replies are listed 'Best First'.
Re: Task distribution project Q
by Aristotle (Chancellor) on Sep 20, 2004 at 20:35 UTC

    The simplest thing that can possibly work in your case is to use dsh. split(1) will chop the file up for you. If your needs are sufficiently simple you could do this without Perl on the master machine at all.
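    Roughly, the split(1) half driven from Perl looks like the sketch below; the 1000-line chunk size, input filename, and analysis script path are placeholders, and since dsh option syntax varies between implementations I won't guess at it here:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # split -l keeps whole lines together, writing chunk.aa, chunk.ab, ...
    # The chunk size and input filename are placeholders.
    system('split', '-l', '1000', 'input.txt', 'chunk.') == 0
        or die "split failed: $?";

    # Print one command line per chunk; feed these to dsh (or an ssh loop)
    # and cat the *.out files together once the hosts report back.
    print "/path/to/analyze.pl $_ $_.out\n" for sort glob 'chunk.??';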

    Makeshifts last the longest.

      Thanks Aristotle.

      Actually I have no problem splitting the file per se; sending out the jobs and getting responses back is the tough part. I've never had to deal with this before because I had SGE at my disposal...

        I know. I'm just saying that this can all be done in just sh if your response processing is not so complex as to be painful in it.

        Makeshifts last the longest.

Re: Task distribution project Q
by perrin (Chancellor) on Sep 20, 2004 at 21:27 UTC
    If you're willing to install some software, Spread::Queue can help with this.
