running many small, similar scripts

by qq (Hermit)
on Jun 13, 2004 at 22:32 UTC ( [id://366367] )

qq has asked for the wisdom of the Perl Monks concerning the following question:

I've got about 20ish small scripts that need to be run. More are likely to be created.

They look like:

#!/usr/bin/perl

use Some::Module;

my $obj = Some::Module->new( %options, code => \&code );
$obj->do_some_work;

sub code { ... };   # not always

Mostly it's just the arguments to the constructor that differ, but occasionally a coderef is passed in.

So these scripts all need to get run one after another. I've got something like:

my @files = glob( 'run_me/*' );
while ( my $file = shift @files ) {
    my $cmd = "perl -Ilib $file >> log";
    print STDERR "$cmd\n";
    system( $cmd ) == 0 or die "something wrong: $?\n";
}

This is fine, except that I've got a feeling that it could be prettier.

Should I cat all the small scripts together, saving the cost of starting perl for each and of loading Some::Module every time? (They also each open a database connection and write to it, which could be done just once.) Or should I cut the small scripts down so they are just config files? Maybe yes, but I'm not sure I have the time left, or how to represent a couple of special cases.
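For the config-file route, something like this driver is what I have in mind (a rough sketch only; the .conf naming and the idea that each file's last expression is a hashref of constructor args are just one way it could work):

#!/usr/bin/perl
use strict;
use warnings;
use Some::Module;

# One perl startup and one compile of Some::Module for all the jobs.
# The database handle could likewise be opened once here and passed
# in through the constructor args, if Some::Module supports that.

for my $file ( glob('run_me/*.conf') ) {
    # each config file's last expression is a hashref of constructor args
    my $conf = do "./$file"
        or die "couldn't read config $file: " . ( $@ || $! ) . "\n";
    my $obj = Some::Module->new( %$conf );
    $obj->do_some_work;
}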

Is there something else I've overlooked?

Thanks, qq

Replies are listed 'Best First'.
Re: running many small, similar scripts
by graff (Chancellor) on Jun 13, 2004 at 23:54 UTC
    You're right, it's fine. If it works, I'd be disinclined to fix it. If there were a lot more than 20 or so scripts in the "run_me" directory, I might be inclined to save myself the overhead of launching that many sub-shells to run that many processes -- something like this is an easy patch (if you have a unix-like system):
    open SH, "| /bin/sh" or die "can't launch subshell: $!";
    for my $file ( @files ) {
        my $cmd = "perl -Ilib $file >> log\n";
        print STDERR $cmd;
        print SH $cmd;   # stderr from subshell will go to STDERR
    }
    close SH;
    update: Since all the processes are appending their stdout to the same log file, you could leave the ">> log" out of the sub-shell command lines, and run the main script like this (assuming a bourne-like shell that allows separate redirection of stdout and stderr):
    $ run-em-all.pl > log 2> errlog
    another update: Actually, the for loop shown above will tend to get its own STDERR output mixed up with any STDERR coming from the subshell. By default, output to SH will be buffered, and you're most likely to see all 20 or so command lines listed first, then, if any of them caused problems, their STDERR output will come afterwards. Just imposing autoflush on SH will not solve the problem -- in fact, it could make it worse, since both the subshell and the main perl script could end up writing to STDERR at the same time -- what a mess.

    Either don't echo the command lines to STDERR, or else log them this way (again, assuming a bourne-like shell):

    $cmd = "echo running $file 1>&2; perl -Ilib $file\n";
    print SH $cmd;
    This way, the subshell itself prints a message to stderr before running each little script file.

      ++graff. I always have trouble controlling stdout and stderr simultaneously.

      I've actually gone ahead and cut the scripts down to config files - using do to read them. So far this seems to work very well - I doubt the speed savings are really worthwhile, but it's satisfying to cut out the code duplication.
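      For what it's worth, one of the cut-down files now amounts to little more than the constructor arguments, roughly like this (the file name and keys here are invented; the odd special case can still carry its own coderef):

      # run_me/some_job.conf -- read with do; the last expression is the
      # hashref of constructor args
      my %args = (
          source  => 'some_job.csv',
          verbose => 1,
          code    => sub { ... },   # only the occasional special case needs this
      );
      \%args;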

      qq
