Re: Influencing control flow using a signal handler
by afoken (Canon) on Nov 03, 2016 at 07:42 UTC
I don't see a handler for SIGCHLD in your code. Unix sends the parent process a SIGCHLD when a child process has exited, no matter how (clean exit, crash, unhandled signal). The parent process should call wait or waitpid to get the exit reason (clean, crash, signal), the exit code, and the PID of the exited process; this is commonly called reaping. See perlipc for details.
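To illustrate what reaping gives you, here is a minimal sketch that forks one child, waits for it, and decodes $? as described in perlvar. The helper name describe_status and the exit code 3 are made up for the example.

```perl
use strict;
use warnings;

# Decode a wait status as found in $? after wait or waitpid.
# describe_status is a hypothetical helper name.
sub describe_status {
    my ($status) = @_;
    my $signal = $status & 127;   # non-zero if the child died from a signal
    my $core   = $status & 128;   # set if a core dump was written
    my $code   = $status >> 8;    # exit code after a normal exit
    return $signal
        ? "killed by signal $signal" . ($core ? ' (core dumped)' : '')
        : "exited with code $code";
}

my $pid = fork();
die "fork: $!" unless defined $pid;
if ($pid == 0) {
    exit 3;                       # child: exit with a known code
}

my $reaped = waitpid($pid, 0);    # parent: block until this child has exited
my $report = describe_status($?);
print "child $reaped $report\n";
```

The same decoding applies to the status you get from inside a SIGCHLD handler.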
The usual way to handle several children is to keep track of all forked, unreaped child processes in a hash using PIDs as keys. The hash values are not relevant and can be used for application-specific purposes. When you fork a new process, store its PID in the hash. A handler for SIGCHLD calls waitpid(-1, WNOHANG) in a loop until it returns a non-positive value. A positive return value is the PID of an exited child process, and its exit information is available in $?. Remove that PID from the hash, and maybe store the exit information elsewhere.
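A rough sketch of that bookkeeping, not a drop-in solution: the names %children, %exit_info and spawn are my own, and the three children that just exit with codes 0, 1 and 2 are only there to have something to reap. The extra delete in spawn covers the case where a child exits before the parent has stored its PID.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

my %children;    # PID => start time (the value is free for your own use)
my %exit_info;   # PID => raw wait status, kept for later inspection

$SIG{CHLD} = sub {
    local ($!, $?);   # do not clobber the main program's $! and $?
    # One SIGCHLD may stand for several exited children, so loop.
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $exit_info{$pid} = $?;
        delete $children{$pid};
    }
};

# spawn is a hypothetical helper: fork, run $code in the child,
# and use its return value as the child's exit code.
sub spawn {
    my ($code) = @_;
    my $pid = fork();
    die "fork: $!" unless defined $pid;
    if ($pid == 0) {
        $SIG{CHLD} = 'DEFAULT';
        exit $code->();
    }
    $children{$pid} = time;
    # The child may already have been reaped before the line above ran.
    delete $children{$pid} if exists $exit_info{$pid};
    return $pid;
}

my @pids = map { my $n = $_; spawn(sub { $n }) } 0 .. 2;

sleep 1 while %children;   # any signal interrupts sleep

printf "child %d: exit code %d\n", $_, $exit_info{$_} >> 8 for @pids;
```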
To keep a fixed number of child processes running, make the parent process count the elements in the hash, and fork new child processes until the desired number has been reached. If there is nothing else to do, sleep for a long time. Any signal will interrupt sleep, so when sleep returns, either a long time has passed or at least one child process has exited. Alternatively, skip the handler, call wait, and remove the returned PID from the hash.
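Here is a sketch of the simpler variant that uses a blocking wait instead of sleep plus a handler. The numbers ($WANTED, $MAX_TOTAL) and the 0.1 second pause that stands in for real work are assumptions for the demo; a real program would probably loop forever instead of stopping after a fixed number of forks.

```perl
use strict;
use warnings;

my $WANTED    = 3;   # desired number of concurrent children (assumed value)
my $MAX_TOTAL = 6;   # demo only: stop forking after this many children

my %children;        # PID => 1 for every forked, unreaped child
my ($started, $reaped, $max_seen) = (0, 0, 0);

while ($started < $MAX_TOTAL or %children) {
    # Top up until the desired number of children is running.
    while (keys(%children) < $WANTED and $started < $MAX_TOTAL) {
        my $pid = fork();
        die "fork: $!" unless defined $pid;
        if ($pid == 0) {
            select(undef, undef, undef, 0.1);   # stand-in for real work
            exit 0;
        }
        $children{$pid} = 1;
        $started++;
    }
    $max_seen = keys %children if keys(%children) > $max_seen;

    # Block until any child has exited, then forget about it.
    my $pid = wait();
    last if $pid == -1;                         # no child processes left
    delete $children{$pid};
    $reaped++;
}

print "started $started, reaped $reaped, at most $max_seen at a time\n";
```

Because nothing happens asynchronously here, there is no race between fork and the hash update.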
To kill all children when one child has exited (why?), do something very similar: Change the SIGCHLD handler so that it first removes all exited processes from the hash, then - still from within the handler - kill all remaining processes from the hash. This will cause several new SIGCHLDs, so after some time, the hash will be empty. Then, and only then, restart a new set of child processes from the main program.
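A sketch of such a handler, under the assumption that SIGTERM is enough to stop the children. The helper start_child, the hash %status and the sleep times are invented for the demo: two children would run for a minute, a third one exits after about a second and takes the others with it.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

my %children;   # PID => 1 for every forked, unreaped child
my %status;     # PID => raw wait status, kept for later inspection

$SIG{CHLD} = sub {
    local ($!, $?);
    # First remove every child that has already exited ...
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $status{$pid} = $?;
        delete $children{$pid};
    }
    # ... then ask all survivors to terminate. Their exits cause further
    # SIGCHLDs, which bring us back here until the hash is empty.
    kill TERM => keys %children if %children;
};

# start_child is a hypothetical helper; sleeping stands in for real work.
sub start_child {
    my ($seconds) = @_;
    my $pid = fork();
    die "fork: $!" unless defined $pid;
    if ($pid == 0) {
        $SIG{CHLD} = 'DEFAULT';
        sleep $seconds;
        exit 0;
    }
    $children{$pid} = 1;
    delete $children{$pid} if exists $status{$pid};   # already reaped
    return $pid;
}

my @long  = map { start_child(60) } 1 .. 2;   # would run for a minute
my $short = start_child(1);                   # exits first

sleep 1 while %children;   # any signal interrupts sleep

# Only now, with an empty hash, would the main program start a new set.
print "all children are gone\n";
```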
1. Note that Unix does not send the parent process one SIGCHLD per exited child process. SIGCHLD is sent if at least one child process has exited. In plain English: SIGCHLD does not mean "a child process has exited", it means "at least one of the child processes has exited". That's why you need a loop around waitpid in the SIGCHLD handler.
2. die( "waitpid: $!" ) unless $! eq '' looks wrong. Comparing $! a.k.a. $ERRNO to the empty string is suspicious, I would use $! in boolean context instead. And
3. kill( 15, $pid ) has a magic number. Yes, any sane Unix assigns SIGTERM to the ID 15. But it's hard to remember. Linux has 64 signals. Why should I learn all those numbers by heart? Just use the signal name: kill(TERM => $pid) or kill('TERM',$pid).
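To make notes 2 and 3 concrete, here is a small sketch, not taken from the original code. It signals a child by name, checks the return value of waitpid before looking at $!, and translates the signal number back to a name via %Config as shown in perlipc. The second waitpid call is there only to provoke a failure.

```perl
use strict;
use warnings;
use Config;

my @sig_names = split ' ', $Config{sig_name};   # index = signal number

my $pid = fork();
die "fork: $!" unless defined $pid;
if ($pid == 0) {
    sleep 60;          # child: wait to be terminated
    exit 0;
}

# Signal by name, no magic number needed.
kill(TERM => $pid) or die "kill: $!";

# $! is only meaningful after a call has reported a failure,
# so test the return value first.
my $reaped = waitpid($pid, 0);
die "waitpid: $!" if $reaped == -1;
my $signal = $? & 127;
print "child $reaped was killed by SIG$sig_names[$signal]\n";

# Waiting for the same child again must fail.
my $again  = waitpid($pid, 0);
my $failed = $! ? 1 : 0;                        # $! in boolean context
print "second waitpid returned $again: $!\n" if $failed;
```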
Fixed a typo in 2. - thanks, kcott
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)