PerlMonks  

fork()ing a large process

by Jonathan (Curate)
on Nov 29, 2001 at 21:42 UTC ( [id://128415] )

Jonathan has asked for the wisdom of the Perl Monks concerning the following question:

I have a process that polls for files and forks a child process to handle each file as it arrives.
The problem is that sometimes it waits for the child to exit before moving on to the next. This seems to happen when processing the larger files.

Below is the bare bones of the processing with the error checking removed.

What am I doing wrong? Would it be better to do a double fork()?
while (1) {
    # Loop other processing here.
    load_file($this_file);
    reap_children();
}

sub load_file() {
    my $childProcess;
    $childProcess = fork();
    unless ($childProcess) {
        # Child process so lets exec the loader.
        exec("loader $this_file");
        exit 0;
    }
}

sub reap_children() {
    my $kid = 1;
    while ($kid > 0) {
        $kid = waitpid(-1, 1);
    }
}
I'm running
perl -V
Summary of my perl5 (5.0 patchlevel 5 subversion 3) configuration:
  Platform:
    osname=solaris, osvers=2.6, archname=sun4-solaris
    uname='sunos lonxpr1732 5.6 generic_105181-10 sun4m sparc sunw,sparcstation-5 '
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef useperlio=undef d_sfio=undef
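For reference, a "double fork()" usually means having the child fork a grandchild to do the real work and then exit immediately: the parent reaps only the short-lived first child, and the grandchild is adopted by init, so it can never become a zombie. A minimal sketch of that idiom, assuming the same hypothetical "loader $this_file" command used in the code above:

# Double-fork sketch (assumes the same "loader" command as above).
# The parent reaps only the short-lived first child; the grandchild doing
# the real work is reparented to init, so it cannot become a zombie.
sub load_file_double_fork {
    my ($file) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {                    # first child
        my $grandchild = fork();
        die "fork failed: $!" unless defined $grandchild;
        if ($grandchild == 0) {         # grandchild does the real work
            exec("loader", $file) or die "exec failed: $!";
        }
        exit 0;                         # first child exits immediately
    }

    waitpid($pid, 0);                   # parent reaps the first child right away
}

The trade-off is that the parent can no longer see the loader's exit status, since the grandchild is not its direct child.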

Replies are listed 'Best First'.
Re: fork()ing a large process
by maverick (Curate) on Nov 29, 2001 at 22:04 UTC
    Actually, from your code it seems that the parent process will ALWAYS wait for the child process to exit before going to the next file. Your main while loop forks a child (load_file) and then waits for it to exit (reap_children) before going to the next iteration of the while loop.

    Try this instead (untested)

    use strict;    # I'm hinting that this should be part of the (error checking you took out)

    my @pids;
    my $this_file;    # set by the polling/other processing that was omitted
    while (1) {
        # Loop other processing here.
        push(@pids, load_file($this_file));
    }
    reap_children(@pids);

    sub load_file {
        my $childProcess;
        $childProcess = fork();
        unless ($childProcess) {
            # Child process so let's exec the loader.
            exec("loader $this_file");
            exit 0;
        }
        return $childProcess;
    }

    sub reap_children {
        foreach (@_) {
            waitpid($_, 0);    # wait() takes no arguments; waitpid blocks until this pid exits
        }
    }
    spawn all the children, keeping track of their pids as you spawn them... then just wait in turn for all the children to die at the end.

    HTH

    Update: I shouldn't post before being caffeinated... the waitpid shouldn't block, as tye pointed out to me. So I'm not quite sure where it's going wrong... maybe a Solaris issue? I've used the above technique before and not had any problems...

    /\/\averick
    perl -l -e "eval pack('h*','072796e6470272f2c5f2c5166756279636b672');"

      Thanks for the response, maverick. I know waitpid shouldn't block, which is why I included my OS and Perl build details. As for use strict: yes, it and 1000 other lines were removed.
Re: fork()ing a large process
by dws (Chancellor) on Nov 29, 2001 at 22:20 UTC
    I have a process that polls for files and forks a child process to handle each file as it arrives. The problem is that sometimes it waits for the child to exit before moving on to the next. This seems to happen when processing the larger files.

    For the proper way to reap children, consult perlman:perlipc. The section on "Signals" is near the top of the document, and describes how to set up a SIGCHLD handler to correctly account for the death of a child process.

    If you don't handle SIGCHLD correctly, you could be getting blocked on wait().

      Actually, using a signal handler to reap children is problematic in Perl because Perl signal handlers aren't 100% reliable. Each time one is triggered, you have something like a 2% chance of corrupting Perl's internal structures. I think this may be fixed in an upcoming version of Perl.

      The safest way to reap children is shown under "perldoc -f waitpid" and closely matches the code in question. Perhaps Solaris doesn't define WNOHANG as 1? That seems unlikely, but that is all I've come up with so far.
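      One quick way to check that, and to avoid depending on the literal value at all, is to use the POSIX constant instead of the magic number; a minimal sketch:

      use POSIX ":sys_wait_h";

      print "WNOHANG on this system is ", WNOHANG, "\n";   # almost always 1

      # Same non-blocking reap loop as the code in question, but with
      # the constant instead of the literal 1.
      my $kid = 1;
      while ($kid > 0) {
          $kid = waitpid(-1, WNOHANG);
      }

      From the shell, perl -MPOSIX=:sys_wait_h -le 'print WNOHANG' shows the value directly.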

              - tye (but my friends call me "Tye")
        Thanks for looking at it tye. I hoped it might have been something to do with my Perl version. I'm stuck.
Re: fork()ing a large process
by traveler (Parson) on Nov 29, 2001 at 22:39 UTC
    I'm not sure what reap_children is supposed to do. What it does is reap any dead children. Your args to waitpid say "Check to see if any processes are dead, but don't hang this process waiting." You also don't seem to be doing anything with $kid since if more than one process dies, you ignore the pid of the first one. Follow maverick's advice if you care when they are all dead. If you are just trying to avoid zombies, do  $SIG{CHLD} = sub {wait}; It really depends on what you want to do.

    Now, why does your process appear to wait for some children to die? Well, one possibility is that you have exceeded the maximum number of children your OS allows. That is, if you have a lot of files to process, you may have forked as many children as you are allowed to fork. To figure that out you can keep track of the number of live children and see if that number gets big. The limit on my system is 511.
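    One way to check that theory is to have the parent keep a count of live children and print it as it forks; a rough sketch (the loader command comes from the question, the rest is illustrative):

    use POSIX ":sys_wait_h";

    my $live_children = 0;

    sub load_file_counted {
        my ($file) = @_;
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            exec("loader", $file) or die "exec failed: $!";
        }
        $live_children++;
        warn "live children: $live_children\n";   # watch whether this climbs toward the OS limit
    }

    sub reap_children_counted {
        while ((my $kid = waitpid(-1, WNOHANG)) > 0) {
            $live_children--;
        }
    }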

    BTW, you can avoid the magic numbers in waitpid if you use POSIX ":sys_wait_h";.

    HTH, --traveler

    Update: I'd never heard about the 2% chance of corruption. I've always followed the Camel's advice about using the signal handler; I guess I've been lucky. Maybe one choice is to do a periodic wait for children using code similar to reap_children. You might also try setting $SIG{CHLD} to SIG_IGN.

      Note that $SIG{CHLD}= sub {wait}; has another problem. If two children exit at nearly the same time, the two SIGCHLD signals can arrive close enough together that the signal handler only gets called once. Each time this happens, you'd get one more zombie hanging around.

      Back to the original problem, I'd add some debug print statements so that you can figure out exactly where the process is hanging.

      Also, $SIG{CHLD}= 'IGNORE'; only works on some operating systems (SysV-based ones, as I recall).

              - tye (but my friends call me "Tye")
        You are correct that the wait should be in a loop such as the reap_children function has. I was clearly not thinking straight... Also, on SysV systems, IIRC, CHLD or CLD signals are regenerated if you do not wait on the child, so the race condition you mention may not hold true there and wait by itself should work. I suppose I've been using SysV-based systems too much to think of some of these issues. My bad.

        tye is correct. To be portable you'll need to find another solution. I have used non-blocking waits in a timer loop to clean up children and that may work here. Of course, if you don't care about portability, use what works on your system and document that it isn't portable.
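        A rough sketch of that timer-loop approach, reaping with a non-blocking waitpid on each pass rather than from a signal handler (the sleep interval is just an example):

        use POSIX ":sys_wait_h";

        while (1) {
            # ... fork children for any new work here ...

            # Periodic, non-blocking cleanup: reap whatever has exited,
            # but never hang waiting on a child that is still running.
            while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
                # child $pid finished; log or record it if needed
            }

            sleep 1;    # check again on the next pass through the loop
        }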

Re: fork()ing a large process
by Fastolfe (Vicar) on Nov 30, 2001 at 00:28 UTC
Re: fork()ing a large process
by Dogma (Pilgrim) on Nov 30, 2001 at 14:17 UTC
    If you really don't care about the process returning, just set $SIG{CHLD} = 'IGNORE' and ignore the issue of reaping altogether. The other option is to do a non-blocking call to waitpid.
    use POSIX qw(:sys_wait_h);

    sub REAPER {
        while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
            # do something with $pid;
        }
        $SIG{CHLD} = \&REAPER;    # reinstall the handler (needed on some systems)
    }
    $SIG{CHLD} = \&REAPER;
    Then you don't have to worry about your code blocking while waiting for children to return. However, this doesn't mean you should do something like this:
    sub fork_bomb {
        while (1) {
            fork_a_kid();
        }
    }
    Speaking as a system administrator, this won't earn you a lot of good sysadmin karma. It is OK to fork many children; just remember to limit the number of children on the system at any one time. Please read perlipc and perlfork; they helped me out big time.
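    A rough sketch of one way to cap the number of concurrent children, blocking in waitpid only once the cap is reached (the $MAX_KIDS value and the loader command are assumptions for illustration, not anything from the original code):

    use POSIX ":sys_wait_h";

    my $MAX_KIDS = 5;    # assumed cap; tune for your system
    my %kids;            # pid => file being processed

    sub spawn_loader {
        my ($file) = @_;

        # At the cap? Block until at least one child exits.
        if (keys(%kids) >= $MAX_KIDS) {
            my $done = waitpid(-1, 0);
            delete $kids{$done} if $done > 0;
        }

        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            exec("loader", $file) or die "exec failed: $!";
        }
        $kids{$pid} = $file;

        # Opportunistically reap anything else that has already exited.
        while ((my $done = waitpid(-1, WNOHANG)) > 0) {
            delete $kids{$done};
        }
    }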
