confusing fork/readline behaviour

flipper has asked for the wisdom of the Perl Monks concerning the following question:

Re: confusing fork/readline behaviour by afoken (Chancellor) on Aug 13, 2015 at 17:49 UTC
"Can someone explain what is going on please?"

Both processes share the same ~~filehandle~~ file descriptor, plus the libc usually reads ahead. If /etc/passwd is small enough, the libc slurps the entire file in one process during the first getline(), leaving nothing for the other process.

Quoting from the Linux man page of fork(2):

"The child inherits copies of the parent's set of open file descriptors. Each file descriptor in the child refers to the same open file description (see open(2)) as the corresponding file descriptor in the parent. This means that the two descriptors share open file status flags, current file offset, and signal-driven I/O attributes (see the description of F_SETOWN and F_SETSIG in fcntl(2))."

Alexander
--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
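The shared file offset can be observed directly. The following is a minimal sketch, not the original poster's script: it assumes a POSIX-like system, uses an anonymous temporary file instead of /etc/passwd, and the helper name `parent_offset_after_child_read` is made up for illustration. It uses sysread/sysseek so that no buffering gets in the way.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET SEEK_CUR);
use POSIX ();

# Hypothetical helper: let a forked child consume $nbytes from a handle
# opened before the fork, then report the offset the parent sees.
sub parent_offset_after_child_read {
    my ($nbytes) = @_;

    # Anonymous temporary file, so the sketch does not depend on /etc/passwd.
    open(my $fh, '+>', undef) or die "cannot create temp file: $!";
    syswrite($fh, "line $_\n") for 1 .. 5;    # 5 lines of 7 bytes, unbuffered
    sysseek($fh, 0, SEEK_SET) or die "sysseek failed: $!";

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: unbuffered read, so exactly $nbytes are consumed.
        sysread($fh, my $buf, $nbytes);
        POSIX::_exit(0);    # leave without running END blocks or flushing
    }

    waitpid($pid, 0);

    # The parent has not read anything itself, yet its offset has moved.
    return sysseek($fh, 0, SEEK_CUR) + 0;
}

print "parent offset after child read: ",
    parent_offset_after_child_read(10), "\n";
```

With a 10-byte read in the child, the parent should report offset 10, because both descriptors refer to the same open file description.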
Re^2: confusing fork/readline behaviour by flipper (Beadle) on Aug 13, 2015 at 18:52 UTC
Ah, that makes sense, thanks. Presumably my script above is unsafe in the general case - I imagine libc could read ahead 4k and not finish on a line boundary, then the second process would call readline and start halfway through a line...
Re^3: confusing fork/readline behaviour by afoken (Chancellor) on Aug 14, 2015 at 18:22 UTC
"Presumably my script above is unsafe in the general case - I imagine libc could readahead 4k and not finish on a line boundary, then the second process would call readline and start halfway through a line."

Correct.

Why do you want two processes to read the same file?

Alexander
--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
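The read-ahead can be made visible in a single process by comparing the position Perl reports with the position of the underlying descriptor. This is a rough sketch under assumptions: the helper name `offsets_after_first_line` is hypothetical, the file is an anonymous temporary one, and the buffering shown is Perl's own (a normal perl build buffers in its PerlIO layer rather than in libc's stdio, though the effect described above is the same).

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET SEEK_CUR);

# Hypothetical helper: read one line through the buffered layer and return
# (logical position, position of the underlying file descriptor).
sub offsets_after_first_line {
    open(my $fh, '+>', undef) or die "cannot create temp file: $!";
    print {$fh} "record $_\n" for 1000 .. 2999;    # 2000 lines of 12 bytes
    seek($fh, 0, SEEK_SET) or die "seek failed: $!";    # also flushes the writes

    my $line = readline($fh);

    my $logical  = tell($fh);                        # where Perl says we are
    my $physical = sysseek($fh, 0, SEEK_CUR) + 0;    # where the descriptor is
    return ($logical, $physical);
}

my ($logical, $physical) = offsets_after_first_line();
print "logical offset:  $logical\n";
print "physical offset: $physical\n";
```

The logical offset is the length of the first line, while the physical offset is typically a buffer-sized value that need not fall on a line boundary. A second process sharing the descriptor would start reading from the physical offset.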
Re: confusing fork/readline behaviour by KurtSchwind (Chaplain) on Aug 13, 2015 at 18:19 UTC
<quote>as they don't share a file pointer</quote>

Actually, for reading they do share the same file descriptor. Try moving that open after the fork().

`perl -e ' fork(); open X, "</etc/passwd"; while (readline(X)){ print "$$: $_"; sleep 1 } print "$$: all done\n" '`

--
“For the Present is the point at which time touches eternity.” - CS Lewis
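To check that opening after the fork() really gives each process its own view of the file, here is a small sketch. The helper name `count_lines_in_both` and the temporary file are assumptions for illustration only; the child reports its line count through its exit status, which limits the sketch to files of at most 255 lines.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use POSIX ();

# Hypothetical helper: fork first, then open $path separately in each
# process and count its lines. Returns (parent count, child count).
sub count_lines_in_both {
    my ($path) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    # Opened after the fork: each process gets its own open file
    # description, and therefore its own file offset.
    open(my $fh, '<', $path) or die "cannot open $path: $!";
    my $count = 0;
    while (defined(my $line = readline($fh))) {
        $count++;
    }
    close $fh;

    # Child: report the count through the exit status and leave.
    POSIX::_exit($count) if $pid == 0;

    waitpid($pid, 0);
    return ($count, $? >> 8);
}

my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "line $_\n" for 1 .. 5;
close $tmp or die "cannot close $path: $!";

my ($parent, $child) = count_lines_in_both($path);
print "parent read $parent lines, child read $child lines\n";
```

Both counts should equal the number of lines in the file, in contrast to the open-before-fork case where the two processes compete for the same offset.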
Re: confusing fork/readline behaviour by anonymized user 468275 (Curate) on Aug 14, 2015 at 11:39 UTC
(Updated:) In addition, parents should wait for children before exiting, otherwise children ~~either get killed or~~ usually stop running and become zombie processes.

`perl -e ' my $pid = fork(); open X, "</etc/passwd"; while (readline(X)){ print "$$: $_"; sleep 1 } print "$$: all done\n"; waitpid $pid, 0 if $pid; '`

One world, one people
Re^2: confusing fork/readline behaviour (wait) by tye (Sage) on Aug 14, 2015 at 14:18 UTC
"In addition, parents should wait for children before exiting"

Yes, that is a good general practice. The most common problem of not doing that is that the shell that launched the command waits for the parent to exit, then displays the next prompt, then the child outputs a bit more, making a confusing display. Things get worse if a child might be reading from the same STDIN.

And even if the parent process wasn't launched as a command from an interactive shell, it is often good to not have the parent exit before the children; for example, you often don't want to restart some service or daemon when some children from the last instance are still hanging around.

"otherwise children either get killed or stop running and become zombie processes"

But neither of those is a valid justification for that practice.

The parent exiting doesn't kill the child. The closest thing to that is that the login process exiting will send SIGHUP to all processes that share that controlling tty. (And your Perl script is pretty darn unlikely to be a login process.)

When a child process exits, it becomes a zombie process until its parent waits for it, or until the parent process exits (because then the child gets inherited by process 1, which scrupulously wait()s for any expired children). So preventing zombies is a good reason to wait() for children, but the one time that it doesn't matter is right before the parent process exits.

- tye
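The lifetime of an exited child's process table entry can be sketched as follows. This is an illustration under assumptions, not a general recipe: the helper name `reap_child` is made up, SIGCHLD is assumed to have its default disposition (not ignored), and `kill 0` is used only to ask whether the process ID still exists.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: fork a child that exits at once with $exit_code,
# then reap it and report what the parent observed.
sub reap_child {
    my ($exit_code) = @_;

    # Assumption: default SIGCHLD handling, so the child is not auto-reaped.
    local $SIG{CHLD} = 'DEFAULT';

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    POSIX::_exit($exit_code) if $pid == 0;

    # Until the parent waits, the child's entry stays in the process
    # table (it is either still exiting or already a zombie).
    my $exists_before = kill(0, $pid) ? 1 : 0;

    my $reaped = waitpid($pid, 0);    # collects the status, frees the entry
    my $status = $? >> 8;

    my $exists_after = kill(0, $pid) ? 1 : 0;

    return {
        reaped_own_child   => ($reaped == $pid ? 1 : 0),
        status             => $status,
        exists_before_wait => $exists_before,
        exists_after_wait  => $exists_after,
    };
}

my $r = reap_child(3);
print "exit status:        $r->{status}\n";
print "exists before wait: $r->{exists_before_wait}\n";
print "exists after wait:  $r->{exists_after_wait}\n";
```

If the parent exited instead of calling waitpid, the child would be inherited by process 1 and reaped there, which is the case described above where waiting makes no difference to zombies.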
Re^3: confusing fork/readline behaviour (wait) by anonymized user 468275 (Curate) on Aug 14, 2015 at 18:55 UTC
Yes, I agree - the kill case is rather exceptional, so for clarity I'll strike through it. The case where the child doesn't zombify because the parent exits immediately is also exceptional, as you say.

One world, one people
Re^2: confusing fork/readline behaviour by roboticus (Chancellor) on Aug 14, 2015 at 14:29 UTC
Not completely true. Using fork is a natural way to run processes in the background.

    $ cat forkit.pl
    #!/usr/bin/env perl
    use strict;
    use warnings;

    my $pid = fork();
    if ($pid) {
        print "Parent is exiting now!\n";
    }
    else {
        print "Child is waiting a bit\n";
        sleep 15;
        print "Child is done!\n";
    }
    $ perl forkit.pl
    Parent is exiting now!
    Child is waiting a bit
    $ date
    Fri Aug 14 10:22:06 EDT 2015
    $ date
    Fri Aug 14 10:22:15 EDT 2015
    $ Child is done!

There are differences on different operating systems, to be sure. Some processes could be killed on some operating systems (though I've not experienced it myself). You can accumulate zombie processes if you don't take steps to avoid it. If you're going to use fork, you need to educate yourself on what it does and doesn't do on your platform.

...roboticus

When your only tool is a hammer, all problems look like your thumb.
Re^3: confusing fork/readline behaviour by anonymized user 468275 (Curate) on Aug 14, 2015 at 19:01 UTC
Yes, that's true. Although it is more common that the parent has more work to do, there is also a well-known way to create a daemon: the main program forks and exits immediately, leaving the child running detached.

One world, one people

