PerlMonks  

Re^2: Script exponentially slower as number of files to process increases

by xnous (Sexton)
on Jan 26, 2023 at 09:48 UTC ( [id://11149890] )


in reply to Re: Script exponentially slower as number of files to process increases
in thread Script exponentially slower as number of files to process increases

No, it doesn't. It actually makes no difference whether that wait() is there or not. I'm getting the same results with my version, yours or no $forkcount at all. It's very weird.

Replies are listed 'Best First'.
Re^3: Script exponentially slower as number of files to process increases
by tybalt89 (Monsignor) on Jan 26, 2023 at 10:05 UTC

    Is there a fork limit on your system? You are not checking for fork failures...

      You are not checking for fork failures.

      Worse: xnous does check for fork failures, but way too late:

      if (my $pid = fork) { # $pid defined and !=0 -->parent
          ++$forkcount;
      } else { # $pid==0 -->child
          open my $IN, '<', $infile or exit(0);
          open my $OUT, '>', "$tempdir/$subdir/text-$i" or exit(0);
          while (<$IN>) {
              tr/-!"#%&()*',.\/:;?@\[\\\]_{}><^)(|/ /; # no punct "
              s/^/ /;
              s/\n/ \n/;
              s/[[:digit:]]{1,12}//g;
              s/w(as|ere)/be/gi;
              s{$re2}{ $prefix{lc $1} }g; # prefix
              s{$re3}{ $substring{lc $1} }g; # part
              s{$re1}{ $whole{lc $1} }g; # whole
              print $OUT "$_";
          }
          close $OUT;
          close $IN;
          defined $pid and exit(0); # $pid==0 -->child, must exit itself
      }

      If fork() fails, $pid is undef, which is false. So perl will enter the else block, do everything that a child process does, but in the parent process. During that time, the entire child process management (i.e. $forkcount and wait/waitpid) does not happen. The check for failed fork() vs. real child (defined $pid) happens after the child code has run in the parent process. And it lacks any diagnostics.
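The three possible outcomes can be made explicit without forking at all. This is only an illustrative sketch: `fork_outcome` and `buggy_branch` are hypothetical helper names, not part of the original script. The first tests definedness before truthiness; the second mirrors the plain if/else test from the quoted code.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a value as returned by fork().
sub fork_outcome {
    my ($pid) = @_;
    return 'failed' unless defined $pid;  # undef: fork() failed, we are still the parent
    return $pid ? 'parent' : 'child';     # 0 in the child, the child's PID in the parent
}

# Mirrors the truthiness-only test of the quoted code:
# undef and 0 are both false, so both end up in the "child" branch.
sub buggy_branch {
    my ($pid) = @_;
    return $pid ? 'parent' : 'child';
}

for my $pid (undef, 0, 12345) {
    printf "%-6s checked: %-6s  if/else only: %s\n",
        defined $pid ? $pid : 'undef', fork_outcome($pid), buggy_branch($pid);
}
```

The undef case is the one that matters here: the truthiness-only test reports "child" although no child process exists.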

      When I use fork(), I usually write forking code like this:

      my $pid=fork() // die "Can't fork: $!";
      if ($pid) {
          # parent code
      } else {
          # child code
      }

      Before Perl had the defined-or operator //, I used the following two lines instead of the first one.

      my $pid=fork();
      defined($pid) or die "Can't fork: $!";
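Put together with a limit on concurrent children, that pattern might look roughly like the following. This is a sketch under assumptions, not the code from the thread: `run_jobs`, `$max` and `$work` are hypothetical names, and the per-file processing is replaced by a callback. It needs Perl 5.10 or later for `//`.

```perl
use strict;
use warnings;

# Hypothetical driver: run $work->($job) in one child per job,
# with at most $max children alive at the same time.
# Returns the number of children that exited with a non-zero status.
sub run_jobs {
    my ($max, $work, @jobs) = @_;
    my ($running, $failed) = (0, 0);
    for my $job (@jobs) {
        if ($running >= $max) {            # at the limit: reap one child first
            my $done = wait();
            $failed++ if $done > 0 && $? != 0;
            $running--;
        }
        my $pid = fork() // die "Can't fork: $!";  # fail loudly, before any work is done
        if ($pid) {                        # parent: only bookkeeping
            $running++;
        } else {                           # child: do the work, then always exit
            $work->($job);
            exit(0);
        }
    }
    while ((my $done = wait()) > 0) {      # reap the remaining children
        $failed++ if $? != 0;
    }
    return $failed;
}

# Usage: ten dummy jobs, four at a time; job 7 pretends to fail.
my $failed = run_jobs(4, sub { my ($n) = @_; exit(1) if $n == 7 }, 1 .. 10);
print "children with non-zero exit status: $failed\n";
```

Because the fork check comes first, a failed fork() can never run the child branch inside the parent, and the child branch exits unconditionally.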

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re^3: Script exponentially slower as number of files to process increases
by tybalt89 (Monsignor) on Jan 26, 2023 at 09:53 UTC

    What's your ulimit -u for max user processes?

      ulimit -u returns 128011