Re^3: Script exponentially slower as number of files to process increases

by marioroy (Prior)
on Jan 28, 2023 at 05:24 UTC


in reply to Re^2: Script exponentially slower as number of files to process increases
in thread Script exponentially slower as number of files to process increases

the results are very interesting...

The kikuchiyo.pl script, which involves no IPC, should come in first. It's a great solution. If memory is plentiful, why not spend a little memory to hold the input data before spawning workers?
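For illustration, here is a minimal sketch of that pattern (not the OP's script; the worker count and input location are made up): the parent reads and batches the file names before forking, so the children never need IPC.

#!/usr/bin/perl
use strict;
use warnings;

my $maxforks = 32;                      # assumed worker count
my @files    = glob("data/*.txt");      # assumed input location

# Round-robin the file names into one batch per worker, before forking.
my @batched_data;
push @{ $batched_data[ $_ % $maxforks ] }, $files[$_] for 0 .. $#files;

for my $worker_id (0 .. $maxforks - 1) {
    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    next if $pid;                       # parent: keep forking
    # Child: work only on its own pre-assigned batch, then exit.
    process_file($_) for @{ $batched_data[$worker_id] // [] };
    exit 0;
}
wait for 1 .. $maxforks;                # reap all children

sub process_file {
    my ($infile) = @_;
    # per-file work (e.g. the tr/// cleanup and counting) would go here
}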

The tests ran in a loop utilizing 8, 16, 32, etc up to 4096 threads...

It's interesting to see someone attempt 4,000+ workers for a use case that is somewhat CPU-bound, and impressive to watch the operating system cope with it. Note that the threads and MCE::Child solutions involve IPC: workers enter a critical section to decide who reads the next input item from the queue or channel.
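To illustrate that queue pattern (a sketch only, with made-up inputs and worker count): dequeueing the next file name is the serialized step, since only one consumer receives any given item.

use strict;
use warnings;
use threads;
use Thread::Queue;

# Seed the shared queue with the work items, then mark it finished.
my $queue = Thread::Queue->new( glob("data/*.txt") );
$queue->end;

my @workers = map {
    threads->create(sub {
        # dequeue() blocks until an item is available; after end() it
        # returns undef once the queue is drained.
        while ( defined( my $infile = $queue->dequeue ) ) {
            # per-file work would go here
        }
    });
} 1 .. 8;                               # assumed worker count

$_->join for @workers;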

On my system, the kikuchiyo.pl script exits early when running 512 workers. I modified the script to figure out why.

--- kikuchiyo1.pl  2023-01-27 23:31:34.592261566 -0600
+++ kikuchiyo2.pl  2023-01-27 23:31:12.488762580 -0600
@@ -92,12 +92,16 @@
 for my $worker_id (0..$maxforks-1) {
     if (my $pid = fork) {
         ++$forkcount;
+    } elsif (!defined $pid) {
+        warn "fork failed for worker_id $worker_id\n";
     } else {
         for my $i (0..$#{$batched_data[$worker_id]}) {
             my $infile = $batched_data[$worker_id][$i];
             my $subdir = $worker_id + 1;
-            open my $IN, '<', $infile or exit(0);
-            open my $OUT, '>', "$tempdir/$subdir/text-$i" or exit(0);
+            open my $IN, '<', $infile
+                or die "[$worker_id] open error: infile";
+            open my $OUT, '>', "$tempdir/$subdir/text-$i"
+                or die "[$worker_id] open error: outfile\n";
             while (<$IN>) {
                 tr/-!"#%&()*',.\/:;?@\[\\\]”_“{’}><^)(|/ /;  # no punct "
                 s/^/ /;

Possibly a ulimit -n issue; my open-files ulimit is 1024. Edit: It has nothing to do with ulimit, since the open-files limit is per process. Workers 256 and above exit early.
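For reference, the per-process limit can be inspected from Perl; this is a sketch assuming the BSD::Resource CPAN module is available (ulimit -n from the shell shows the same soft limit).

use strict;
use warnings;
use BSD::Resource qw(getrlimit RLIMIT_NOFILE);

# Soft and hard limits on open file descriptors for this process.
my ( $soft, $hard ) = getrlimit(RLIMIT_NOFILE);
print "open files: soft=$soft hard=$hard\n";   # e.g. soft=1024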

[256] open error: outfile
[257] open error: outfile
[258] open error: outfile
...
[511] open error: outfile

The threads and MCE::Child solutions pass with 512 workers. Again, the work here is regex-heavy, so the job is fairly CPU-bound. Does running more workers than the number of logical CPU cores actually improve performance? There are golden CPU samples out there.
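One way to answer that empirically is to sweep the worker count and time each run. A rough sketch follows; it assumes the script accepts the worker count as its first argument, which may not match the OP's harness.

use strict;
use warnings;
use Time::HiRes qw(time);

# Double the worker count each round, as in the tests described above.
for ( my $workers = 8; $workers <= 4096; $workers *= 2 ) {
    my $t0 = time();
    system( $^X, "kikuchiyo.pl", $workers ) == 0
        or warn "run with $workers workers failed\n";
    printf "%5d workers: %.1f s\n", $workers, time() - $t0;
}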

Re^4: Script exponentially slower as number of files to process increases
by xnous (Sexton) on Jan 28, 2023 at 11:28 UTC
    First of all, thank you for your explanations and the work you put into suggesting an alternative.

    Is running more workers than the number of logical CPU cores improving performance? There are golden CPU samples out there.

    I don't know what to say, other than to try to benignly bypass the PerlMonks filter and somehow show the graphs (if a janitor would be kind enough to URLify these, thanks). It does seem that in certain scenarios the first part of your statement holds true.

  • First test, https://i.imgur.com/CpclI9L.png - 3457 files
  • Second test, https://i.imgur.com/cDi41fC.png - 34570 files
  • Third test, https://i.imgur.com/yNZokCx.png - 345700 files; due to long run times, I omitted the clearly slower Thread::Queue solution.
  • Final test, https://i.imgur.com/2NVovHx.png - same load as in #3, but only for fork(), which proved to be the fastest, across the 512-4096 worker range in 128-step increments, trying to find the sweet spot.
  • You need to copy/paste the links by hand, but it's worth the trouble. The gain of fork() from 256 to 512 processes is almost unbelievable, while the performance of the other implementations is practically linear.

    EDIT: But of course it is, it's due to workers exiting early.

    I also tested your updated script but it showed no tangible improvement on my setup.

      Certainly, there's an anomaly. Do you know what causes the beautiful script to suddenly leap from 110 seconds down to 50 seconds? Unfortunately, half of the workers exit due to an open-file error, and they go unnoticed because no warning messages are emitted.

      Applying the changes here may enlighten you as to why. Another check is to run ls -l data.dat: is the file size smaller than expected?

        I would also suggest checking the operating system logs. Depending on the OP's setup, the system (or some security software) may throttle or slow down the forking of new processes if it thinks something strange is going on (similar to how init may prevent daemons from restarting too often in a given timeframe).

        You could run the script with strace -o /tmp/trace -ff -e trace=%file script.pl to see why the file opens fail.
