Re^2: MCE segmentation fault

in reply to Re: MCE segmentation fault
in thread MCE segmentation fault

Ah... I see about 4.5 busy cores when running serially, meaning that Imager itself is using multiple cores behind the scenes. Well then, let's capture the compute time using Time::HiRes and increase from 9,999 to 99,999 iterations. I also captured the compute time on a 32-core AMD 3970X processor with SMT disabled to better understand the benefit of chunking.

Here is the updated chunking demonstration to capture the compute time.

#!/usr/bin/perl

use strict;
use warnings;
use Imager;
use MCE::Loop;
use Time::HiRes 'time';

STDOUT->autoflush;

my $start = time;
my $count = 0;
my @data;

MCE::Loop->init(
  max_workers => MCE::Util::get_ncpu(),
  chunk_size  => 100,
  init_relay  => '',
  gather      => sub {
    if (@_ == 1) {
      print "\r", $count++;
    } else {
      push @data, @{ $_[1] };
    }
  }
);

mce_loop {
  my ($mce, $chunk_ref, $chunk_id) = @_;
  my @i_data;

  for my $x (@{ $chunk_ref }) {
    my $i = Imager->new(xsize => 120, ysize => 50)
      or die Imager->errstr;

    $i->string(
      text  => $x,
      color => Imager::Color->new('ffffff'),
      font  => Imager::Font->new(
        file  => '/usr/share/fonts/truetype/msttcorefonts/cour.ttf',
      # file  => '/System/Library/Fonts/Courier.dfont',
      # face  => 'Courier New', # mswin
        size  => 42,
        aa    => 1),
      x     => 5,
      y     => 35
    );

    # Imager objects cannot be serialized; attempting to will crash.
    # Instead, write the image to a scalar and send that.
    # The manager process later reads the images back from the scalars.

    $i->write(data => \my $data, type => 'gif');
    push @i_data, $data;

    MCE->gather($x);
  }

  MCE::relay { MCE->gather($chunk_id, \@i_data) };

} [ 0 .. 99_999 ];

MCE::Loop->finish;

print " frame GIF done!\n";
printf "compute time: %0.3fs\n", time - $start;

Imager->write_multi({
  file => 'gif.gif', type => 'gif', gif_loop => 0, gif_delay => 1
}, map { Imager->read_multi(data => \$_) } @data) or die Imager->errstr;

printf "Total: %0.3fs\n", time - $start;

I captured the compute time for 99,999 iterations. The total time includes writing the GIF file.

          Compute      Total

  Serial  28.685s  1m12.844s
Parallel   9.492s    22.588s
Chunking   3.772s    16.902s  SMT Disabled
Chunking   2.764s    15.982s  SMT Enabled

Relay runs in chunk_id order behind the scenes: workers wait their turn to run the relay CODE block. Chunking reduces the IPC overhead whenever a single item takes little time to compute, which is why all 32 cores reach 100% CPU utilization.
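
To see the ordering in isolation, here is a minimal sketch without Imager, assuming MCE is installed. The worker count, chunk size, and the squaring "work" are placeholders I picked for illustration; only the relay and gather pattern is taken from the script above.

```perl
#!/usr/bin/perl

use strict;
use warnings;
use MCE::Loop;

my @ordered;

MCE::Loop->init(
  max_workers => 4,    # placeholder value
  chunk_size  => 5,    # placeholder value
  init_relay  => '',   # enables relay, as in the script above
  gather      => sub { push @ordered, @{ $_[0] } },
);

mce_loop {
  my ($mce, $chunk_ref, $chunk_id) = @_;

  # Stand-in for the per-item work; runs in parallel across workers.
  my @squares = map { $_ * $_ } @{ $chunk_ref };

  # Workers take turns here by chunk_id, so the manager process
  # receives one message per chunk, in input order.
  MCE::relay { MCE->gather(\@squares) };

} [ 1 .. 20 ];

MCE::Loop->finish;

print "@ordered\n";
```

With 20 items and a chunk size of 5, the manager receives 4 gather calls rather than 20, and @ordered follows the input order regardless of which worker finishes first.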

Running serially consumes about 4.5 cores from what I can tell (i.e. Imager itself uses more than one core). With chunking, the compute time is roughly 7 times faster than serial. That explains why it is not faster still: 4.5 * 7 = 31.5, which is about the number of cores (32) on the box I tested on.
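
The same arithmetic can be redone from the measured compute times in the table. This is only a rough sketch: the speedup() helper and the variable names are mine, and the 4.5-core figure is an eyeballed estimate, so the product is approximate.

```perl
use strict;
use warnings;

# Compute times in seconds, copied from the table above.
my %compute = (
  serial   => 28.685,
  parallel => 9.492,
  chunking => 3.772,   # SMT disabled
);

my $serial_cores = 4.5;   # estimated busy cores during the serial run

# Hypothetical helper: how many times faster a run is than the baseline.
sub speedup {
  my ($baseline, $measured) = @_;
  return $baseline / $measured;
}

for my $mode (qw(parallel chunking)) {
  my $s = speedup($compute{serial}, $compute{$mode});
  printf "%-8s %.1fx faster, about %.1f busy cores\n",
    $mode, $s, $s * $serial_cores;
}
```

For the chunking run, the estimate lands near the core count of the box, which is consistent with the workers being CPU bound rather than waiting on IPC.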

Today, I learned that Imager or the underlying C library code runs in parallel behind the scenes.

Regards, Mario
