http://qs321.pair.com?node_id=11113670


in reply to MCE segmentation fault

Hi Anonymous Monk,

Unfortunately, Imager objects cannot be serialized; attempting to do so crashes. This is true for both Storable and Sereal::Encoder/Decoder. The thing to do is to save the image into a scalar variable before sending it to the manager process.

Testing was done using Ubuntu 18.04.

$ sudo apt update
$ sudo apt install libimager-perl libmce-perl
$ sudo apt install libsereal-encoder-perl libsereal-decoder-perl
$ sudo apt install ttf-mscorefonts-installer

MCE::Map demonstration

Here is the working parallel demonstration via MCE::Map.
#!/usr/bin/perl

use strict;
use warnings;

use Imager;
use MCE::Map;

STDOUT->autoflush;

MCE::Map->init(
    max_workers => 6, chunk_size => 1, init_relay => ''
);

my @data = mce_map {
    my $x = $_;
    my $i = Imager->new(xsize=>120, ysize=>50) or die Imager->errstr;

    $i->string(
        text  => $x,
        color => Imager::Color->new('ffffff'),
        font  => Imager::Font->new(
            file => '/usr/share/fonts/truetype/msttcorefonts/cour.ttf',
          # file => '/System/Library/Fonts/Courier.dfont',
          # face => 'Courier New',  # mswin
            size => 42,
            aa   => 1),
        x => 5,
        y => 35
    );

    # One cannot serialize Imager objects or will crash.
    # Instead save the image to a scalar and send that.
    # The manager process later reads from scalar refs.
    $i->write(data => \my $data, type => 'gif');

    MCE::relay { print "\r$x" };

    $data;

} [ 0 .. 9999 ];

MCE::Map->finish;

Imager->write_multi({
    file => 'gif.gif', type => 'gif', gif_loop => 0, gif_delay => 1
}, map { Imager->read_multi(data => \$_) } @data) or die Imager->errstr;

print " frame GIF done!\n";

MCE::Loop demonstration

Here is another way, using MCE::Loop. MCE->gather is called inside the relay block.
#!/usr/bin/perl

use strict;
use warnings;

use Imager;
use MCE::Loop;

STDOUT->autoflush;

my @data;

MCE::Loop->init(
    max_workers => 6, chunk_size => 1, init_relay => '',
    gather => sub {
        push @data, $_[1];
        print "\r$_[0]";
    }
);

mce_loop {
    my $x = $_;
    my $i = Imager->new(xsize=>120, ysize=>50) or die Imager->errstr;

    $i->string(
        text  => $x,
        color => Imager::Color->new('ffffff'),
        font  => Imager::Font->new(
            file => '/usr/share/fonts/truetype/msttcorefonts/cour.ttf',
          # file => '/System/Library/Fonts/Courier.dfont',
          # face => 'Courier New',  # mswin
            size => 42,
            aa   => 1),
        x => 5,
        y => 35
    );

    # One cannot serialize Imager objects or will crash.
    # Instead save the image to a scalar and send that.
    # The manager process later reads from scalar refs.
    $i->write(data => \my $data, type => 'gif');

    MCE::relay { MCE->gather($x, $data) };

} [ 0 .. 9999 ];

MCE::Loop->finish;

Imager->write_multi({
    file => 'gif.gif', type => 'gif', gif_loop => 0, gif_delay => 1
}, map { Imager->read_multi(data => \$_) } @data) or die Imager->errstr;

print " frame GIF done!\n";

Chunking demonstration for HEDT systems

Reducing IPC overhead (i.e. relay) is possible simply by chunking.
#!/usr/bin/perl

use strict;
use warnings;

use Imager;
use MCE::Loop;

STDOUT->autoflush;

my $count = 0;
my @data;

MCE::Loop->init(
    max_workers => MCE::Util::get_ncpu(),
    chunk_size  => 100,
    init_relay  => '',
    gather => sub {
        if (@_ == 1) {
            print "\r", $count++;
        } else {
            push @data, @{ $_[1] };
        }
    }
);

mce_loop {
    my ($mce, $chunk_ref, $chunk_id) = @_;
    my @i_data;

    for my $x (@{ $chunk_ref }) {
        my $i = Imager->new(xsize=>120, ysize=>50) or die Imager->errstr;

        $i->string(
            text  => $x,
            color => Imager::Color->new('ffffff'),
            font  => Imager::Font->new(
                file => '/usr/share/fonts/truetype/msttcorefonts/cour.ttf',
              # file => '/System/Library/Fonts/Courier.dfont',
              # face => 'Courier New',  # mswin
                size => 42,
                aa   => 1),
            x => 5,
            y => 35
        );

        # One cannot serialize Imager objects or will crash.
        # Instead save the image to a scalar and send that.
        # The manager process later reads from scalar refs.
        $i->write(data => \my $data, type => 'gif');

        push @i_data, $data;

        MCE->gather($x);
    }

    MCE::relay { MCE->gather($chunk_id, \@i_data) };

} [ 0 .. 9999 ];

MCE::Loop->finish;

Imager->write_multi({
    file => 'gif.gif', type => 'gif', gif_loop => 0, gif_delay => 1
}, map { Imager->read_multi(data => \$_) } @data) or die Imager->errstr;

print " frame GIF done!\n";

This has been rather interesting. There is an overall speedup, but not what I expected. In any case, these are the various ways to run in parallel. One may call gather multiple times inside an MCE code block. MCE::relay, however, must be called exactly once: no more, no less.

Regards, Mario

Re^2: MCE segmentation fault
by marioroy (Prior) on Mar 03, 2020 at 06:59 UTC

    Ah... I see about 4.5 busy cores when running serially, meaning that Imager itself is using multiple cores behind the scenes. Well then, let's capture the compute time using Time::HiRes and raise the upper bound from 9,999 to 99,999 (i.e. from 10,000 to 100,000 iterations). I also captured the compute time on a 32-core AMD 3970X processor with SMT disabled, to better understand the benefit of chunking.

    Here is the updated chunking demonstration to capture the compute time.

    #!/usr/bin/perl

    use strict;
    use warnings;

    use Imager;
    use MCE::Loop;
    use Time::HiRes 'time';

    STDOUT->autoflush;

    my $start = time;
    my $count = 0;
    my @data;

    MCE::Loop->init(
        max_workers => MCE::Util::get_ncpu(),
        chunk_size  => 100,
        init_relay  => '',
        gather => sub {
            if (@_ == 1) {
                print "\r", $count++;
            } else {
                push @data, @{ $_[1] };
            }
        }
    );

    mce_loop {
        my ($mce, $chunk_ref, $chunk_id) = @_;
        my @i_data;

        for my $x (@{ $chunk_ref }) {
            my $i = Imager->new(xsize=>120, ysize=>50) or die Imager->errstr;

            $i->string(
                text  => $x,
                color => Imager::Color->new('ffffff'),
                font  => Imager::Font->new(
                    file => '/usr/share/fonts/truetype/msttcorefonts/cour.ttf',
                  # file => '/System/Library/Fonts/Courier.dfont',
                  # face => 'Courier New',  # mswin
                    size => 42,
                    aa   => 1),
                x => 5,
                y => 35
            );

            # One cannot serialize Imager objects or will crash.
            # Instead save the image to a scalar and send that.
            # The manager process later reads from scalar refs.
            $i->write(data => \my $data, type => 'gif');

            push @i_data, $data;

            MCE->gather($x);
        }

        MCE::relay { MCE->gather($chunk_id, \@i_data) };

    } [ 0 .. 99_999 ];

    MCE::Loop->finish;

    print " frame GIF done!\n";
    printf "compute time: %0.3fs\n", time - $start;

    Imager->write_multi({
        file => 'gif.gif', type => 'gif', gif_loop => 0, gif_delay => 1
    }, map { Imager->read_multi(data => \$_) } @data) or die Imager->errstr;

    printf "Total: %0.3fs\n", time - $start;

    I captured the compute time for the 100,000 iterations (0 .. 99_999). The total time includes writing the GIF file.

                 Compute    Total
    Serial       28.685s    1m12.844s
    Parallel      9.492s      22.588s
    Chunking      3.772s      16.902s   SMT Disabled
    Chunking      2.764s      15.982s   SMT Enabled

    Behind the scenes, relay runs in chunk_id order: each worker waits its turn before entering the relay CODE block. Chunking reduces this IPC overhead whenever a single item takes little time to compute, and with it all 32 cores reach 100% CPU utilization.
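To put a rough number on the waiting: relay is entered once per chunk, so the chunk size decides how many turns the workers must queue for. Here is a minimal sketch in plain Perl, with no MCE involved. `chunk_list` is a hypothetical helper written for this illustration, not part of MCE, and it only mimics how a 100,000-item input would be split.

```perl
use strict;
use warnings;

# Hypothetical helper (not an MCE function): split a list into
# consecutive chunks holding at most $size items each.
sub chunk_list {
    my ($items, $size) = @_;
    die "chunk size must be positive\n" if $size < 1;

    my @chunks;
    for (my $i = 0; $i < @$items; $i += $size) {
        my $last = $i + $size - 1;
        $last = $#$items if $last > $#$items;   # short final chunk
        push @chunks, [ @{$items}[ $i .. $last ] ];
    }
    return @chunks;
}

my @input = (0 .. 99_999);

# One relay turn per chunk: compare chunk_size 1 with chunk_size 100.
for my $size (1, 100) {
    my @chunks = chunk_list(\@input, $size);
    printf "chunk_size %3d: %6d relay turns\n", $size, scalar @chunks;
}
```

With chunk_size => 1 there are 100,000 ordered turns; with chunk_size => 100 there are 1,000. Note that the demonstration above still calls MCE->gather($x) once per item for the progress counter, so only the ordered relay step shrinks by that factor.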

    Running serially consumes about 4.5 cores from what I can tell (i.e. Imager itself uses more than one core). The chunking compute time is about 7 times faster than serial. That explains why it is not faster still: 4.5 * 7 = 31.5, which is roughly the 32 cores of the box I tested on.
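The same estimate can be redone with the unrounded compute times from the table. This is only a back-of-the-envelope sketch: the 4.5-core figure is the eyeballed value from the post, and "busy-core equivalents" is a label made up here for speedup times 4.5.

```perl
use strict;
use warnings;

# Compute times in seconds, copied from the timing table.
my %compute = (
    serial          => 28.685,
    chunking_no_smt => 3.772,
    chunking_smt    => 2.764,
);

# Assumption from the post: the serial run already keeps ~4.5 cores busy.
my $serial_cores = 4.5;

# Speedup of a parallel run relative to the serial baseline.
sub speedup {
    my ($serial, $parallel) = @_;
    return $serial / $parallel;
}

for my $run (qw(chunking_no_smt chunking_smt)) {
    my $s = speedup($compute{serial}, $compute{$run});
    printf "%-16s speedup %.1fx, about %.0f busy-core equivalents\n",
        $run, $s, $s * $serial_cores;
}
```

With SMT disabled the factor comes out nearer 7.6 than 7, which multiplied by 4.5 lands a little above 32. That suggests the 4.5 estimate is on the generous side, but the conclusion holds: the box is essentially saturated.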

    Today I learned that Imager, or the underlying C library code, runs in parallel behind the scenes.

    Regards, Mario

Re^2: MCE segmentation fault
by Anonymous Monk on Mar 03, 2020 at 07:25 UTC
    This has been rather interesting. There is an overall speedup, but not what I expected.

    Thank you Mario. I was trying to use the Imager object because I had failed to implement the scalar solution you showed in the heatmap node (for dumb reasons, like omitting type => 'gif' on the write method), getting errors like:

    write_multi: image 1 is not an Imager image object
    or
    Usage: i_writegif_wiol(IO,hashref, images...)
    
    I've only had time to try your first demo and of course it works wonders! The speed gain is more impressive if another 9 is added to the count to make a 100,000 frame GIF (with 8 workers on 3.1GHz i7):
    real	3m43.591s # Perl
    real	1m19.701s # MCE
    
    MCE also appears to use much less memory. The plain Perl version grows to almost 2GB, while the MCE workers consume a mere 6MB each and the final process that consolidates the GIF grows to only 1GB. This makes an 80MB GIF. I tried a million frames too, but the heat throttled my laptop CPU to 2GHz and it used 9GB of RAM before swapping, so it took 18 minutes to make an 820MB GIF. No fault of MCE...

    MCE makes the hard things easy—thanks for supercharging Perl!

      Sorry, just realized I compared apples to oranges. The Perl version that used 2GB was collecting Imager objects in the array, while MCE was using scalars. When changed to collect scalars plain Perl used the same amount of RAM as MCE, about 1GB, but then it was even slower than MCE:
      real	4m50.307s Perl
      real	5m24.204s Perl
      real	5m26.991s Perl
      
      real	1m16.036s MCE
      real	1m29.944s MCE
      real	1m28.977s MCE
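For a single figure from the six runs above, the timings can be averaged. A small sketch follows; `to_seconds` and `mean` are helper names made up for this illustration, and the durations are the `real` values as posted.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Convert a `time`-style duration such as "4m50.307s" into seconds.
sub to_seconds {
    my ($t) = @_;
    my ($min, $sec) = $t =~ /^(?:(\d+)m)?(\d+(?:\.\d+)?)s$/
        or die "unrecognized duration: $t\n";
    return ($min // 0) * 60 + $sec;
}

# Arithmetic mean of a non-empty list of numbers.
sub mean { return sum(@_) / @_ }

my @perl = map { to_seconds($_) } qw(4m50.307s 5m24.204s 5m26.991s);
my @mce  = map { to_seconds($_) } qw(1m16.036s 1m29.944s 1m28.977s);

printf "plain Perl mean: %.1fs\n", mean(@perl);
printf "MCE mean:        %.1fs\n", mean(@mce);
printf "ratio:           %.1fx\n", mean(@perl) / mean(@mce);
```

On these numbers MCE is roughly 3.7 times faster on average. Three runs per side is a small sample, so treat it as indicative only.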
      
      For some reason my CPU frequency is different when comparing the plain Perl version to MCE. MCE sends it straight to 3GHz while the plain Perl hangs out around 2-2.5GHz. I don't know why that happens but it must contribute to the difference...

        Wow :) MCE continues to boggle my mind after all these years.