Hi mikegold10,
I was browsing this site tonight and came across your post. It's been a while since I last posted here.
MCE's chunking engine is a good fit for your case. In addition, MCE::Relay lets workers run a section of code serially and in chunk order when needed, which keeps the output in the same order as the input. Notice how the parallel code looks very much like the serial code. The only extra bit is that each worker loops over the lines in its chunk.
use warnings;
use strict;
use 5.30.0;
use MCE::Loop;

MCE::Loop::init(
    max_workers => 4,
    init_relay  => 1,  # enables the MCE::Relay feature
);

mce_loop_f {
    my ( $mce, $chunk_ref, $chunk_id ) = @_;
    my $output = '';
    for my $line ( @{ $chunk_ref } ) {
        # Skip lines ending with an empty field
        next if substr($line, -2) eq "\t\n";
        # Remove the trailing "\n"
        chomp $line;
        # Split matching lines into fields on "\t", creating @fields
        my @fields = split /\t/, $line;
        # Copy only the desired fields from @fields to create a new
        # line in TSV format.
        # This can be done in one simple step in Perl, using
        # array slices and the join() function.
        my $new_line = join "\t", @fields[ 2, 3, 12..18, 25..28, 31 ];
        # Append to the buffer with a newline char
        $output .= $new_line . "\n";
    }
    # MCE::relay takes a code block and runs it serially, one worker
    # at a time, in chunk order. The ordering is driven by the
    # chunk_id value behind the scenes, so MCE::relay must be
    # called for every chunk.
    MCE::relay {
        print $output;
        STDOUT->flush;
    };
} \*STDIN;

# This signals the workers to exit.
# If omitted, it is called automatically when the script terminates.
MCE::Loop::finish;
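For comparison, here is a minimal sketch of what the serial version might look like, assuming the same field selection and the same STDIN input. The helper name filter_line is hypothetical and exists only to make the per-line logic easy to see; it is not part of MCE or of your original script.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration): returns the
# reduced TSV line, or undef when the line ends with an empty field.
sub filter_line {
    my ($line) = @_;
    # Skip lines ending with an empty field
    return if substr($line, -2) eq "\t\n";
    chomp $line;
    my @fields = split /\t/, $line;
    # Keep the same field slice as the parallel version
    return join("\t", @fields[ 2, 3, 12..18, 25..28, 31 ]) . "\n";
}

# Serial driver: one line at a time, no chunking and no relay.
while ( my $line = <STDIN> ) {
    my $new_line = filter_line($line);
    print $new_line if defined $new_line;
}
```

The loop body is the same as in the MCE version. The parallel code only adds the per-chunk buffer and the MCE::relay block that prints each buffer in order.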
Regards, Mario