Re: What is the most efficient way to split a long string (see body for details/constraints)?

I was browsing this site tonight and came across your post. It's been a while since I last posted here.

MCE's chunking engine should be a good fit for your case. In addition, MCE::Relay lets workers run a block serially and in order when needed. Notice how the parallel code looks very much like the serial code; the only extra bit is that each worker loops over the lines in its chunk.

use warnings;
use strict;

use 5.30.0;
use MCE::Loop;

MCE::Loop::init(
    max_workers => 4,
    init_relay  => 1,  # enables MCE::Relay feature
);

mce_loop_f {
    my ( $mce, $chunk_ref, $chunk_id ) = @_;
    my $output = '';

    for my $line ( @{ $chunk_ref } ) {
        # Skip lines ending with an empty field
        next if substr($line, -2) eq "\t\n";

        # Remove the trailing "\n"
        chomp $line;

        # Split matching lines into fields on "\t", creating @fields
        my @fields = split /\t/, $line;

        # Copy only the desired fields from @fields to create a new
        # line in TSV format
        # This can be done in one simple step in Perl, using
        # array slices and the join() function
        my $new_line = join "\t", @fields[ 2, 3, 12..18, 25..28, 31 ];

        # Append to buffer with newline char
        $output .= $new_line . "\n";
    }

    # MCE::relay takes a code block and runs it serially and
    # in order, one worker at a time. The ordering is driven
    # by the chunk_id value behind the scenes, so MCE::relay
    # must be called once for each chunk.
    MCE::relay {
        print $output;
        STDOUT->flush;
    };

} \*STDIN;

# This signals the workers to exit.
# If omitted, called automatically when the script terminates.
MCE::Loop::finish;
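
For comparison, here is a minimal sketch of what the serial version of the same filtering could look like. It is only an illustration: the filter_line helper, the @keep array and the sample data are made up for this example, and it reads from an in-memory filehandle instead of STDIN so that it runs on its own. The body of the loop is the same work each MCE worker does per line of its chunk.

```perl
use strict;
use warnings;

# Indices of the columns to keep, same slice as in the parallel code.
my @keep = ( 2, 3, 12..18, 25..28, 31 );

# Hypothetical helper: returns the reduced TSV line, or undef when
# the line should be skipped.
sub filter_line {
    my ($line) = @_;

    # Skip lines ending with an empty field
    return if substr($line, -2) eq "\t\n";

    chomp $line;
    my @fields = split /\t/, $line;
    return join("\t", @fields[@keep]) . "\n";
}

# Made-up sample input: one complete 32-field line and one line
# whose last field is empty.
my $sample = join("\t", map { "f$_" } 0..31) . "\n"
           . join("\t", (map { "g$_" } 0..30), '') . "\n";

open my $in, '<', \$sample or die "Cannot open in-memory handle: $!";

# Serial loop: one line at a time, appending to a buffer.
my $output = '';
while ( my $line = <$in> ) {
    my $new_line = filter_line($line);
    $output .= $new_line if defined $new_line;
}
close $in;

print $output;
```

With real data you would read from STDIN instead of $in. The MCE version keeps this per-line logic and only adds the chunk loop and the relay block around the print.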

Regards, Mario
