Update: Added regex example.
Hello all :) I was curious too, so I tried trimming lines off the left side of the string, and also incorporated the memfh example by davido.
For this demonstration, split runs about 2x faster with Perl 5.22.x and later releases than with older ones. That is what I see on macOS.
use strict;
use warnings;
use Time::HiRes 'time';
my $huge_string = "aaa bbb\nccc ddd\neee fff\nggg hhh\niii jjj\nkkk lll\nmmm nnn\n";
# concatenate string exponentially to 917,504 lines
$huge_string .= $huge_string for 1..17;
# memfh
{
my $string = $huge_string;
my $start = time;
open my $memfh, '<', \$string;
my @lines = <$memfh>;
close $memfh;
printf "duration memfh: %0.3f seconds\n", time - $start;
printf "%d lines\n\n", scalar(@lines);
}
# regex
{
my $string = $huge_string;
my $start = time;
my @lines;
while ( $string =~ /([^\n]+\n)/mg ) {
my $line = $1; # save $1 to not lose the value
push @lines, $line;
}
printf "duration regex: %0.3f seconds\n", time - $start;
printf "%d lines\n\n", scalar(@lines);
}
# split
{
my $string = $huge_string;
my $start = time;
my @lines = split(/\n/, $string);
printf "duration split: %0.3f seconds\n", time - $start;
printf "%d lines\n\n", scalar(@lines);
}
# trim
{
my $string = $huge_string;
my $start = time;
my @lines;
while ( my $line = substr($string, 0, index($string, "\n") + 1, '') ) {
push @lines, $line;
}
printf "duration trim : %0.3f seconds\n", time - $start;
printf "%d lines\n\n", scalar(@lines);
}
Output - Perl 5.28.2
duration memfh: 0.384 seconds
917504 lines
duration regex: 0.387 seconds
917504 lines
duration split: 0.067 seconds
917504 lines
duration trim : 0.201 seconds
917504 lines
Another machine - Perl 5.26.1
duration memfh: 0.477 seconds
917504 lines
duration regex: 0.445 seconds
917504 lines
duration split: 0.065 seconds
917504 lines
duration trim : 0.259 seconds
917504 lines
Same machine - Perl 5.18.2
duration memfh: 0.530 seconds
917504 lines
duration regex: 0.490 seconds
917504 lines
duration split: 0.130 seconds
917504 lines
duration trim : 0.261 seconds
917504 lines
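One caveat when reading these numbers: the four variants return the same line count but not quite the same data. split(/\n/, ...) consumes the newline, while memfh, regex, and trim keep it on each line. If the newline matters, split /^/ is one way to keep it. Below is a minimal sketch on a small stand-in string (the test data here is made up for illustration, and I have not timed this variant against the others).

```perl
use strict;
use warnings;

# Small stand-in for $huge_string (hypothetical test data).
my $string = "aaa bbb\nccc ddd\neee fff\n";

# Plain split on "\n": the separator is consumed, so lines lose their newline.
my @stripped = split /\n/, $string;

# split /^/ breaks at the start of each line and keeps the newline,
# matching what the memfh, regex, and trim variants return.
my @kept = split /^/, $string;

# Reference result: read the string through an in-memory filehandle.
open my $memfh, '<', \$string or die "open failed: $!";
my @memfh_lines = <$memfh>;
close $memfh;

print "split /\\n/ keeps newline: ", ( $stripped[0] =~ /\n\z/ ? "yes" : "no" ), "\n";
print "split /^/  keeps newline: ", ( $kept[0]     =~ /\n\z/ ? "yes" : "no" ), "\n";
print "split /^/ matches memfh:  ", ( "@kept" eq "@memfh_lines" ? "yes" : "no" ), "\n";
```

So the split timing above is for a slightly lighter job than the other three; a chomp on the other results, or split /^/ here, would make the comparison like for like.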
Regards, Mario