This will run faster and, IMHO, improves on split_join() a little:
sub index_split_join {
    # do a fast check to see if line needs to be looked at
    # (the parens matter: without them, 'STOCK' >= 0 is evaluated first)
    return $_[0] unless index($_[0], 'STOCK') >= 0;
    my @tokens = split /\|/, $_[0];   # split into columns
    $tokens[15] =~ s/STOCK/BOXXE/;    # do replacement in col 16
    return join('|', @tokens);        # glue back together for final result
}
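To make the short circuit concrete, here is a minimal sketch that runs the sub on two made-up lines. The sample data (16 pipe-separated columns named c1..c16) is my assumption for illustration only, not the original poster's data.

```perl
use strict;
use warnings;

# Same logic as the sub above, with explicit parentheses around index().
sub index_split_join {
    # Fast path: lines without 'STOCK' are returned untouched.
    return $_[0] unless index($_[0], 'STOCK') >= 0;
    my @tokens = split /\|/, $_[0];   # split into columns
    $tokens[15] =~ s/STOCK/BOXXE/;    # column 16 is index 15
    return join '|', @tokens;
}

# Hypothetical sample lines: 16 pipe-separated columns each.
my $plain = join '|', map { "c$_" } 1 .. 16;
my $stock = join '|', (map { "c$_" } 1 .. 15), 'STOCK';

print index_split_join($plain), "\n";   # returned unchanged
print index_split_join($stock), "\n";   # last column becomes BOXXE
```

The first call never reaches split at all, which is the whole point when most lines don't contain STOCK.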
For your test of 1 data line, I get:
                 Rate splitjoin idxsplitjoin simple_regex
splitjoin     50000/s        --         -15%         -60%
idxsplitjoin  58824/s       18%           --         -53%
simple_regex 125000/s      150%         112%           --
But that test isn't really valid. Presumably there are many lines to process, and only a small percentage of them contain the word 'STOCK', which is exactly where the index short circuit will excel. Here is a modified benchmark (the DATA is ~1000 lines, all with the same number of columns, but only a handful have STOCK in them):
my @lines = <DATA>;
cmpthese(10000, {
    idxsplitjoin => sub { index_split_join($_) for @lines },
    splitjoin    => sub { split_join($_)       for @lines },
    simple_regex => sub { simple_regex($_)     for @lines },
});
# RESULTS:
Benchmark: timing 10000 iterations of idxsplitjoin, simple_regex, splitjoin...
idxsplitjoin: 9 wallclock secs ( 9.16 usr + 0.00 sys = 9.16 CPU) @ 1091.70/s (n=10000)
simple_regex: 11 wallclock secs (10.77 usr + 0.00 sys = 10.77 CPU) @ 928.51/s (n=10000)
splitjoin: 158 wallclock secs (158.15 usr + 0.00 sys = 158.15 CPU) @ 63.23/s (n=10000)
               Rate splitjoin simple_regex idxsplitjoin
splitjoin    63.2/s        --         -93%         -94%
simple_regex  929/s     1368%           --         -15%
idxsplitjoin 1092/s     1627%          18%           --
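If you want to try this without the original data, here is a self-contained sketch of the same kind of benchmark. Everything about the data is an assumption on my part: 1000 synthetic lines of 20 columns, with STOCK in column 16 on every 200th line. The split_join shown here is also assumed (the same sub minus the index check), and simple_regex is left out because its definition isn't shown in this node. Timings will differ by machine, so don't expect the numbers above.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed baseline: always split and join, with no index() pre-check.
sub split_join {
    my @tokens = split /\|/, $_[0];
    $tokens[15] =~ s/STOCK/BOXXE/;
    return join '|', @tokens;
}

# Same work, but lines without 'STOCK' are returned immediately.
sub index_split_join {
    return $_[0] unless index($_[0], 'STOCK') >= 0;
    my @tokens = split /\|/, $_[0];
    $tokens[15] =~ s/STOCK/BOXXE/;
    return join '|', @tokens;
}

# Synthetic data: 1000 lines of 20 columns; every 200th line
# carries STOCK in column 16 (index 15).
my @lines = map {
    my @cols = map { "col$_" } 1 .. 20;
    $cols[15] = 'STOCK' unless $_ % 200;
    join('|', @cols) . "\n";
} 1 .. 1000;

# A negative count means "run each sub for at least that many CPU seconds".
cmpthese(-1, {
    idxsplitjoin => sub { index_split_join($_) for @lines },
    splitjoin    => sub { split_join($_)       for @lines },
});
```

Raising the share of lines that contain STOCK should shrink the gap, since the index check then saves less work.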