http://qs321.pair.com?node_id=694746


in reply to Trimming whitespaces methods

My preference is to do the space trimming in two stages as it seems to be faster than either the capture or alternation methods.

use strict;
use warnings;

use Benchmark q{cmpthese};

my @arr = ( q{ fdsgehw fw wwfe w } ) x 5000;

cmpthese(
    -5,
    {
        alternation => sub {
            my @new = @arr;
            s{ ^\s* | \s*$ }{}gx for @new;
        },
        capture => sub {
            my @new = @arr;
            s{ ^\s* (\S.*?) \s*$ }{$1}x for @new;
        },
        twoStage => sub {
            my @new = @arr;
            s{ ^\s* }{}x for @new;
            s{ \s*$ }{}x for @new;
        },
    },
);

The results.

              Rate     capture alternation    twoStage
capture     8.96/s          --        -26%        -50%
alternation 12.2/s         36%          --        -33%
twoStage    18.1/s        102%         48%          --
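For everyday use, the two-stage approach can be wrapped in a small subroutine. This is only a minimal sketch: the name trim_two_stage is hypothetical (it does not appear in the benchmark), and it returns a trimmed copy rather than modifying its argument in place.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the benchmark above): returns a trimmed
# copy of its argument, using the same two substitutions as twoStage.
sub trim_two_stage {
    my ($str) = @_;
    $str =~ s{ ^\s* }{}x;    # stage 1: strip leading whitespace
    $str =~ s{ \s*$ }{}x;    # stage 2: strip trailing whitespace
    return $str;
}

print '<', trim_two_stage( q{ fdsgehw fw wwfe w } ), ">\n";    # prints <fdsgehw fw wwfe w>
```

Returning a copy keeps the caller's data untouched; when working on a large array, the in-place form used in the benchmark (s{...}{}x for @new) avoids the extra copying.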

I hope this is of interest.

Cheers,

JohnGG

Update: Fixed code indentation problems caused by TABs

Re^2: Trimming whitespaces methods
by lodin (Hermit) on Jun 30, 2008 at 17:02 UTC

    For the capture version to be truly equivalent to the others, the /s modifier should be used on the substitution; otherwise an embedded newline stops the pattern from matching and the string is left untrimmed.

    $_ = " foo\nbar ";
    s{ ^\s* (\S.*?) \s*$ }{$1}x;
    print "<$_>";
    __END__
    < foo
    bar >
    I assume you added the \S in the pattern as an improvement, but it should perhaps be noted that it has the effect of leaving a string consisting only of whitespace untouched, whereas the other methods reduce it to an empty string.
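Both points can be checked with a short script. This is a sketch under assumptions: the subroutine names are hypothetical, and trim_capture_fixed is just one possible repair (adding /s and dropping the \S), not a pattern taken from the original benchmark.

```perl
use strict;
use warnings;

# Capture pattern exactly as benchmarked in the parent post.
sub trim_capture_orig {
    my ($s) = @_;
    $s =~ s{ ^\s* (\S.*?) \s*$ }{$1}x;
    return $s;
}

# Assumed fix: /s lets . match a newline, and dropping \S lets a
# whitespace-only string match (and so be emptied).
sub trim_capture_fixed {
    my ($s) = @_;
    $s =~ s{ ^\s* (.*?) \s*$ }{$1}xs;
    return $s;
}

# Two-stage method from the parent post, used here as the reference.
sub trim_two_stage {
    my ($s) = @_;
    $s =~ s{ ^\s* }{}x;
    $s =~ s{ \s*$ }{}x;
    return $s;
}

# One input with an embedded newline, one consisting only of whitespace.
for my $input ( " foo\nbar ", "   " ) {
    my $want = trim_two_stage($input);
    printf "original capture %s two-stage\n",
        trim_capture_orig($input) eq $want ? 'matches' : 'differs from';
    printf "fixed capture    %s two-stage\n",
        trim_capture_fixed($input) eq $want ? 'matches' : 'differs from';
}
```

For both inputs the original capture pattern differs from the two-stage result while the adjusted one matches it.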

    lodin

      I assume you added the \S in the pattern as an improvement

      No, I think I must have put it in because I wasn't thinking straight :-(

      Well spotted!

      Cheers,

      JohnGG