PerlMonks  

Re: How do I remove whitespace at the beginning or end of my string?

by trizen (Hermit)
on Jan 29, 2012 at 14:33 UTC ( [id://950578]=note )


in reply to How do I remove whitespace at the beginning or end of my string?

Some solutions that are usually fast can become really slow, depending on how much whitespace a string contains.

For example, take a string that contains a lot of whitespace:
my $str = q{    }. q{a b c d e f g h i j} x 200 . q{    };

The MRE book (Mastering Regular Expressions) suggests this code:
$str =~ s/^\s+((?:.+\S)?)\s+$/$1/s;

I admit, I was surprised at how fast it is compared with "s/^\s+//" and its sibling "s/\s+$//". With the example above, those two cannot even compete in a benchmark; they are far too slow. The cause is the second regex, the one that matches at the end of the string: it attempts a match at each run of whitespace inside the string and fails every time until it reaches the final run, so it fails many times when the string contains a lot of whitespace (see use re 'debug').

Another approach (I know it is silly, but it is faster in some cases):
$str =~ s/^\s+//; $str = reverse($str); $str =~ s/^\s+//; $str = reverse($str);
Benchmark using the above example:
               Rate 's_reverse' 'unpack_A' 'MRE_regx'
's_reverse' 42017/s          --       -12%       -48%
'unpack_A'  47847/s         14%         --       -41%
'MRE_regx'  80645/s         92%        69%         --
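
For anyone who wants to repeat the comparison, here is a minimal sketch of a benchmark script using the core Benchmark module. It is not the script that produced the numbers above: the sub names two_step, s_reverse and mre_regx are my own labels, the unpack_A variant is left out because its code is not shown in this node, and the rates will differ by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Test string from the post: whitespace at both ends and many inner spaces.
my $str = q{    } . q{a b c d e f g h i j} x 200 . q{    };

# Plain two-step trim (the "s/^\s+//" and "s/\s+$//" pair).
sub two_step { my $s = shift; $s =~ s/^\s+//; $s =~ s/\s+$//; return $s }

# Reverse trick from the post: only ever strip from the front.
sub s_reverse {
    my $s = shift;
    $s =~ s/^\s+//; $s = reverse($s);
    $s =~ s/^\s+//; $s = reverse($s);
    return $s;
}

# MRE regex exactly as posted (it needs whitespace on both ends to match).
sub mre_regx { my $s = shift; $s =~ s/^\s+((?:.+\S)?)\s+$/$1/s; return $s }

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    two_step  => sub { two_step($str) },
    s_reverse => sub { s_reverse($str) },
    MRE_regx  => sub { mre_regx($str) },
});
```

Each candidate works on its own copy of the string, so the copy cost is included in every rate equally.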

Replies are listed 'Best First'.
Re: Answer: How do I remove whitespace at the beginning or end of my string?
by repellent (Priest) on Jan 30, 2012 at 00:08 UTC
    MRE_regx does not trim whitespace as expected:
    $ perl -de 1
    Loading DB routines from perl5db.pl version 1.3
    Editor support available.

    Enter h or `h h' for help, or `man perldebug' for more help.

    main::(-e:1):   1
      DB<1> $str = ' x '; $str =~ s/^\s+((?:.+\S)?)\s+$/$1/s;
      DB<2> x $str
    0  ' x'
        use Test::More;

        sub trim { my $s = $_[0]; $s =~ s/^\s+(\S?.*\S)\s+$/$1/s; $s }

        is( trim(' '), '' );
        is( trim('a '), 'a' );
        is( trim(' a'), 'a' );
        is( trim(' a '), 'a' );
        is( trim('ab '), 'ab' );
        is( trim(' ab'), 'ab' );
        is( trim(' ab '), 'ab' );
        is( trim('a bb c '), 'a bb c' );
        is( trim(' a bb c'), 'a bb c' );
        is( trim(' a bb c '), 'a bb c' );

        done_testing();

        __END__
        not ok 1
        #   Failed test at ./t.pl line 12.
        #          got: ' '
        #     expected: ''
        not ok 2
        #   Failed test at ./t.pl line 13.
        #          got: 'a '
        #     expected: 'a'
        not ok 3
        #   Failed test at ./t.pl line 14.
        #          got: ' a'
        #     expected: 'a'
        ok 4
        not ok 5
        #   Failed test at ./t.pl line 16.
        #          got: 'ab '
        #     expected: 'ab'
        not ok 6
        #   Failed test at ./t.pl line 17.
        #          got: ' ab'
        #     expected: 'ab'
        ok 7
        not ok 8
        #   Failed test at ./t.pl line 19.
        #          got: 'a bb c '
        #     expected: 'a bb c'
        not ok 9
        #   Failed test at ./t.pl line 20.
        #          got: ' a bb c'
        #     expected: 'a bb c'
        ok 10
        1..10
        # Looks like you failed 7 tests of 10.

        The best-benchmarking one I could find that also passes the tests is s/^\s*((?:.*\S)?)\s*$/$1/s; which is essentially MRE_regx with each + replaced by * (perhaps trizen made a typo?)
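
To make the difference between the two quantifiers concrete, here is a small sketch that runs both regexes over a few inputs and prints the results in brackets so any leftover whitespace is visible. The sub names trim_plus and trim_star and the input list are mine, chosen for illustration; the two regexes are copied from the posts above.

```perl
use strict;
use warnings;

# MRE regex as originally posted, with + quantifiers:
# it only matches when there is whitespace at BOTH ends.
sub trim_plus { my $s = shift; $s =~ s/^\s+((?:.+\S)?)\s+$/$1/s; return $s }

# Variant from the reply, with * quantifiers:
# leading and trailing whitespace are both optional.
sub trim_star { my $s = shift; $s =~ s/^\s*((?:.*\S)?)\s*$/$1/s; return $s }

# Print each input next to what the two versions return.
for my $in ('', ' ', 'a ', ' a', '  x  ', ' a bb c ') {
    printf "%-12s plus=%-12s star=%s\n",
        "[$in]", '[' . trim_plus($in) . ']', '[' . trim_star($in) . ']';
}
```

The + version leaves a string untouched whenever one end has no whitespace, and because .+\S needs at least two characters, a lone non-space character can pull a leading space into the capture. The * version avoids both problems.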
