s/(?<=[[:alpha:]])\s+(?=[[:alpha:]])/ /g;
This will have the same effect as the expression you posted. It does replace a single space character between two letters with a single space character (essentially a no-op), but because the pattern is simpler, it actually takes less time to execute. Below is the simple benchmark I used to test this theory. If I have missed something, please enlighten me.
#!/usr/local/bin/perl -w
use strict;
use Benchmark;
my(@tests) = <DATA>;
timethese(400, {
    his  => sub { s/(?<=[[:alpha:]])(?:\s\s+|[^\S ]+)(?=[[:alpha:]])/ /g foreach (@tests); },
    mine => sub { s/(?<=[[:alpha:]])\s+(?=[[:alpha:]])/ /g foreach (@tests); },
});
__DATA__
A sting with weird
A string without weird
Another variety with more wierd
Anotherthingwithnospaces
something odd
soemthing normal
[me@mylinux]$ ./space.pl
Benchmark: timing 400 iterations of his, mine...
       his:  0 wallclock secs ( 0.07 usr +  0.00 sys =  0.07 CPU) @ 5714.29/s (n=400)
            (warning: too few iterations for a reliable count)
      mine:  0 wallclock secs ( 0.03 usr +  0.00 sys =  0.03 CPU) @ 13333.33/s (n=400)
            (warning: too few iterations for a reliable count)
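Two caveats about the numbers above may be worth checking. Both subs run s///g in place on the same @tests, so after the first pass every later iteration sees input that has already been normalized, and 400 iterations is few enough that Benchmark itself warns about reliability. The following is a minimal sketch of a variant that addresses both, not a definitive measurement: each sub works on a fresh copy of the data, and a negative count asks Benchmark to run each sub for at least that many CPU seconds. The sample strings are my own illustrative assumptions (with tabs and space runs written out explicitly), not the original __DATA__ lines, and the sub names his and mine are just labels carried over from the benchmark above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Illustrative inputs (assumed, not the original __DATA__ lines).
# Tabs and runs of spaces are spelled out so they survive copy and paste.
my @tests = (
    "A string  with\tweird   spacing\n",
    "A string without weird spacing\n",
    "Another\t\tvariety with   more\n",
    "Anotherthingwithnospaces\n",
);

# Each sub substitutes on a fresh copy, so every iteration sees the
# original, unnormalized strings. The copy cost is paid by both subs.
sub his {
    my @copy = @tests;
    s/(?<=[[:alpha:]])(?:\s\s+|[^\S ]+)(?=[[:alpha:]])/ /g foreach @copy;
    return @copy;
}

sub mine {
    my @copy = @tests;
    s/(?<=[[:alpha:]])\s+(?=[[:alpha:]])/ /g foreach @copy;
    return @copy;
}

# Sanity check: both patterns should produce identical output here.
my @h = his();
my @m = mine();
die "patterns disagree\n" unless join("\0", @h) eq join("\0", @m);

# A negative count means "run each sub for at least 1 CPU second".
cmpthese(-1, { his => \&his, mine => \&mine });
```

Because the array copy is included in both timings, the relative difference between the two patterns will look smaller than in a pure regex comparison; the ranking is the part to look at, and it may differ between Perl versions and inputs.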
They say that time changes things, but you actually have to change them yourself. Andy Warhol