I have a code snippet that works as desired, but it does not look elegant to me. Can someone please suggest a simpler, shorter way of doing it? I am not concerned about performance, as this is a one-time task on a relatively small data set.
Requirement:
Original String => converted String
j k l foobar => jkl foobar
j k lm foobar => jk lm foobar
jk l foobar => jk l foobar
foobar j k l => foobar j k l
Basically, a string that begins with single letters, each followed by a space (this pattern may repeat), optionally followed by more text, should be converted so that the group of single letters at the beginning is joined into one word. If the group of single letters occurs at the end of the string, nothing should change.
Here is my code snippet:
use strict;
use warnings;
my @str = ("j k l foobar", "foobar", "jkl foobar", "1 2 3",
    "jk l foobar", "foobar j k l", "foobar j kl", " ", " ",
    "j jk foobar", "j k jk foobar", "j k l");
my @sanitisedNames = ();
for(@str) {
$_ =~ s/\s+/ /g;
if ($_ =~ /^\s$/) {
next;
}
my $boundary = sanitise($_);
my $sanitisedName;
if ($boundary == 0) {
$sanitisedName = $_;
} elsif ($boundary == length($_)) {
$_ =~ s/\s+//g;
$sanitisedName = $_;
} else {
my $firstPart = substr($_, 0, $boundary);
$firstPart =~ s/\s+//g;
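# $secondPart already starts with the space found at $boundary,
# so adding another separator would produce a double space
my $secondPart = substr($_, $boundary);
$sanitisedName = $firstPart.$secondPart;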
}
push(@sanitisedNames, $sanitisedName);
}
print $_, "\n" for (@sanitisedNames);
sub sanitise {
my $str = shift;
my @chars = split('', $str);
my $count = 0;
my $len = length($str);
while ($count < $len) {
if ($chars[$count++] ne ' ' && $count < $len && $chars[$count++] eq ' ') {
} else {
if ($count == $len) {
return $len;
}
if ($count > 3) {
$count = $count - 3;
return $count;
} else {
return 0;
}
}
}
}
Also, can this be done in a single line with a regex? I could not come up with one, so I am coming to the abode of the monks for wisdom :).
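For reference, one possible shape for such a single substitution is sketched below. It is only a sketch under assumptions: the helper name squash_leading_letters is hypothetical, whitespace is assumed to be already collapsed to single spaces, the /r flag on tr needs Perl 5.14 or later, and it follows the stated requirement of letters only, so "1 2 3" is left alone even though the loop-based code above would join it.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original snippet): joins the leading
# run of single letters, assuming single spaces between words.
sub squash_leading_letters {
    my ($name) = @_;
    # $1 captures single letters separated by single spaces, anchored at the
    # start; the lookahead requires the last captured letter to be a whole
    # word (followed by a space or the end of the string).
    $name =~ s{^((?:[[:alpha:]] )*[[:alpha:]])(?= |$)}{ $1 =~ tr/ //dr }e;
    return $name;
}

for my $name ("j k l foobar", "j k lm foobar", "jk l foobar", "foobar j k l") {
    print squash_leading_letters($name), "\n";
}
# prints:
# jkl foobar
# jk lm foobar
# jk l foobar
# foobar j k l
```

The /e flag evaluates the replacement as code, so the captured prefix can have its spaces deleted with tr before being put back; everything after the prefix is never touched, which is why a trailing "j k l" stays as it is.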