Win8 Strawberry 5.8.9.5 (32) Thu 12/17/2020 21:09:41
C:\@Work\Perl\monks
>perl -Mstrict -Mwarnings -MData::Dump=dd
my $s = " info
John
100 - 2000
Kent";
my $word = '';
while ($s =~ m{ $word [^[:alnum:]]+ ([[:alnum:]]+) }xms) {
    print "next word after '$word' is '$1' \n";
    $word = $1;
    }
print "another way \n";
my @words = $s =~ m{ [[:alnum:]]+ }xmsg;
dd \@words;
^Z
next word after '' is 'info'
next word after 'info' is 'John'
next word after 'John' is '100'
next word after '100' is '2000'
next word after '2000' is 'Kent'
another way
["info", "John", 100, 2000, "Kent"]
Update 1: The first method above will fail to capture
'info' if there are no "non-word" characters (whitespace in this case)
before the first word in the string, because [^[:alnum:]]+ requires at
least one such character. To capture the first word in this case, use
$s =~ m{ $word [^[:alnum:]]* ([[:alnum:]]+) }xms
(note the * quantifier on [^[:alnum:]]* instead of +).
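However, using $word as an anchor still fails if there is a
repeated "word" in the string: try replacing "100" with another
instance of "info" and see what happens. (Hint: a match without /g
always starts searching from the beginning of the string, so it keeps
finding the first "info".)
IMHO, "another way" is the better way to extract "words" from a string.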
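If you'd rather not hang your shell finding out, here is a minimal sketch of the repeated-word case with the loop capped at a few iterations. The cap ($max_iter) and the @seen array are my additions for the demo and are not part of the code above; the string is the original with "100" replaced by a second "info".

```perl
use strict;
use warnings;

# Original test string, but with "100" replaced by a second "info".
my $s = " info\nJohn\ninfo - 2000\nKent";

my $word     = '';
my $max_iter = 8;   # hypothetical safety cap, added for this demo only
my @seen;           # words captured, in the order the loop found them

while ($s =~ m{ $word [^[:alnum:]]+ ([[:alnum:]]+) }xms) {
    push @seen, $1;
    $word = $1;
    last if @seen >= $max_iter;   # without this the loop never terminates
}

print "captured: @seen\n";
```

The captures just alternate between 'info' and 'John': every time $word is 'info', the leftmost 'info' in the string matches again, so '2000' and 'Kent' are never reached.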
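Update 2: Here's a way to loop through the string
word-by-word regardless of leading/trailing whitespace or repeated
words. It relies on the /g modifier, which in scalar context resumes
each match where the previous one ended instead of restarting from the
beginning of the string (but I still prefer stripping/extracting all
words to an explicit or implicit array - the second method above):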
Win8 Strawberry 5.8.9.5 (32) Thu 12/17/2020 21:59:18
C:\@Work\Perl\monks
>perl -Mstrict -Mwarnings
my $s = " info
John
info - 2000
Kent ";
my $word;
while ($s =~ m{ [^[:alnum:]]* ([[:alnum:]]+) }xmsg) {
    if (defined $word) {
        print "next word after '$word' is '$1' \n";
        }
    else {
        print "first word is '$1' \n";
        }
    $word = $1;
    }
^Z
first word is 'info'
next word after 'info' is 'John'
next word after 'John' is 'info'
next word after 'info' is '2000'
next word after '2000' is 'Kent'
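For comparison, the "extract everything first" approach can produce the same report by walking the array of words. This is only a rough sketch using the same test string as above; @report and the index loop are illustrative choices of mine, not something from the original code.

```perl
use strict;
use warnings;

my $s = " info\nJohn\ninfo - 2000\nKent ";

# A list-context match with /g returns every run of alphanumerics, in order.
my @words = $s =~ m{ [[:alnum:]]+ }xmsg;

my @report;   # lines collected in an array so they are easy to inspect
for my $i (0 .. $#words) {
    push @report, $i == 0
        ? "first word is '$words[$i]'"
        : "next word after '$words[$i-1]' is '$words[$i]'";
}

print "$_\n" for @report;
```

Because the words are already in an array, repeated words and leading or trailing whitespace need no special handling: the previous word is simply the previous element.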
Give a man a fish: <%-{-{-{-<