Win has asked for the wisdom of the Perl Monks concerning the following question:
Dear Monks,
How do I match the following line so that each figure is captured?
00CH,Gateshead MCD,619,172.90,158.94,186.87,,537,87.42,79.61,95.23,,1156,126.89,119.29,134.50
I have tried the following:
if ($_ =~ /^(00CH,Gateshead MCD)/){
    my $Place = $1;
    if ($line =~ /^$Place\,(\d(\d|\.)*)\,(\d(\d|\.)*)\,(\d(\d|\.)*)\,(\d(\d|\.)*)\,\,*(\d(\d|\.)*)\,(\d(\d|\.)*)\,(\d(\d|\.)*)\,(\d(\d|\.)*)\,\,*(\d(\d|\.)*)\,(\d(\d|\.)*)\,(\d(\d|\.)*)\,(\d(\d|\.)*)/) {
        # $first = $1; # etc
    }
}
Re: Regex problem
by gaal (Parson) on May 10, 2007 at 08:50 UTC
You may find it easier to split on the comma:
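chomp; # otherwise the trailing newline makes the last field look non-numeric
my @data = split /,/;
# How do you validate your data? Do you need to?
# Parenthesise warn's arguments so that it runs before next does.
warn("bad line\n"), next if @data != 16; # the sample line has 16 fields, or something...
# Note: this check also flags empty fields.
my @not_numeric = grep { length $_ == 0 || /[^-\d\.]/ } @data[2..$#data];
warn("fields contain non-numeric data: @not_numeric\n"), next if @not_numeric;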
(There are better ways to check whether something is numeric; this was just quick'n'dirty since I don't know your real requirements.)
Or, if you want to go the CPAN route, Text::CSV::Simple and several other modules exist.
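As one possible take on the "better ways to check if something is numeric" remark, here is a minimal sketch using looks_like_number from the core Scalar::Util module. It assumes the sample record sits on a single line and that empty fields are legitimate gaps rather than errors; both are assumptions, not something the original poster confirmed.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Sample record from the question, joined onto one line.
my $line = '00CH,Gateshead MCD,619,172.90,158.94,186.87,,537,87.42,79.61,95.23,,1156,126.89,119.29,134.50';

# A limit of -1 keeps trailing empty fields, which split drops by default.
my ($code, $place, @figures) = split /,/, $line, -1;

# Assumption: empty fields are allowed, so only non-empty ones are checked.
my @not_numeric = grep { length($_) && !looks_like_number($_) } @figures;
warn "fields contain non-numeric data: @not_numeric\n" if @not_numeric;

# Keep only the populated figures.
my @values = grep { length($_) } @figures;
print scalar(@values), " figures: @values\n";
```

If empty fields should count as errors instead, drop the length($_) guard from the first grep.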
Re: Regex problem
by Samy_rio (Vicar) on May 10, 2007 at 08:52 UTC
use strict;
use warnings;
while (<DATA>){
    if ($_ =~ m/^00CH,Gateshead MCD,([^\n]+)/){
        my @pos = split/\,/, $1;
        print "Position : $_ \t $pos[$_]\n" for (0..$#pos);
    }
    print '-' x 90, "\n";
}
__DATA__
00CH,Gateshead MCD,619,172.90,158.94,186.87,,537,87.42,79.61,95.23,,1156,126.89,119.29,134.50
00CH,Gateshead MCD,69,12.90,158.94,186.87,,537,87.4,7.61,95.23,,116,126.89,19.29,14.50
Regards, Velusamy R. eval"print uc\"\\c$_\""for split'','j)@,/6%@0%2,`e@3!-9v2)/@|6%,53!-9@2~j';
Re: Regex problem
by johngg (Canon) on May 10, 2007 at 09:19 UTC
Using a regex rather than split,
my @values = $line =~ m{(\d+(?:\.\d*)?)(?=\s*,|\z)}g;
seems to do the right thing.
I hope this is of use. Cheers, JohnGG
The m//g operator, and the split() function, are opposite sides of the same coin. If it's easier to describe the things you want to keep, use a match. If it's easier to describe the things you want to discard, use a split.
m{(\d+(?:\.\d*)?)(?=\s*,|\z)}g
split /,/
Which seems easier to write, read, and maintain?
-- [ e d @ h a l l e y . c c ]
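To make the match-versus-split comparison concrete, here is a small sketch that runs both approaches on the sample record and prints the results side by side. The variable names are illustrative only. Note the chomp: the lookahead in the match relies on \z, which cannot match immediately after the last figure if a trailing newline is still present.

```perl
use strict;
use warnings;

# Sample record from the question, as it would arrive from a file read.
my $line = "00CH,Gateshead MCD,619,172.90,158.94,186.87,,537,87.42,79.61,95.23,,1156,126.89,119.29,134.50\n";
chomp $line;   # without this, the final figure is lost by the \z lookahead

# Describe what to keep: digits with an optional fraction, followed by a comma or the end.
my @kept = $line =~ m{(\d+(?:\.\d*)?)(?=\s*,|\z)}g;

# Describe what to discard: the commas, then the two leading labels and any empty fields.
my (undef, undef, @rest) = split /,/, $line;
my @from_split = grep { length($_) } @rest;

print "match: @kept\n";
print "split: @from_split\n";
```

On this record both lists hold the same twelve figures; the split version needs an extra grep only because the data contains empty fields.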
Re: Regex problem
by RL (Monk) on May 10, 2007 at 09:11 UTC
my $string = '00CH,Gateshead MCD,619,172.90,158.94,186.87,,537,87.42,79.61,95.23,,1156,126.89,119.29,134.50';
my (@ary);
(undef,undef,@ary) = split(/\s*,\s*/, $string);
This should take care of whitespace before or after the commas as well.
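A quick sketch of that whitespace handling, using a made-up padded line (hypothetical input, not from the original data). One caveat worth showing: the split pattern only absorbs whitespace adjacent to a comma, so leading and trailing whitespace on the whole line has to be trimmed separately.

```perl
use strict;
use warnings;

# Hypothetical input with stray spaces around the commas.
my $string = "00CH , Gateshead MCD, 619 ,172.90,  158.94\n";

# Trim both ends first; the split pattern does not touch them.
$string =~ s/^\s+|\s+$//g;

# Skip the two leading label fields and keep the figures.
my (undef, undef, @ary) = split /\s*,\s*/, $string;
print join('|', @ary), "\n";
```

Whitespace inside a field, such as the space in "Gateshead MCD", is left alone.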