PerlMonks  

Regex problem

by Win (Novice)
on May 10, 2007 at 08:39 UTC

Win has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

How do I match the following line so that each figure is captured?
00CH,Gateshead MCD,619,172.90,158.94,186.87,,537,87.42,79.61,95.23,,1156,126.89,119.29,134.50
I have tried the following:
if ($_ =~ /^(00CH,Gateshead MCD)/){
    my $Place = $1;
    if ($line =~ /^$Place\,(\d(\d|\.)*)\,(\d(\d|\.)*)\,(\d(\d|\.)*)\,(\d(\d|\.)*)\,\,*(\d(\d|\.)*)\,(\d(\d|\.)*)\,(\d(\d|\.)*)\,(\d(\d|\.)*)\,\,*(\d(\d|\.)*)\,(\d(\d|\.)*)\,(\d(\d|\.)*)\,(\d(\d|\.)*)/) {
        # $first = $1;
        # etc
    }
}

Replies are listed 'Best First'.
Re: Regex problem
by gaal (Parson) on May 10, 2007 at 08:50 UTC
    You may find it easier to split on the comma:

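    my @data = split /,/;

    # How do you validate your data? Do you need to?
    # The sample line has 16 fields. The parentheses matter: without them,
    # "next" is evaluated as an argument to warn and the warning never prints.
    warn("bad line"), next if @data != 16; # or something...

    # Note: empty fields (the sample has two) are reported as non-numeric here.
    my @not_numeric = grep { length $_ == 0 || /[^-\d\.]/ } @data[2..$#data];
    warn("fields contain non-numeric data: @not_numeric"), next if @not_numeric;

    (There are better ways to check whether something is numeric; this was just quick'n'dirty since I don't know your real requirements.)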
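One of the "better ways" mentioned above could be `looks_like_number` from the core module Scalar::Util. The following is only a sketch under assumptions: it uses the sample line from the question with the wrapped field joined back to 1156, and the variable names are illustrative. Note that `looks_like_number` also accepts forms such as `1e5` or `Inf`, which may or may not suit the real data.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Sample line from the question (16 comma-separated fields, two of them empty).
my $line = '00CH,Gateshead MCD,619,172.90,158.94,186.87,,537,87.42,79.61,95.23,,1156,126.89,119.29,134.50';

# A limit of -1 keeps trailing empty fields, so the field count stays honest.
my @data = split /,/, $line, -1;

# Skip the two leading label fields, then separate numeric from non-numeric.
my @fields      = @data[2 .. $#data];
my @numeric     = grep {  looks_like_number($_) } @fields;
my @not_numeric = grep { !looks_like_number($_) } @fields;

printf "%d fields, %d numeric, %d not numeric\n",
    scalar @data, scalar @numeric, scalar @not_numeric;
```

Here the two empty fields end up in `@not_numeric`; whether an empty field is an error or simply a missing value depends on the requirements.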

    Or, if you want to go the CPAN route, Text::CSV::Simple and several other modules exist.

Re: Regex problem
by Samy_rio (Vicar) on May 10, 2007 at 08:52 UTC

    Hi, try it like this:

    use strict;
    use warnings;

    while (<DATA>){
        if ($_ =~ m/^00CH,Gateshead MCD,([^\n]+)/){
            my @pos = split/\,/, $1;
            print "Position : $_ \t $pos[$_]\n" for (0..$#pos);
        }
        print '-' x 90, "\n";
    }

    __DATA__
    00CH,Gateshead MCD,619,172.90,158.94,186.87,,537,87.42,79.61,95.23,,1156,126.89,119.29,134.50
    00CH,Gateshead MCD,69,12.90,158.94,186.87,,537,87.4,7.61,95.23,,116,126.89,19.29,14.50

    Regards,
    Velusamy R.


    eval"print uc\"\\c$_\""for split'','j)@,/6%@0%2,`e@3!-9v2)/@|6%,53!-9@2~j';

Re: Regex problem
by johngg (Canon) on May 10, 2007 at 09:19 UTC
    Using a regex rather than split,

    my @values = $line =~ m{(\d+(?:\.\d*)?)(?=\s*,|\z)}g;

    seems to do the right thing.

    I hope this is of use.

    Cheers,

    JohnGG

      The m//g operator, and the split() function, are opposite sides of the same coin. If it's easier to describe the things you want to keep, use a match. If it's easier to describe the things you want to discard, use a split.
      m{(\d+(?:\.\d*)?)(?=\s*,|\z)}g
      split /,/
      Which seems easier to write, read, and maintain?
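To make the comparison concrete, here is a small sketch that runs both approaches on the sample line from the question (with the wrapped field joined back to 1156). The variable names are illustrative only. The visible difference is that split keeps the empty fields between adjacent commas, while the match simply skips them.

```perl
use strict;
use warnings;

my $line = '00CH,Gateshead MCD,619,172.90,158.94,186.87,,537,87.42,79.61,95.23,,1156,126.89,119.29,134.50';

# Describe what to keep: numbers followed by a comma or the end of the string.
my @kept = $line =~ m{(\d+(?:\.\d*)?)(?=\s*,|\z)}g;

# Describe what to discard: the commas. The two label fields are thrown away.
my (undef, undef, @fields) = split /,/, $line;

printf "match: %d values, split: %d values\n", scalar @kept, scalar @fields;

# split keeps the empty fields between adjacent commas; the match skips them.
my @empty_positions = grep { $fields[$_] eq '' } 0 .. $#fields;
print "empty split fields at positions: @empty_positions\n";
```

If the position of each figure matters (for example, the ninth column is always the same measure), the split result is the safer one to index into, because the match result shifts whenever a field is empty.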

      --
      [ e d @ h a l l e y . c c ]

Re: Regex problem
by RL (Monk) on May 10, 2007 at 09:11 UTC
    my $string = '00CH,Gateshead MCD,619,172.90,158.94,186.87,,537,87.42,79.61,95.23,,1156,126.89,119.29,134.50';
    my (@ary);
    (undef,undef,@ary) = split(/\s*,\s*/, $string);

    This should also take care of whitespace before or after the commas.
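A short sketch of what that pattern does and does not absorb. The input string here is a made-up variant of the sample line with stray spaces and a trailing newline, not data from the question. Whitespace next to a comma is swallowed by the split, but whitespace at the very start or end of the string is not, so it is trimmed first.

```perl
use strict;
use warnings;

# Hypothetical variant of the sample line with stray spaces around some commas
# and a trailing newline, as it might arrive from a file.
my $string = "00CH, Gateshead MCD ,619 , 172.90,158.94,,537\n";

# Trim the ends first; the split pattern only absorbs whitespace next to commas.
(my $trimmed = $string) =~ s/^\s+|\s+$//g;

# Discard the two label fields and keep the rest.
my (undef, undef, @ary) = split /\s*,\s*/, $trimmed;

print join('|', @ary), "\n";   # prints 619|172.90|158.94||537
```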
