PerlMonks  

combined into a single regex

by arcnon (Monk)
on Dec 27, 2005 at 03:48 UTC ( [id://519243]=perlquestion )

arcnon has asked for the wisdom of the Perl Monks concerning the following question:

I realize that there is Test::Numeric but I can't get it to install.

Unfortunately, I can't seem to work [^\.\-] into a single regex. Here is what I am trying to accomplish:

my @data = (-123.004,-.008,0,-0,.0987,1.0,12345,'d','test');

foreach my $value (@data){
    $value = char_test($value);
    if ( $value =~/-?(\d?|\d+)\.?(\d?|\d+)/ && $value ne 'error' ){
        print "true\n";
    }
    else{
        print "false\n";
    }
}

sub char_test{
    my $value = shift;
    my @chars = $value =~/\D/g;
    foreach my $char (@chars){
        if ($char ne '-' && $char ne '.'){
            return 'error';
        }
    }
    return $value;
}

OUTPUT:
true
true
true
true
true
true
true
false
false

Replies are listed 'Best First'.
Re: combined into a single regex
by helphand (Pilgrim) on Dec 27, 2005 at 04:15 UTC

    Not entirely sure I understand your question, but assuming you are trying to eliminate the char_test stuff, try this...

    my @data = (-123.004,-.008,0,-0,.0987,1.0,12345,'d','test');

    foreach my $value (@data){
        if ( $value =~ /^-?(\d?|\d+)\.?(\d?|\d+)$/ ){
            print "true\n";
        }
        else{
            print "false\n";
        }
    }
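A small sketch of why the ^ and $ anchors matter here. Every piece of the original pattern is optional, so without anchors it can match the empty string at the start of any input. The helper names unanchored and anchored are made up for this illustration and are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only.
# Same pattern as in the question, without anchors.
sub unanchored { return $_[0] =~ /-?(\d?|\d+)\.?(\d?|\d+)/   ? 1 : 0 }
# Same pattern with ^ and $ added, as suggested in the reply.
sub anchored   { return $_[0] =~ /^-?(\d?|\d+)\.?(\d?|\d+)$/ ? 1 : 0 }

for my $value ('test', 'd', '12x', '12345') {
    printf "%-6s unanchored=%d anchored=%d\n",
        $value, unanchored($value), anchored($value);
}
```

The anchored version still accepts an empty string and "-.", because nothing in the pattern is required.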

      I think, this would be equivalent:
      /^-?(\d*)\.?(\d*)$/
      OTOH: I wouldn't consider "-." a legal number.

      So maybe this is better?
      /^-?((\d+\.\d*)|(\.?\d+))$/
      This wouldn't capture like yours does, but the captured parts aren't used. So you might want to consider using (?:...)...
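A sketch of the (?:...) idea: the same alternation with non-capturing groups, run against string inputs including "-." and the empty string. The sub name is_plain_number is an assumption for this example, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: the pattern above with non-capturing groups.
sub is_plain_number {
    my ($value) = @_;
    return $value =~ /^-?(?:\d+\.\d*|\.?\d+)$/ ? 1 : 0;
}

# Strings are used so that Perl does not normalize the values first.
for my $value ('-123.004', '-.008', '0', '1.', '.0987', '-.', '', 'd') {
    printf "%-10s %s\n", "'$value'", is_plain_number($value) ? 'true' : 'false';
}
```

Because one of the two alternatives must consume at least one digit, "-." and the empty string no longer pass.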


      s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%
      +.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e
      Thanks, that's it.

      I made a little change to your regexp and tried to analyze it:

      my @data = (-123.004,-.008,0,-0,.0987,1.1,12345,'d','test');

      foreach my $value (@data){
          if ( $value =~ /(^-?)(\d?|\d+)(\.?)(\d?|\d+)$/){
              print "$1 - $2 - $3 - $4 true\n";
          }
          else{
              print "false\n";
          }
      }

      For this analysis I skipped the first group ($1), which I think I understand, and went straight to the second group.

      For $value = 123.004:
      1. Move on to the second group and try the first alternative, (\d?).
      2. "123" doesn't match (\d?), since (\d?) matches a digit 0 or 1 times.
      3. Backtrack 1 character.
      4. Try the second alternative in the second group, (\d+), which matches a digit 1 or more times. It matches, so we get "123" for $2.
      5. Move on to the third group: (\.?) matches "." 0 or 1 times, so we get "." for $3.
      6. Move on to the fourth group: the first alternative doesn't match, so backtrack 1 character and try the second alternative, which matches "004". So we get "004" for $4.

      For $value = 12345:
      1. Move on to the second group and try the first alternative, (\d?).
      2. "12345" doesn't match (\d?).
      3. Backtrack 1 character.
      4. Try the second alternative in the second group, (\d+), which matches a digit 1 or more times, and we get "1" for $2. I am not sure about this; I keep thinking that we should get "12345" for $2 (does it have something to do with the third group, (\.?)?).
      5. Move on to the third group: "2345" doesn't match (\.?).
      6. Move on to the fourth group: "2345" doesn't match (\d?); (\d?) only matches the "2" in "2345".
      7. Backtrack 1 character and try the second alternative, (\d+); "2345" matches (\d+), which matches a digit 1 or more times.

      Sorry for my English, zak

        One thing you are missing is that (\d?) doesn't fail and backtrack as you describe; it succeeds in matching the "1" in both cases and then applies the rest of the regex (\.?)(\d?|\d+)$ to what comes after the "1". Only if that rest of the regex fails will it try first having \d? match 0 digits and applying the rest of the regex to the whole string and then the \d+ alternative, matching first as many digits as possible, then successively fewer until the end of the regex matches.

        But for "12345", it does almost no backtracking; \d? matches the "1", \.? doesn't match once but succeeds at matching 0 times, the second \d? matches the "2", but the $ doesn't match so \d? tries matching 0 times, $ still doesn't match, so the second \d+ is tried, matching "2345", and then $ matches the end of string.

        (\d?|\d+) is a very strange construct; it says to try matching in this order: 1 digit, 0 digits, N digits, N-1 digits, N-2 digits, ..., 2 digits. I can't believe that's really what you want. Do you want something as simple as: /^(-?)(\d+)(\.?)(\d*)$/
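To see the behavior described above, here is a sketch that prints the four captures for the two traced values, once with the (\d?|\d+) pattern and once with the simpler pattern. The sub names captures_odd and captures_simple are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helpers: each returns the list of captures ($1..$4),
# or an empty list if the pattern does not match.
sub captures_odd {
    my ($value) = @_;
    return $value =~ /(^-?)(\d?|\d+)(\.?)(\d?|\d+)$/;
}

sub captures_simple {
    my ($value) = @_;
    return $value =~ /^(-?)(\d+)(\.?)(\d*)$/;
}

for my $value ('123.004', '12345') {
    print "$value: odd=[", join('|', captures_odd($value)),
          "] simple=[", join('|', captures_simple($value)), "]\n";
}
```

For "12345" the odd pattern leaves "1" in $2 and "2345" in $4, since \d? succeeds on the first digit and the rest of the pattern can still reach the end of the string. The simple pattern puts all five digits in $2.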

Re: combined into a single regex
by gloryhack (Deacon) on Dec 27, 2005 at 08:50 UTC
    My generic "it's a number" regex, the first case you might not want:
    my $is_number = qr/^
        [+-]?                              # optional sign
        (                                  # then:
         \d{1,3}(\,\d\d\d)+(\.(\d+)?)?     # n,nnn.nn
         |\d+\.\d+                         # n.n
         |\d+\.                            # n.
         |\.\d+                            # .n
         |\d+                              # n
        )
        ([eE][+-]?\d+)?                    # optional exponent
    $/xo;
    Running your data through it:
    #!/usr/bin/perl
    use strict;
    use warnings;

    my $regex = qr/^[+-]?(\d{1,3}(\,\d\d\d)+(\.(\d+)?)?|\d+\.\d+|\d+\.|\.\d+|\d+)([eE][+-]?\d+)?$/o;
    my @data = (-123.004,-.008,0,-0,.0987,1.0,12345,'d','test');

    foreach (@data) {
        if ($_ =~ /$regex/) {
            print "$_: true\n";
        }
        else {
            print "$_: false\n";
        }
    }
    exit;
    Yields:
    -123.004: true
    -0.008: true
    0: true
    0: true
    0.0987: true
    1: true
    12345: true
    d: false
    test: false
    I've never tried to optimize it, but it seems to work well just the same.
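The sample data does not exercise the comma-grouped or exponent branches, so here is a sketch that runs the same pattern against a few string inputs that do. The sub name is_number and the test strings are chosen for this illustration only.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the pattern from the post above.
sub is_number {
    my ($value) = @_;
    return $value =~ /^[+-]?(\d{1,3}(\,\d\d\d)+(\.(\d+)?)?|\d+\.\d+|\d+\.|\.\d+|\d+)([eE][+-]?\d+)?$/
        ? 1 : 0;
}

# Comma-grouped, exponent, and malformed grouping cases.
for my $value ('1,234.56', '1,234', '12,34', '1,2345', '1e10', '+.5e-3') {
    printf "%-9s %s\n", $value, is_number($value) ? 'true' : 'false';
}
```

If thousands separators should not count as numeric in your data, dropping the first alternative is the obvious change.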
Re: combined into a single regex
by john_oshea (Priest) on Dec 27, 2005 at 10:22 UTC

    For completeness, I'd have to mention Regexp::Common, in particular Regexp::Common::number, which allows you to use expressions like the following (lifted shamelessly from the 'DESCRIPTION'):

    • $RE{num}{int}{-base}{-sep}{-group}{-places}
    • $RE{num}{real}{-base}{-radix}{-places}{-sep}{-group}{-expon}
    • $RE{num}{dec}{-radix}{-places}{-sep}{-group}{-expon}
    • $RE{num}{oct}{-radix}{-places}{-sep}{-group}{-expon}
    • $RE{num}{bin}{-radix}{-places}{-sep}{-group}{-expon}
    • $RE{num}{hex}{-radix}{-places}{-sep}{-group}{-expon}
    • $RE{num}{decimal}{-base}{-radix}{-places}{-sep}{-group}
    • $RE{num}{square}
    • $RE{num}{roman}
Re: combined into a single regex
by doctor_moron (Scribe) on Dec 27, 2005 at 08:12 UTC
    my @data = (-123.004,-.008,0,-0,.0987,1.1,1.0,1.0001,12345,'d','test');

    foreach my $value (@data){
        if($value =~ /(\d.*)/) {
            print "$1 is true\n";
        }
        else{
            print "false\n";
        }
    }

    Update

    I added more values to @data, so @data becomes:

    my @data = (-123.004,-.008,0,-0,.0987,1.1,1.0,1.0001,12345,'d','test','d5','5d','0d5','d0d');

    and /(\d.*)/ doesn't work for this @data, because it is unanchored and accepts any string containing a digit.

    Solved it with /^(-?)(\d+)(\.?)(\d*)$/

    Have a look at skeeve's || ysth's code
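A sketch that checks /^(-?)(\d+)(\.?)(\d*)$/ against the extra values, plus one caveat worth knowing: numeric literals such as -.008 are stringified by Perl as "-0.008" before the match, while the string '-.008' is matched as written and fails because \d+ needs a digit before the dot. The sub name passes is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the pattern from the update above.
sub passes { return $_[0] =~ /^(-?)(\d+)(\.?)(\d*)$/ ? 1 : 0 }

# The first two values are numeric literals; the rest are strings.
for my $value (-.008, .0987, 'd5', '5d', '0d5', 'd0d', '.0987', '-.008') {
    printf "%-10s %s\n", "\"$value\"", passes($value) ? 'true' : 'false';
}
```

If the input arrives as text (from a file or a form, say), a leading-dot value will be rejected by this pattern.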

Re: combined into a single regex
by borisz (Canon) on Dec 27, 2005 at 08:57 UTC
    What about using Scalar::Util?
    use Scalar::Util qw/looks_like_number/;

    my @data = ( -123.004, -.008, 0, -0, .0987, 1.0, 12345, 'd', 'test' );
    foreach my $value (@data) {
        print( looks_like_number($value) ? "true\n" : "false\n" );
    }

    __OUTPUT__
    true
    true
    true
    true
    true
    true
    true
    false
    false
    Boris
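One thing to keep in mind is that looks_like_number follows Perl's own idea of a number, which is broader than the plain decimal patterns discussed in this thread. The sketch below compares the two on a few strings; plain_regex_ok is a hypothetical helper using one of the patterns from an earlier reply.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: a plain signed-decimal pattern, no exponent support.
sub plain_regex_ok { return $_[0] =~ /^-?(?:\d+\.\d*|\.?\d+)$/ ? 1 : 0 }

for my $value ('12345', '-.008', '1e5', '0x10', 'test') {
    printf "%-6s looks_like_number=%d regex=%d\n",
        $value, (looks_like_number($value) ? 1 : 0), plain_regex_ok($value);
}
```

The two disagree on '1e5': exponent notation is numeric to Perl but is not covered by the plain pattern. Which answer is right depends on what the data is allowed to contain.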
