arcnon has asked for the wisdom of the Perl Monks concerning the following question:
I realize that there is Test::Numeric, but I can't get it to install. Unfortunately, I can't seem to get [^\.\-] to work in a single regex. Here is what I am trying to accomplish.
my @data = (-123.004,-.008,0,-0,.0987,1.0,12345,'d','test');
foreach my $value (@data){
    $value = char_test($value);
    if ( $value =~/-?(\d?|\d+)\.?(\d?|\d+)/ && $value ne 'error' ){
        print "true\n";
    }
    else{
        print "false\n";
    }
}
sub char_test{
    my $value = shift;
    my @chars = $value =~/\D/g;
    foreach my $char (@chars){
        if ($char ne '-' && $char ne '.'){
            return 'error';
        }
    }
    return $value;
}
OUTPUT:
true
true
true
true
true
true
true
false
false
Re: combined into a single regex
by helphand (Pilgrim) on Dec 27, 2005 at 04:15 UTC
Not entirely sure I understand your question, but assuming you are trying to eliminate the char_test stuff, try this...
my @data = (-123.004,-.008,0,-0,.0987,1.0,12345,'d','test');
foreach my $value (@data){
    if ( $value =~ /^-?(\d?|\d+)\.?(\d?|\d+)$/ ){
        print "true\n";
    }
    else{
        print "false\n";
    }
}
I think this would be equivalent:
/^-?(\d*)\.?(\d*)$/
OTOH: I wouldn't consider "-." a legal number.
So maybe this is better?
/^-?((\d+\.\d*)|(\.?\d+))$/
This wouldn't capture like yours does, but the captured parts aren't used. So you might want to consider using (?:...)...
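For what it's worth, here is a minimal sketch of that last suggestion, with the capturing groups swapped for (?:...). The sub name is_plain_number and the string inputs are assumptions made up for illustration; they are not part of the original data.

```perl
use strict;
use warnings;

# Hypothetical helper: the second regex above, rewritten with
# non-capturing groups since the captures are never used.
sub is_plain_number {
    my ($value) = @_;
    return $value =~ /^-?(?:\d+\.\d*|\.?\d+)$/ ? 1 : 0;
}

# Strings rather than numeric literals, so Perl does not normalise
# them (e.g. -.008 to -0.008) before the regex sees them.
for my $value ('-123.004', '-.008', '0', '5.', '.5', '-.', '.', '', 'd') {
    printf "%-10s %s\n", "'$value'", is_plain_number($value) ? 'true' : 'false';
}
```

With this pattern '-.', '.', the empty string and 'd' come out false, while '5.' and '.5' still pass.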
my @data = (-123.004,-.008,0,-0,.0987,1.1,12345,'d','test');
foreach my $value (@data){
    if ( $value =~ /(^-?)(\d?|\d+)(\.?)(\d?|\d+)$/){
        print "$1 - $2 - $3 - $4 true\n";
    }
    else{
        print "false\n";
    }
}
For this analysis I skipped the first group ($1), which I think I understand, and went straight to the second group.
For $value = 123.004:
1. Move on to the second group and pick the first alternative, (\d?).
2. "123" doesn't match (\d?), since (\d?) matches a digit 0 or 1 times.
3. Backtrack 1 character.
4. Pick the second alternative in the second group, (\d+), which matches a digit 1 or more times. It matches, so we get "123" for $2.
5. Move on to the third group: (\.?) matches "." 0 or 1 times, so we get "." for $3.
6. Move on to the fourth group: the first alternative doesn't match, so backtrack 1 character and try the second alternative, which matches "004", so we get "004" for $4.
For $value = 12345:
1. Move on to the second group and pick the first alternative, (\d?).
2. "12345" doesn't match (\d?).
3. Backtrack 1 character.
4. Pick the second alternative in the second group, (\d+), which matches a digit 1 or more times, and we get "1" for $2. I am not sure about this; I keep thinking we should get "12345" for $2. (Is it something to do with the third group, (\.?)?)
5. Move on to the third group: "2345" doesn't match (\.?).
6. Move on to the fourth group: "2345" doesn't match (\d?); (\d?) only matches the "2" in "2345".
7. Backtrack 1 character and try the second alternative, (\d+); "2345" matches (\d+), a digit 1 or more times.
Sorry for my English, zak
One thing you are missing is that (\d?) doesn't fail and backtrack as you describe; it succeeds in matching the "1" in both cases and then applies the rest of the regex (\.?)(\d?|\d+)$ to what comes after the "1".
Only if that rest of the regex fails will it try first having \d? match 0 digits and applying the rest of the regex to the whole string and then the \d+ alternative, matching first as many digits as possible, then successively fewer until the end of the regex matches.
But for "12345", it does almost no backtracking; \d? matches the "1", \.? doesn't match once but succeeds at matching 0 times, the second \d? matches the "2", but the $ doesn't match so \d? tries matching 0 times, $ still doesn't match, so the second \d+ is tried, matching "2345", and then $ matches the end of string.
(\d?|\d+) is a very strange construct; it says to try matching in this order: 1 digit, 0 digits, N digits, N-1 digits, N-2 digits, ..., 2 digits. I can't believe that's really what you want. Do you want something as simple as:
/^(-?)(\d+)(\.?)(\d*)$/
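If you want to see that order of attempts for yourself, a small sketch along these lines prints what each group ends up holding. The helper name captures is made up for illustration; the regex is the one under discussion, unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the four captures in list context,
# or an empty list if the match fails.
sub captures {
    my ($value) = @_;
    return $value =~ /(^-?)(\d?|\d+)(\.?)(\d?|\d+)$/;
}

for my $value ('12345', '-123.004') {
    my @groups = captures($value);
    print "$value => ", join(' | ', map { "'$_'" } @groups), "\n";
}
# 12345 => '' | '1' | '' | '2345'
# -123.004 => '-' | '123' | '.' | '004'
```

For "12345" the first \d? keeps the "1" and the final \d+ picks up "2345". For "-123.004" the \d? attempts all fail further along, so the engine falls back to \d+ in the second group and $2 ends up as "123".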
Re: combined into a single regex
by gloryhack (Deacon) on Dec 27, 2005 at 08:50 UTC
My generic "it's a number" regex, the first case you might not want:
my $is_number = qr/^
    [+-]?                               # optional sign
    (                                   # then:
        \d{1,3}(\,\d\d\d)+(\.(\d+)?)?   # n,nnn.nn
      | \d+\.\d+                        # n.n
      | \d+\.                           # n.
      | \.\d+                           # .n
      | \d+                             # n
    )
    ([eE][+-]?\d+)?                     # optional exponent
$/xo;
Running your data through it:
#!/usr/bin/perl
use strict;
use warnings;
my $regex = qr/^[+-]?(\d{1,3}(\,\d\d\d)+(\.(\d+)?)?|\d+\.\d+|\d+\.|\.\d+|\d+)([eE][+-]?\d+)?$/o;
my @data = (-123.004,-.008,0,-0,.0987,1.0,12345,'d','test');
foreach (@data) {
    if ($_ =~ /$regex/) {
        print "$_: true\n";
    } else {
        print "$_: false\n";
    }
}
exit;
Yields:
-123.004: true
-0.008: true
0: true
0: true
0.0987: true
1: true
12345: true
d: false
test: false
I've never tried to optimize it, but it seems to work well just the same.
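The sample data never exercises the comma-grouped or exponent branches, so here is a rough sketch that does. The regex is the one above (comments and the /o flag dropped, layout only); the input strings are made up for illustration and are not from the original data.

```perl
use strict;
use warnings;

# Same pattern as in the post above; only the layout differs.
my $is_number = qr/^
    [+-]?
    (
        \d{1,3}(\,\d\d\d)+(\.(\d+)?)?
      | \d+\.\d+
      | \d+\.
      | \.\d+
      | \d+
    )
    ([eE][+-]?\d+)?
$/x;

# Hypothetical inputs chosen to hit the comma and exponent alternatives.
for my $value ('1,234.5', '12,345,678', '1234,567', '1e10', '+.5', '1,23') {
    printf "%-12s %s\n", $value, $value =~ $is_number ? 'true' : 'false';
}
```

Here '1234,567' and '1,23' are rejected because the comma groups are not well formed; the other four pass.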
Re: combined into a single regex
by john_oshea (Priest) on Dec 27, 2005 at 10:22 UTC
For completeness, I'd have to mention Regexp::Common, in particular Regexp::Common::number, which allows you to use expressions like the following (lifted shamelessly from the 'DESCRIPTION'):
- $RE{num}{int}{-base}{-sep}{-group}{-places}
- $RE{num}{real}{-base}{-radix}{-places}{-sep}{-group}{-expon}
- $RE{num}{dec}{-radix}{-places}{-sep}{-group}{-expon}
- $RE{num}{oct}{-radix}{-places}{-sep}{-group}{-expon}
- $RE{num}{bin}{-radix}{-places}{-sep}{-group}{-expon}
- $RE{num}{hex}{-radix}{-places}{-sep}{-group}{-expon}
- $RE{num}{decimal}{-base}{-radix}{-places}{-sep}{-group}
- $RE{num}{square}
- $RE{num}{roman}
Re: combined into a single regex
by doctor_moron (Scribe) on Dec 27, 2005 at 08:12 UTC
my @data = (-123.004,-.008,0,-0,.0987,1.1,1.0,1.0001,12345,'d','test');
foreach my $value (@data){
    if($value =~ /(\d.*)/) {
        print "$1 is true\n";
    }
    else{
        print "false\n";
    }
}
Update
I added more values to @data, so @data becomes:
my @data = (-123.004,-.008,0,-0,.0987,1.1,1.0,1.0001,12345,'d','test','d5','5d', '0d5', 'd0d');
and /(\d.*)/ won't work for this @data.
Solved it with /^(-?)(\d+)(\.?)(\d*)$/
Have a look at skeeve's || ysth's code.
Re: combined into a single regex
by borisz (Canon) on Dec 27, 2005 at 08:57 UTC
use Scalar::Util qw/looks_like_number/;
my @data = ( -123.004, -.008, 0, -0, .0987, 1.0, 12345, 'd', 'test' );
foreach my $value (@data) {
    print( looks_like_number($value) ? "true\n" : "false\n" );
}
__OUTPUT__
true
true
true
true
true
true
true
false
false
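One thing to keep in mind: looks_like_number follows Perl's own idea of a number, which is broader than the hand-written regexes in this thread. A quick sketch with a few extra inputs (made up for illustration, not from the original data):

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical inputs: exponent notation is accepted,
# a hex-looking string, a bare "-." and the empty string are not.
for my $value ('1e5', '0x10', '-.', '') {
    printf "%-6s %s\n", "'$value'", looks_like_number($value) ? 'true' : 'false';
}
```

If exponent notation should not count as a number for your data, one of the anchored regexes above is probably the safer choice.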