combined into a single regex

arcnon has asked for the wisdom of the Perl Monks concerning the following question:

Re: combined into a single regex by helphand (Pilgrim) on Dec 27, 2005 at 04:15 UTC
Not entirely sure I understand your question, but assuming you are trying to eliminate the char_test stuff, try this:

```perl
my @data = (-123.004, -.008, 0, -0, .0987, 1.0, 12345, 'd', 'test');

foreach my $value (@data) {
    if ( $value =~ /^-?(\d?|\d+)\.?(\d?|\d+)$/ ) {
        print "true\n";
    }
    else {
        print "false\n";
    }
}
```
Re^2: combined into a single regex by Skeeve (Parson) on Dec 27, 2005 at 06:46 UTC
I think this would be equivalent: `/^-?(\d*)\.?(\d*)$/`

OTOH, I wouldn't consider "-." a legal number. So maybe this is better? `/^-?((\d+\.\d*)|(\.?\d+))$/`

This wouldn't capture like yours does, but the captured parts aren't used, so you might want to consider using `(?:...)`.

`s$$([},&%#}/&/]+}%&{});#$&&s&&$^X.($'^"%]=\&(\|?{%` `+`.+=%;.#_}\&"^"-+%).}%:##%}={~=~:.")&e&&s""`$''`"e
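A minimal sketch comparing the two patterns discussed in this reply. It assumes the quantifiers are `\d*` (which is what makes the first pattern equivalent to `(\d?|\d+)`) and uses non-capturing groups as suggested. The helper name `verdict` and the extra test strings are illustrative, not from the thread.

```perl
use strict;
use warnings;

# Assumed forms of the two patterns from this reply.
my $loose  = qr/^-?\d*\.?\d*$/;             # every part is optional
my $strict = qr/^-?(?:\d+\.\d*|\.?\d+)$/;   # requires at least one digit

# Hypothetical helper: report whether a string matches a pattern.
sub verdict {
    my ($re, $string) = @_;
    return $string =~ $re ? 'true' : 'false';
}

for my $string ('-123.004', '.0987', '12345', '1.', '-.', '-', '', 'test') {
    printf "%-12s loose=%-5s strict=%s\n",
        "'$string'", verdict($loose, $string), verdict($strict, $string);
}
```

Because every piece of the loose pattern is optional, it also accepts `'-.'`, `'-'`, and the empty string. The stricter pattern rejects those three while still accepting `'1.'` and `'.0987'`.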
Re^2: combined into a single regex by arcnon (Monk) on Dec 27, 2005 at 04:19 UTC
Thanks, that's it.
Re^2: combined into a single regex by doctor_moron (Scribe) on Dec 27, 2005 at 09:37 UTC
I made a little change to your regexp and tried to analyze it:

```perl
my @data = (-123.004, -.008, 0, -0, .0987, 1.1, 12345, 'd', 'test');

foreach my $value (@data) {
    if ( $value =~ /(^-?)(\d?|\d+)(\.?)(\d?|\d+)$/ ) {
        print "$1 - $2 - $3 - $4 true\n";
    }
    else {
        print "false\n";
    }
}
```

For this analysis I ignored the first group ($1), since I think I understand it, so we go straight to the second group.

For $value = 123.004:

1. Move on to the second group and pick the first alternative, (\d?).
2. "123" doesn't match (\d?), since (\d?) matches a digit 1 or 0 times.
3. Backtrack 1 character.
4. Pick the second alternative in the second group, (\d+), which matches a digit 1 or more times. It matches, so we get "123" for $2.
5. Move on to the third group: (\.?) matches "." 1 or 0 times, so we get "." for $3.
6. Move on to the fourth group: the first alternative doesn't match, so backtrack 1 character, try the second alternative, and match "004", so we get "004" for $4.

For $value = 12345:

1. Move on to the second group and pick the first alternative, (\d?).
2. "12345" doesn't match (\d?).
3. Backtrack 1 character.
4. Pick the second alternative in the second group, (\d+), which matches a digit 1 or more times, and we get "1" for $2. I am not sure about this; I keep thinking that we should get "12345" for $2 (is it something to do with the third group, (\.?)?).
5. Move on to the third group: "2345" doesn't match (\.?).
6. Move on to the fourth group: "2345" doesn't match (\d?), since (\d?) only matches "2" in "2345".
7. Backtrack 1 character and try the second alternative, (\d+); "2345" matches (\d+), a digit 1 or more times.

Sorry for my English, zak
Re^3: combined into a single regex by ysth (Canon) on Dec 27, 2005 at 09:59 UTC
One thing you are missing is that (\d?) doesn't fail and backtrack as you describe; it succeeds in matching the "1" in both cases and then applies the rest of the regex `(\.?)(\d?|\d+)$` to what comes after the "1". Only if that rest of the regex fails will it try first having \d? match 0 digits and applying the rest of the regex to the whole string, and then the \d+ alternative, matching first as many digits as possible, then successively fewer until the end of the regex matches.

But for "12345", it does almost no backtracking: \d? matches the "1"; \.? doesn't match once but succeeds at matching 0 times; the second \d? matches the "2", but the $ doesn't match, so \d? tries matching 0 times; $ still doesn't match, so the second \d+ is tried, matching "2345"; and then $ matches the end of string.

(\d?|\d+) is a very strange construct; it says to try matching in this order: 1 digit, 0 digits, N digits, N-1 digits, N-2 digits, ..., 2 digits. I can't believe that's really what you want. Do you want something as simple as `/^(-?)(\d+)(\.?)(\d*)$/`?
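The matching order described above can be checked by printing the capture groups. This is a small sketch, not part of the original reply: the helper `show_captures` is a hypothetical name, and the inputs are passed as strings so that no numeric stringification is involved.

```perl
use strict;
use warnings;

# The four-group pattern being analyzed in this subthread.
my $alternation = qr/(^-?)(\d?|\d+)(\.?)(\d?|\d+)$/;
# The simpler pattern suggested at the end of the reply above.
my $simple      = qr/^(-?)(\d+)(\.?)(\d*)$/;

# Hypothetical helper: return the captures as "[..] - [..]", or "no match".
sub show_captures {
    my ($re, $string) = @_;
    my @groups = $string =~ $re;    # list context returns ($1, $2, $3, $4)
    return @groups ? join(' - ', map { "[$_]" } @groups) : 'no match';
}

for my $string ('12345', '-123.004') {
    print "$string\n";
    print "  alternation: ", show_captures($alternation, $string), "\n";
    print "  simple:      ", show_captures($simple, $string), "\n";
}
```

For "12345" the alternation pattern yields `[] - [1] - [] - [2345]`, which is the "1" in $2 that prompted the question, while the simple pattern yields `[] - [12345] - [] - []`. For "-123.004" both give `[-] - [123] - [.] - [004]`.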
Re^4: combined into a single regex by doctor_moron (Scribe) on Dec 30, 2005 at 13:00 UTC
Re: combined into a single regex by gloryhack (Deacon) on Dec 27, 2005 at 08:50 UTC
My generic "it's a number" regex, the first case you might not want:

```perl
my $is_number = qr/^
    [+-]?                               # optional sign
    (                                   # then:
        \d{1,3}(\,\d\d\d)+(\.(\d+)?)?   #   n,nnn.nn
      | \d+\.\d+                        #   n.n
      | \d+\.                           #   n.
      | \.\d+                           #   .n
      | \d+                             #   n
    )
    ([eE][+-]?\d+)?                     # optional exponent
$/xo;
```

Running your data through it:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $regex = qr/^[+-]?(\d{1,3}(\,\d\d\d)+(\.(\d+)?)?|\d+\.\d+|\d+\.|\.\d+|\d+)([eE][+-]?\d+)?$/o;

my @data = (-123.004, -.008, 0, -0, .0987, 1.0, 12345, 'd', 'test');

foreach (@data) {
    if ($_ =~ /$regex/) {
        print "$_: true\n";
    }
    else {
        print "$_: false\n";
    }
}
exit;
```

Yields:

```
-123.004: true
-0.008: true
0: true
0: true
0.0987: true
1: true
12345: true
d: false
test: false
```

I've never tried to optimize it, but it seems to work well just the same.
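The sample data above never reaches the comma-grouped or exponent branches, so here is a hedged sketch that exercises them with string inputs. The test strings and the `is_number` helper are illustrative additions, not part of the original post.

```perl
use strict;
use warnings;

# Same pattern as above, written on one line.
my $is_number = qr/^[+-]?(\d{1,3}(\,\d\d\d)+(\.(\d+)?)?|\d+\.\d+|\d+\.|\.\d+|\d+)([eE][+-]?\d+)?$/;

# Hypothetical helper: 1 if the string matches the pattern, 0 otherwise.
sub is_number {
    my ($string) = @_;
    return $string =~ $is_number ? 1 : 0;
}

for my $string ('1,234.56', '12,345', '1234,567', '1,23', '6.02e23', '+.5', '1e') {
    printf "%-10s %s\n", $string, is_number($string) ? 'true' : 'false';
}
```

Thousands separators are accepted only in groups of exactly three digits after a leading group of one to three, so `'1,234.56'` and `'12,345'` match while `'1234,567'` and `'1,23'` do not. An exponent marker with no digits, as in `'1e'`, is rejected.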
Re: combined into a single regex by john_oshea (Priest) on Dec 27, 2005 at 10:22 UTC
For completeness, I'd have to mention Regexp::Common, in particular Regexp::Common::number, which allows you to use expressions like the following (lifted shamelessly from the 'DESCRIPTION'):

`$RE{num}{int}{-base}{-sep}{-group}{-places}`
`$RE{num}{real}{-base}{-radix}{-places}{-sep}{-group}{-expon}`
`$RE{num}{dec}{-radix}{-places}{-sep}{-group}{-expon}`
`$RE{num}{oct}{-radix}{-places}{-sep}{-group}{-expon}`
`$RE{num}{bin}{-radix}{-places}{-sep}{-group}{-expon}`
`$RE{num}{hex}{-radix}{-places}{-sep}{-group}{-expon}`
`$RE{num}{decimal}{-base}{-radix}{-places}{-sep}{-group}`
`$RE{num}{square}`
`$RE{num}{roman}`
Re: combined into a single regex by doctor_moron (Scribe) on Dec 27, 2005 at 08:12 UTC
```perl
my @data = (-123.004, -.008, 0, -0, .0987, 1.1, 1.0, 1.0001, 12345, 'd', 'test');

foreach my $value (@data) {
    if ($value =~ /(\d.)/) {
        print "$1 is true\n";
    }
    else {
        print "false\n";
    }
}
```

Update: I added more values to @data, so @data becomes:

```perl
my @data = (-123.004, -.008, 0, -0, .0987, 1.1, 1.0, 1.0001, 12345, 'd', 'test',
            'd5', '5d', '0d5', 'd0d');
```

and /(\d.)/ won't work for this @data. Solved it with /^(-?)(\d+)(\.?)(\d*)$/. Have a look at Skeeve's || ysth's code.
Re: combined into a single regex by borisz (Canon) on Dec 27, 2005 at 08:57 UTC
What about using Scalar::Util?

```perl
use Scalar::Util qw/looks_like_number/;

my @data = ( -123.004, -.008, 0, -0, .0987, 1.0, 12345, 'd', 'test' );

foreach my $value (@data) {
    print( looks_like_number($value) ? "true\n" : "false\n" );
}
```

Output:

```
true
true
true
true
true
true
true
false
false
```

Boris
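For comparison, a short sketch that runs a few strings through both `looks_like_number` and the `/^(-?)(\d+)(\.?)(\d*)$/` pattern from earlier in the thread. The `compare` helper and the string inputs are illustrative assumptions. Note that the thread's @data holds numeric literals, which Perl stringifies before matching (-.008 becomes "-0.008" in the output shown earlier), so string input such as '-.008' can behave differently.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Pattern taken from earlier in the thread.
my $regex = qr/^(-?)(\d+)(\.?)(\d*)$/;

# Hypothetical helper: one report line per input string.
sub compare {
    my ($string) = @_;
    return sprintf "%-8s regex=%-5s looks_like_number=%s", $string,
        ( $string =~ $regex           ? 'true' : 'false' ),
        ( looks_like_number($string)  ? 'true' : 'false' );
}

print compare($_), "\n" for '12345', '-0.008', '-.008', '+3', '1e5', '12abc';
```

Both agree on '12345', '-0.008', and '12abc'. The strings '-.008', '+3', and '1e5' are numbers as far as `looks_like_number` is concerned, but the regex rejects them because it requires a leading digit and allows neither a plus sign nor an exponent.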

