Re: regular expression help
by gellyfish (Monsignor) on Jul 26, 2005 at 19:25 UTC
|
Personally, I would use a simple regular expression to split the date string into year, month and day, and then use one of the popular date-handling modules to do the validation. A regular expression that gets the correct number of days for each month is going to be monstrous, if not impossible.
/J\
| [reply] |
Re: regular expression help
by Tanktalus (Canon) on Jul 26, 2005 at 19:25 UTC
|
I think the simple, straightforward answer is: don't. Use a regular expression to pull apart the data and do minor syntax checking, and use date-related modules to do the semantic checking.
| [reply] |
|
Here's a solution that uses the core module Time::Local:
use Time::Local qw( timegm );
/(\d{4})[-\/.](\d{2})[-\/.](\d{2})/
    or die("Bad format\n");
# timegm expects a 0-based month (0..11), hence $2 - 1.
my $time = eval { timegm(0, 0, 0, $3, $2 - 1, $1) };
die("Bad date\n") if $@;
Switch timegm for timelocal if you prefer. | [reply] [d/l] [select] |
Re: regular expression help
by kwaping (Priest) on Jul 26, 2005 at 19:40 UTC
|
I highly recommend exploring Date::Calc; that module is really great for this kind of thing. | [reply] |
Re: regular expression help
by Codon (Friar) on Jul 26, 2005 at 20:35 UTC
|
You're trying to make a regex do the work of a subroutine. Checking for leap days requires divisibility checks on the year, but you only need to worry about that if the month is February and the day is 29.
Date::Calc (mentioned above) has a check_date() function that can do exactly what you are looking for (provided you split the date into year, month and day first).
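As a rough sketch of the divisibility checks described above, assuming the date has already been split into numeric year, month and day. The subroutine names (is_leap_year, days_in_month, is_valid_ymd) are made up for illustration and are not part of Date::Calc; check_date() is still the less error-prone route.

```perl
use strict;
use warnings;

# Hypothetical helper: Gregorian leap-year rule.
sub is_leap_year {
    my ($year) = @_;
    return 1 if $year % 400 == 0;    # 1600, 2000, 2400 ...
    return 0 if $year % 100 == 0;    # 1800, 1900, 2100 ...
    return $year % 4 == 0 ? 1 : 0;
}

# Hypothetical helper: number of days in month 1..12 of the given year.
sub days_in_month {
    my ( $year, $month ) = @_;
    my @days = ( 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 );
    # The leap check only matters for February.
    return 29 if $month == 2 && is_leap_year($year);
    return $days[ $month - 1 ];
}

# Hypothetical validator for an already-split date.
sub is_valid_ymd {
    my ( $year, $month, $day ) = @_;
    return 0 if $month < 1 || $month > 12;
    return 0 if $day < 1 || $day > days_in_month( $year, $month );
    return 1;
}

print is_valid_ymd( 2004, 2, 29 ) ? "valid\n" : "not valid\n";    # prints "valid"
print is_valid_ymd( 1900, 2, 29 ) ? "valid\n" : "not valid\n";    # prints "not valid"
```

Note that this only covers the Gregorian rules; it does not apply any year range such as 1753 to 9999.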
Ivan Heffner
Sr. Software Engineer, DAS Lead
WhitePages.com, Inc.
| [reply] |
Re: regular expression help
by ChrisR (Hermit) on Jul 26, 2005 at 20:38 UTC
|
There are many ways to do it, but here is a pretty simple and self-explanatory one:
use strict;
use warnings;
use Date::Calc qw(check_date);
my $date = "9999/12/11";
my($year,$month,$day) = $date =~ /(\d+)\/(\d+)\/(\d+)/;
print "$year/$month/$day is ";
if(check_date($year,$month,$day) && $year >=1753 && $year <= 9999)
{
    print "valid\n";
}
else
{
    print "not valid\n";
}
I don't think you are going to be able to do a complete validation using just a regex.
Chris | [reply] [d/l] |
Re: regular expression help
by AReed (Pilgrim) on Jul 26, 2005 at 20:16 UTC
|
Unless using regexes is a requirement, I wouldn't use them at all for this purpose. I'd use "split" to separate the date string into its component parts and then validate each component separately. That has the added benefit of being easier to understand when you take a look at this code again a month from now.
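A minimal sketch of that split-and-validate idea. The helper name is_valid_date, the accepted separators (-, / and .) and the 1753 to 9999 year range are assumptions borrowed from the rest of this thread; the final day check is handed to the core module Time::Local rather than written by hand.

```perl
use strict;
use warnings;
use Time::Local qw( timegm );

# Hypothetical helper: returns 1 if the string is a valid date, 0 otherwise.
sub is_valid_date {
    my ($string) = @_;

    # Limit -1 keeps trailing empty fields, so "2004/02/" yields three parts.
    my @parts = split m{[-/.]}, $string, -1;
    return 0 unless @parts == 3;

    # Each component must consist of digits only.
    return 0 if grep { !/\A[0-9]+\z/ } @parts;

    my ( $year, $month, $day ) = @parts;
    return 0 unless $year >= 1753 && $year <= 9999;
    return 0 unless $month >= 1 && $month <= 12;

    # timegm dies if the day does not exist in that month (0-based month).
    return eval { timegm( 0, 0, 0, $day, $month - 1, $year ); 1 } ? 1 : 0;
}

for my $date (qw( 2004/02/29 1900-02-29 1752.08.12 )) {
    print "$date is ", ( is_valid_date($date) ? "valid" : "not valid" ), "\n";
}
```

Each check is one line with one job, which is the readability benefit mentioned above.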
| [reply] |
Re: regular expression help
by mikeraz (Friar) on Jul 26, 2005 at 21:46 UTC
|
I'm going to echo "don't". To start with, as others have pointed out, an RE is the wrong tool for this job, and modules such as Date::Calc are there to do it for you. But then there's the issue of your "simple" regular expression being broken. I'm surprised no one pointed it out...
((17((5[3-9])|([6-9]\d)))|((18|19)\d\d)|([2-9]\d\d\d))[-/.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])
The unescaped / inside [-/.] terminates the RE if you use it in /((17...)/
but even if you encapsulate it in a variable there are still problems:
#!/usr/bin/perl
use strict;
use warnings;
my @sampdata = qw(
    894/7/14
    1752/8/12
    1753/12/24
    1957/8/30
    3998/4/22
    9999/3/15
    10000/1/1
);
# qr{} keeps \d intact (a double-quoted string would reduce it to a plain d)
# and allows the unescaped / inside the pattern.
my $re = qr{((17((5[3-9])|([6-9]\d)))|((18|19)\d\d)|([2-9]\d\d\d))[-/.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])};
for (@sampdata) {
    print;
    print( /$re/ ? " is " : " is not " );
    print "in range 1753 to 9999\n";
}
__END__
894/7/14 is not in range 1753 to 9999
1752/8/12 is not in range 1753 to 9999
1753/12/24 is in range 1753 to 9999
1957/8/30 is not in range 1753 to 9999
3998/4/22 is not in range 1753 to 9999
9999/3/15 is not in range 1753 to 9999
10000/1/1 is not in range 1753 to 9999
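The dates after 1753/12/24 are rejected because their months and days are not zero-padded: the pattern insists on two digits for each (0[1-9]|1[012] and 0[1-9]|[12][0-9]|3[01]), so 1957/8/30 can never match, whatever the year part does. Also beware of storing the pattern in a double-quoted string, where \d loses its backslash and matches a literal d; qr{...} avoids that. The year alternation
((17((5[3-9])|([6-9]\d)))
already covers 1753 to 1799 as written.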
Re: the two instances of [-/.], was the second one supposed to include a space?
What seems simple today will be a headache to verify as correct in the future when you're trying to find a real bug.
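For what it's worth, here is a sketch of how such a syntax-only pattern could be laid out so it is easier to verify later. It assumes zero-padded input, anchors the match at both ends, and uses the same separator class in both places; the names $re and in_range are only for illustration. It checks format and year range, not whether the day exists in the month.

```perl
use strict;
use warnings;

# qr{...}x sidesteps the clash between the / delimiter and the literal /
# in the pattern; \A and \z reject extra characters such as a fifth digit.
my $re = qr{
    \A
    (?: 17(?:5[3-9]|[6-9]\d) | 1[89]\d\d | [2-9]\d\d\d )   # year 1753 to 9999
    [-/.]
    (?: 0[1-9] | 1[012] )                                  # month 01 to 12
    [-/.]
    (?: 0[1-9] | [12]\d | 3[01] )                          # day 01 to 31
    \z
}x;

# Hypothetical helper: 1 if the string has the expected shape, else 0.
sub in_range {
    my ($date) = @_;
    return $date =~ $re ? 1 : 0;
}

for my $date (qw( 0894/07/14 1752/08/12 1753/12/24 1957/08/30 9999/03/15 10000/01/01 )) {
    print $date, ( in_range($date) ? " is " : " is not " ), "in range 1753 to 9999\n";
}
```

A string like 2001/02/31 still passes this check, which is exactly why the semantic validation belongs in a module.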
Be Appropriate && Follow Your Curiosity
| [reply] [d/l] |
Re: regular expression help
by puploki (Hermit) on Jul 26, 2005 at 19:41 UTC
|
I was originally going to reply with "oh, there's this fabulous module that contains regexps for all sorts of common stuff called Regexp::Common", but it doesn't do dates - yet!
There's another good resource called the regular expressions library - try this listing for some examples. | [reply] |
Re: regular expression help
by Adam (Vicar) on Jul 27, 2005 at 14:01 UTC
|
You should listen to the other monks who directed you to modules. But I wanted to take a stab at doing it in a regex. This code seems to work:
#!perl -w
use strict;
for my $y ( 1753 .. 9999 )
{
    for my $m ( 1 .. 12 )
    {
        for my $d ( 1 .. 31 )
        {
            my $date = sprintf "%04d/%02d/%02d", $y, $m, $d;
            if ( $date !~
                m/^
                ########################
                # Year
                ([2-9]\d{3}|1[89]\d\d|17[6-9]\d|175[3-9])
                \/ #####################
                # Month
                (0[1-9]|1[0-2])
                \/ #####################
                # Day
                (
                    0[1-9]|1\d|2[0-8]| # 01 - 28
                    (?<=(?:0[13578]|10|12)\/)(?:29|3[01])| # to 31
                    (?<=(?:0[469]|11)\/)(?:29|30)| # to 30
                    (?<=(?:
                        (?:2[048]|3[26]|4[048]|5[26]|6[048]|7[26]|8[048]|9[26])00|
                        \d\d(?:0[48]|1[26]|2[048]|3[26]|4[048]|5[26]|6[048]|7[26]|8[048]|9[26])
                    )\/02\/)(?:29) # Feb 29 in a leap year, century or not
                )
                ########################
                $/x
            )
            {
                print "$date is invalid\n";
            }
            # Else $1 == year, $2 == month, $3 == day
        }
    }
}
Of course, different countries switched to the Gregorian calendar at different dates, so you really need a module to get it right. My favorite tome on the topic is "Calendrical Calculations" by Edward M Reingold and Nachum Dershowitz.
Update: I realized that I made a mistake listing leap-years. I've now fixed that, but it further demonstrates why a module is better. | [reply] [d/l] |
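One way to catch that kind of mistake is to cross-check the regex against a module over a sample of years. The following is only a sketch: the pattern is a compact variant written for this example (not a copy of the one above), Time::Local serves as the reference, and the sampled year ranges around 1900, 2000 and 2400 are an arbitrary choice. The names $re, module_says_valid and $mismatches are made up for illustration.

```perl
use strict;
use warnings;
use Time::Local qw( timegm );

# Compact date pattern for yyyy/mm/dd, 1753 to 9999 (written for this sketch).
my $re = qr{
    \A
    (?: [2-9]\d\d\d | 1[89]\d\d | 17[6-9]\d | 175[3-9] )
    /
    (?: 0[1-9] | 1[0-2] )
    /
    (?:
        0[1-9] | 1\d | 2[0-8]
      | (?<= (?:0[13578]|10|12) / ) (?:29|3[01])
      | (?<= (?:0[469]|11) / ) (?:29|30)
      | (?<= (?: (?:[2468][048]|[3579][26])00
               | \d\d(?:0[48]|[2468][048]|[13579][26]) ) /02/ ) 29
    )
    \z
}x;

# Reference answer: timegm dies on a non-existent date (0-based month).
sub module_says_valid {
    my ( $y, $m, $d ) = @_;
    return eval { timegm( 0, 0, 0, $d, $m - 1, $y ); 1 } ? 1 : 0;
}

my $mismatches = 0;
for my $y ( 1896 .. 1904, 1996 .. 2004, 2396 .. 2404 ) {
    for my $m ( 1 .. 12 ) {
        for my $d ( 1 .. 31 ) {
            my $date = sprintf "%04d/%02d/%02d", $y, $m, $d;
            my $regex_ok = $date =~ $re ? 1 : 0;
            next if $regex_ok == module_says_valid( $y, $m, $d );
            $mismatches++;
            print "$date: regex and Time::Local disagree\n";
        }
    }
}
print "$mismatches mismatches\n";
```

Any line of output before the final count points at a date where the pattern and the module disagree.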