Dear Monks, after studying some date parsers I was not very impressed. They were complex and often didn't accept very many different input formats (fractions of a second, for example, were often not allowed!).
Anyway, I'm not a very good programmer (beginner), so I would like some comments on the piece of code I wrote (some monks already helped me with the regexp!). Maybe I re-invented the wheel, but dealing with dates was a good experience.
This module accepts a few differently formatted input dates, like:
2005-10-21 12:09:22.099
2005-10-21 12:09:22
2005-10-21
2005-240 12:09:22.099
2005240 12:09:22
1141128000 # epoch
etc
Here is the module TimeFormat.pm that I wrote:
#! /usr/bin/perl
package TimeFormat ;
use strict ;
use warnings ;
require Exporter ;
our ($VERSION, @ISA, @EXPORT) ;
@ISA = qw(Exporter);
@EXPORT = qw( sec2date date2sec ) ;
$VERSION = 1.0 ;
our @month_day = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31) ;
# This function accepts epoch-seconds and returns a formatted date.
sub sec2date {
my ( $self, $time, $format) = @_ ;
if ( ref ( $self ) !~ /TimeFormat/ ) { # called as a plain function
$format = $time ;
$time = $self ;
} else { # called as a method: fall back to the values stored in the object
$time = $self->{seconds} if ( ! defined $time ) ;
$format = $self->{format} if ( ! defined $format ) ;
}
$format = "%Y-%m-%d %H:%M:%S" if ( ! defined $format ) ; # no format defined, use default
my $mtime = $time ;
my $year = 1970 ; # start year
my ($month, $day, $julian) ;
my $hour = "00" ;
my $min = "00" ;
my $sec = "00" ;
my $msec = "0" ;
if ( $time =~ /\./ ) { # compute fraction of seconds
($time, $msec) = ($time =~ /(\d+)\.(\d+)/) ;
}
my $lyear = 31622400 ; # 366 days (leap year)
my $nyear = 31536000 ; # 365 days
my $dyear ; # dummy var
my %month_day ;
# calculate year
while ( $time >= 0 ) {
&_checkFebruary($year) ; # fix @month_day array
$dyear = (&_checkLeapYear($year) == 28 ? $nyear : $lyear) ;
if ( $time - $dyear < 0 ) {
last ;
} else {
$time -= $dyear ;
$year ++ ;
}
}
# calculate month/day and julian day
$julian = int $time / 86400 + 1 ; # start = 001 (not 000)
$time -= ($julian - 1) * 86400 ;
$julian = "0" x (3 - length($julian)).$julian ;
($month, $day) = _getMonthDay($year, $julian) ;
# calculate hour
$hour = int $time / 3600 ;
$time -= $hour * 3600 ;
$hour = "0" x (2 - length($hour)).$hour ;
# calculate min
$min = int $time / 60 ;
$time -= $min * 60 ;
$min = "0" x (2 - length($min)).$min ;
# calculate seconds and fraction of seconds
$sec = "0" x (2 - length($time)).$time ;
$msec = "0" x (3 - length($msec)).$msec ;
# format output
($format = $format ) =~ s/\%Y/$year/g ; # xxxx
($format = $format ) =~ s/\%m/$month/g ; # 1-12
($format = $format ) =~ s/\%d/$day/g ; # 01-31
($format = $format ) =~ s/\%j/$julian/g ;# 001-366
($format = $format ) =~ s/\%H/$hour/g ; # 00-23
($format = $format ) =~ s/\%M/$min/g ; # 00-59
($format = $format ) =~ s/\%S/$sec/g ; # 00-59
($format = $format ) =~ s/\%s/$msec/g ; # 000-999
($format = $format ) =~ s/\%E/$mtime/g ; # xxxxxxxxxxx.xxx
return $format ;
}
# Accepted input dates
# 2005-03-28 12:00:00
# 2005-03-28
# 2005-102 12:00:00
# 2005-102
# 2005102 12:00:00
# 2005102
sub date2sec {
my ($self, $time) = @_ ; # object
if ( ref ($self) !~ /TimeFormat/ ) {
$time = $self ;
}
$time = $self->{date} if ! defined $time ;
return 0 if ! defined $time ;
my ( $year,$month, $day, $julian, $hour, $minute, $second ) ;
# split date
if ( $time =~ /^\s*\d{4}-?\d{3}(?:\s|$)/ ) { # year + day of year; anchored so a plain epoch value is not mistaken for one
( $year, $julian, $hour, $minute, $second ) = ( $time =~ m/
(\d{4}) # year
(?: # group the day - time portions
-? # optional hyphen
(\d{1,3}) # julian day
(?: # group the time portions
\s+ # one or more whitespace
(\d\d) # hour
:
(\d\d) # minute
:
(\d\d.*) # second + fractions
)? # time is optional
)? # day - time is optional
/xg );
return 0 unless _checkDateFormat( $year, $julian );
($month, $day) = _getMonthDay($year,$julian) ;
} elsif ( $time =~ /\d{4}-\d{2}-\d{2}\s{0,}?(\d\d:\d\d:\d\d)?/ ) {
( $year, $month, $day, $hour, $minute, $second ) = ( $time =~ m/
(\d{4}) # year
- #
(\d{2}) # month
- #
(\d{2}) # day
(?: # group the time portions
\s+ # one or more whitespace
(\d\d) # hour
:
(\d\d) # minute
:
(\d\d.*) # second + fractions
)? # time is optional
/xg );
return 0 unless _checkDateFormat( $year,$month,$day );
$julian = _getJulian($year, $month, $day) ;
} else {
return $time =~ /^\d+(?:\.\d+)?$/ ? $time : 0 ; # epoch seconds pass through unchanged; 0 = failed to parse date
}
# convert data2seconds, should be good until the year 2100
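# a date without a time portion means midnight
$hour = 0 if ! defined $hour ;
$minute = 0 if ! defined $minute ;
$second = 0 if ! defined $second ;
my $depoch = 0 ;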
$depoch += (($year - 1970) * 31536000);
$depoch += (int (($year - 1969) / 4)) * 86400; # correct for leap years
$depoch += (($julian - 1) * 86400);
$depoch += ($hour * 3600);
$depoch += ($minute * 60);
$depoch += $second;
return $depoch ;
}
sub setDate {
my $self = shift ;
$self->{date} = shift ;
return 1 ; # OK
}
# predefine the output format (only works when using OOPerl)
sub format {
my $self = shift ;
$self->{format} = shift ;
if ( defined $self->{seconds} ) {
$self->sec2date() ;
}
return 1 ; # OK
}
sub _getJulian {
my ($year, $month, $day) = @_ ;
&_checkFebruary($year) ;
my $julian = 0 ;
for( my $i = 0; $i < $month -1; $i++) {
$julian += $month_day[$i] ; # add the days of every month before $month
}
$julian += $day ;
return $julian ;
}
sub _getMonthDay {
my ($year, $julian) = @_ ;
my $month ;
_checkFebruary($year) ;
for(my $i = 0; $i < 12; $i++) {
if ( $month_day[$i] >= $julian ) {
$month = $i + 1 ;
last ;
}
$julian -= $month_day[$i] ;
$month = $i + 1;
}
return ( sprintf("%02d", $month), sprintf("%02d", $julian) ) ; # zero-padded month and day of month
}
# Set the length of February in @month_day for the given year.
sub _checkFebruary {
$month_day[1] = _checkLeapYear(shift) ;
}
# This function checks whether the given values form a correct date.
sub _checkDateFormat {
my ( $year,$month,$day ) = @_;
&_checkFebruary($year) ;
if ( defined $day ) { # input format is %Y-%m-%d
return 0 if $month < 1 || $month > 12 || $day < 1 ; # reject out-of-range values
return $day <= $month_day[ $month - 1 ] ? 1 : 0 ;
} else {
my $jday = $month ; # month represents julian day
return ( $jday >= 1 && $jday <= ( _checkLeapYear($year) == 28 ? 365 : 366 ) ) ? 1 : 0 ;
}
}
sub _checkLeapYear {
my ( $self, $year ) = @_ ;
$year = $self if ( ! defined $year ) ;
return 29 unless $year % 400;
return 28 unless $year % 100;
return 29 unless $year % 4;
return 28;
}
# constructor
sub new {
my ( $class, $time, $format ) = @_ ;
$time = date2sec($time) if defined $time ; # convert input to epoch-seconds
return bless { seconds => $time, format => $format }, $class ; # bless into $class so subclassing works
}
1 ; # a module should end with a true value
__END__
Thanks a lot Luca
Re: Wrote my own date parser
by blokhead (Monsignor) on Feb 11, 2006 at 20:04 UTC
My first impression is that sec2date is nothing more than POSIX::strftime, but with nonstandard string escapes (like %E and your definitions of %S and %s). Because of this, your code is going to be very confusing for anyone familiar with POSIX::strftime.
If all you wanted to do was get strftime-like behavior while also supporting milliseconds, it would be much easier to simply write a wrapper around strftime, something like this (untested):
use POSIX 'strftime';
## similar to strftime but with milliseconds via "%q"
sub sec2date {
my ($fmt, $sec) = @_;
## fractional part of $sec, rounded to 3 digits
my $milli = sprintf "%03.0f", 1000 * ($sec - int $sec);
## %q becomes our milliseconds
$fmt =~ s/%q/$milli/g;
strftime $fmt, localtime(int $sec);
}
The same could be said for your date2sec code. Epoch calculation is not something I think you really want to do. I don't mind you writing your own regexes to extract the appropriate parts from your format, but I would be really suspicious of anyone's hand-rolled epoch second converter. Especially when you have a comment like this:
# convert data2seconds, should be good until the year 2100
"Should be good" ?? Yikes!
Instead, I would use Time::Local to do the epoch-seconds conversion part. In your regex, extract the fractional seconds to another variable, send everything to timelocal, and then add on the fractional seconds at the end. Here is how you can use timelocal, even only having year+Julian. You see, it handles all the leap year calculation for you.
use Time::Local 'timelocal_nocheck';
## given year+julian date,
my $epoch = timelocal_nocheck $sec, $min, $hour, $julian, 0, $year-1900;
## or given y/m/d (could use normal "timelocal" here)
my $epoch = timelocal_nocheck $sec, $min, $hour, $day, $mon-1, $year-1900;
## add milliseconds on at the end:
$epoch .= ".$milli";
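To make the leap-year point concrete, here is a minimal sketch that compares the year arithmetic from the original date2sec with Time::Local. The helper name handrolled_epoch is hypothetical, it reproduces only the year part of the formula (midnight on 1 January), and timegm is used instead of timelocal so that the result does not depend on the local time zone.

```perl
use strict;
use warnings;
use Time::Local 'timegm';

# Hypothetical helper: the year arithmetic from the original date2sec,
# evaluated for 00:00:00 on 1 January of $year.
sub handrolled_epoch {
    my ($year) = @_;
    my $epoch = ($year - 1970) * 31536000;        # count every year as 365 days
    $epoch += int(($year - 1969) / 4) * 86400;    # add one day per 4 years
    return $epoch;
}

for my $year (2000, 2100, 2101) {
    # arguments: sec, min, hour, day of month, month (0-based), year
    my $reference = timegm(0, 0, 0, 1, 0, $year);
    my $diff      = handrolled_epoch($year) - $reference;
    print "$year: off by $diff seconds\n";
}
```

The two agree for 2000 and 2100, but for 2101 the hand-rolled value is 86400 seconds too large, because the "one day per 4 years" rule counts 2100 as a leap year.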
Those are my meta-comments about what you're trying to accomplish. As for your code specifically, there are a few things going on that seemed strange to me.
You have this kind of construct several times:
$sec = "0" x (2 - length($time)).$time ;
$msec = "0" x (3 - length($msec)).$msec ;
This is written a little more clearly as:
$sec = sprintf "%02d", $time;
$msec = sprintf "%03d", $msec;
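The two forms are also not equivalent when the value is already wider than the target width: the repetition count goes negative, nothing is added or trimmed, and a leading zero survives. A small sketch of the difference (the variable names are illustrative only):

```perl
use strict;
use warnings;

my $value = "021";   # e.g. a day number that was already padded to 3 characters

# The count is 2 - 3 = -1, so "0" x -1 is the empty string and $value is
# passed through unchanged (recent perls also warn about the negative count).
my $repeated = "0" x (2 - length($value)) . $value;

# sprintf converts the string to the number 21 first, then pads to 2 digits.
my $formatted = sprintf "%02d", $value;

print "$repeated $formatted\n";   # prints "021 21"
```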
You have a zillion lines like this:
($format = $format ) =~ s/\%Y/$year/g ; # xxxx
($format = $format ) =~ s/\%m/$month/g ; # 1-12
($format = $format ) =~ s/\%d/$day/g ; # 01-31
Why $format = $format ? Looks like you saw a snippet somewhere that said ($new = $old) =~ s///, but that's completely redundant here as you aren't saving the old value. Better would be:
for ($format) {
s/%Y/$year/g;
s/%m/$month/g;
...
}
But even better yet would be to put all your date information in a hash, and you can replace all those s/// commands with just one:
my %data = (
Y => $year,
m => $month,
d => $day,
...
);
$format =~ s/\%([A-Za-z])/ exists $data{$1} ? $data{$1} : "%$1" /ge;
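For completeness, a self-contained sketch of the hash approach. The helper name expand_format and the sample values are made up for illustration; the keys follow the escapes used in the original module.

```perl
use strict;
use warnings;

# Hypothetical helper: expand %X escapes in $format from a lookup table.
sub expand_format {
    my ($format, %data) = @_;
    # Unknown escapes are left untouched rather than silently dropped.
    $format =~ s/%([A-Za-z])/ exists $data{$1} ? $data{$1} : "%$1" /ge;
    return $format;
}

my %data = (
    Y => '2005',
    m => '10',
    d => '21',
    H => '12',
    M => '09',
    S => '22',
    s => '099',   # milliseconds, as in the original module
);

print expand_format('%Y-%m-%d %H:%M:%S.%s %Z', %data), "\n";
# prints "2005-10-21 12:09:22.099 %Z"
```

Because the string is scanned only once, a substituted value that happens to contain a percent sign is not expanded a second time, which can happen with a chain of separate s/// commands.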
Update: fixed sprintf format in first code snippet
Re: Wrote my own date parser
by qq (Hermit) on Feb 12, 2006 at 12:45 UTC
We've all done bits of that, and it's not bad for practice. But you'll be better off in the long run using DateTime. Read up on leap seconds in the docs, and ask yourself whether you really want to deal with that (and all the other stuff: daylight saving time, leap years, time zones) or whether you'd rather let someone else do it.
Perl Advent Calendar on DateTime. perl.com article. DateTime home.
Re: Wrote my own date parser
by deprecated (Priest) on Feb 12, 2006 at 17:00 UTC
I think that qq is correct in that we've all done a lot of that (certainly look a few years ago and I've got some examples of that), and that it is helpful for learning.
A couple things bug me about what you've suggested however. First, you say that you're a beginner programmer. Yet, you also say that you weren't particularly impressed with the available date modules. My guess is this stems from some unfamiliarity with their methodology. Date::Calc is pretty unintuitive, but it certainly gets the job done. Additionally, it's pretty fast. Going pure-perl, and going it alone, is almost certainly going to be slower and harder to work with. Date::Calc is also very mature, having gone through many spirals and thousands of use scenarios.
Anyways, I wanted to suggest that you could take another look at Date::Calc. You'll probably find that it's easier to work with other people's code when you're more familiar with it. The other thing I wanted to suggest is if you work with databases, they generally have very helpful date functions, with casting into all sorts of different types. They're also generally a lot faster than perl (I do try to offload everything I can onto the database, rather than let the host machine spin on perl).
* http://www.postgresql.org/docs/8.1/interactive/functions-datetime.html
Toodles,
dep
Re: Wrote my own date parser
by jeanluca (Deacon) on Feb 13, 2006 at 13:18 UTC
Sorry, ignore my previous message, the time conversions work OK. I will investigate the suggested modules to see how I will change my module. When I'm done I'll post the result!
Thanks a lot Luca
Re: Wrote my own date parser
by jeanluca (Deacon) on Feb 13, 2006 at 12:08 UTC
Thanks a lot! But I noticed one problem: when I convert times, the suggested modules convert them to GMT, so the hour changes (offset +1). I can compute the offset like this:
use DateTime;
my $now = DateTime->now( time_zone => 'local' );
print $now->time_zone->offset_for_local_datetime( $now );
but I assume there is a way to set the offset to zero ? For what I need to do, the input-time will always be GMT time, no matter what the localtime of the system is!
An other thing I assume you always have todo is to
Luca
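For input that is always GMT, one option that needs only core modules is to pair Time::Local's timegm functions with gmtime, so the machine's local time zone never enters the calculation. The sketch below is one possible shape, not a drop-in replacement for the module: the helper names gmt_to_epoch and epoch_to_gmt are hypothetical, and only the "YYYY-MM-DD" and "YYYY-DDD"/"YYYYDDD" layouts are handled.

```perl
use strict;
use warnings;
use POSIX 'strftime';
use Time::Local 'timegm_nocheck';

# Hypothetical helper: parse "YYYY-MM-DD" or "YYYY-DDD"/"YYYYDDD", each with
# an optional " HH:MM:SS[.fff]", always interpreting the fields as GMT.
# Returns epoch seconds (with any fraction appended), or nothing on failure.
sub gmt_to_epoch {
    my ($text) = @_;
    my ($year, $mon, $day, $clock);

    if ( $text =~ /^(\d{4})-(\d\d)-(\d\d)(?:\s+(\S+))?$/ ) {
        ($year, $mon, $day, $clock) = ($1, $2 - 1, $3, $4);
    }
    elsif ( $text =~ /^(\d{4})-?(\d{3})(?:\s+(\S+))?$/ ) {
        # month 0 plus a day of the year lets timegm_nocheck do the calendar work
        ($year, $mon, $day, $clock) = ($1, 0, $2, $3);
    }
    else {
        return;
    }

    my ($hour, $min, $sec, $frac) = (0, 0, 0, '');
    if ( defined $clock ) {
        $clock =~ /^(\d\d):(\d\d):(\d\d)(\.\d+)?$/ or return;
        ($hour, $min, $sec, $frac) = ($1, $2, $3, defined $4 ? $4 : '');
    }

    return timegm_nocheck($sec, $min, $hour, $day, $mon, $year) . $frac;
}

# Hypothetical helper: format epoch seconds as GMT, whatever the local zone is.
sub epoch_to_gmt {
    my ($epoch, $format) = @_;
    $format = '%Y-%m-%d %H:%M:%S' unless defined $format;
    return strftime($format, gmtime(int $epoch));
}

my $epoch = gmt_to_epoch('2006-02-28 12:00:00');
print "$epoch\n";                  # 1141128000
print epoch_to_gmt($epoch), "\n";  # 2006-02-28 12:00:00
```

With DateTime, the comparable choice would be to construct the objects with time_zone => 'UTC' rather than 'local'.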