PerlMonks  

Wrote my own date parser

by jeanluca (Deacon)
on Feb 11, 2006 at 19:29 UTC ( [id://529590]=perlmeditation )

Dear Monks,
After studying some date parsers I was not very impressed. They were complex and often didn't accept many different input formats (fractions of a second, for example, were often not allowed!)
Anyway, I'm not a very good programmer (beginner), so I would like to have some comments on the piece of code I wrote (some monks already helped me with the regexp!). Maybe I re-invented the wheel, but it was a good experience to deal with dates.
This module accepts input dates in a few different formats, like:
2005-10-21 12:09:22.099
2005-10-21 12:09:22
2005-10-21
2005-240 12:09:22.099
2005240 12:09:22
1141128000 # epoch
etc.
Here is the module TimeFormat.pm that I wrote:

#! /usr/bin/perl
package TimeFormat ;

use strict ;
use warnings ;

require Exporter ;
our ($VERSION, $ISA, @EXPORT) ;
our @ISA = qw(Exporter);
@EXPORT = qw( sec2date date2sec ) ;
$VERSION = 1.0 ;

our @month_day = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31) ;

# This function accepts epoch-seconds and returns a formatted date.
sub sec2date {
    my ( $self, $time, $format) = @_ ;

    if ( ref ( $self ) !~ /TimeFormat/ ) {
        $format = $time ;
        $time = $self ;
    } else {
        $format = $self->{format} if ( ! defined $format ) ;
    }
    $format = "%Y-%m-%d %H:%M:%S" if ( @_ == 1 && ! defined $format ) ; # no format defined, use default

    my $mtime = $time ;
    my $year = 1970 ; # start year
    my ($month, $day, $julian) ;
    my $hour = "00" ;
    my $min = "00" ;
    my $sec = "00" ;
    my $msec = "0" ;

    if ( $time =~ /\./ ) { # compute fraction of seconds
        ($time, $msec) = ($time =~ /(\d+)\.(\d+)/) ;
    }
    my $lyear = 31622400 ; # 366 days (leap year)
    my $nyear = 31536000 ; # 365 days
    my $dyear ; # dummy var
    my %month_day ;

    # calculate year
    while ( $time >= 0 ) {
        &_checkFebruary($year) ; # fix @month_day array
        $dyear = (&_checkLeapYear($year) == 28 ? $nyear : $lyear) ;
        if ( $time - $dyear < 0 ) {
            last ;
        } else {
            $time -= $dyear ;
            $year ++ ;
        }
    }

    # calculate month/day and julian day
    $julian = int $time / 86400 + 1 ; # start = 001 (not 000)
    $time -= ($julian - 1) * 86400 ;
    $julian = "0" x (3 - length($julian)).$julian ;
    ($month, $day) = _getMonthDay($year, $julian) ;

    # calculate hour
    $hour = int $time / 3600 ;
    $time -= $hour * 3600 ;
    $hour = "0" x (2 - length($hour)).$hour ;

    # calculate min
    $min = int $time / 60 ;
    $time -= $min * 60 ;
    $min = "0" x (2 - length($min)).$min ;

    # calculate seconds and fraction of seconds
    $sec = "0" x (2 - length($time)).$time ;
    $msec = "0" x (3 - length($msec)).$msec ;

    # format output
    ($format = $format ) =~ s/\%Y/$year/g ;   # xxxx
    ($format = $format ) =~ s/\%m/$month/g ;  # 1-12
    ($format = $format ) =~ s/\%d/$day/g ;    # 01-31
    ($format = $format ) =~ s/\%j/$julian/g ; # 001-366
    ($format = $format ) =~ s/\%H/$hour/g ;   # 00-23
    ($format = $format ) =~ s/\%M/$min/g ;    # 00-59
    ($format = $format ) =~ s/\%S/$sec/g ;    # 00-59
    ($format = $format ) =~ s/\%s/$msec/g ;   # 000-999
    ($format = $format ) =~ s/\%E/$mtime/g ;  # xxxxxxxxxxx.xxx

    return $format ;
}

# Accepted input dates
# 2005-03-28 12:00:00
# 2005-03-28
# 2005-102 12:00:00
# 2005-102
# 2005102 12:00:00
# 2005102
sub date2sec {
    my ($self, $time) = @_ ;

    # object
    if ( ref ($self) !~ /TimeFormat/ ) {
        $time = $self ;
    }
    $time = $self->{date} if ! defined $time ;
    return 0 if ! defined $time ;

    my ( $year,$month, $day, $julian, $hour, $minute, $second ) ;

    # split date
    if ( $time =~ /\d{4}-?(\d{3})\s{0,}?(\d\d:\d\d:\d\d)?/ ) {
        ( $year, $julian, $hour, $minute, $second ) = ( $time =~
            m/
                (\d{4})          # year
                (?:              # group the day - time portions
                    -?           # optional hyphen
                    (\d{1,3})    # julian day
                    (?:          # group the time portions
                        \s+      # one or more whitespace
                        (\d\d)   # hour
                        :
                        (\d\d)   # minute
                        :
                        (\d\d.*) # second + fractions
                    )?           # time is optional
                )?               # day - time is optional
            /xg );
        return 0 unless _checkDateFormat( $year, $julian );
        ($month, $day) = _getMonthDay($year,$julian) ;
    } elsif ( $time =~ /\d{4}-\d{2}-\d{2}\s{0,}?(\d\d:\d\d:\d\d)?/ ) {
        ( $year, $month, $day, $hour, $minute, $second ) = ( $time =~
            m/
                (\d{4})      # year
                -            #
                (\d{2})      # month
                -            #
                (\d{2})      # day
                (?:          # group the time portions
                    \s+      # one or more whitespace
                    (\d\d)   # hour
                    :
                    (\d\d)   # minute
                    :
                    (\d\d.*) # second + fractions
                )?           # time is optional
            /xg );
        return 0 unless _checkDateFormat( $year,$month,$day );
        $julian = _getJulian($year, $month, $day) ;
    } else {
        return $time =~ /\d+/ || $time =~ /\d+\.\d+/ ? $time : 0 ; # 0 = failed to parse data
    }

    # convert data2seconds, should be good until the year 2100
    my $depoch = 0 ;
    $depoch += (($year - 1970) * 31536000);
    $depoch += (int (($year - 1969) / 4)) * 86400; # correct for leap years
    $depoch += (($julian - 1) * 86400);
    $depoch += ($hour * 3600);
    $depoch += ($minute * 60);
    $depoch += $second;

    return $depoch ;
}

sub setDate {
    my $self = shift ;
    $self->{date} = shift ;
    return 1 ; # OK
}

# predefine the output format (only works when using OOPerl)
sub format {
    my $self = shift ;
    $self->{format} = shift ;
    if ( defined $self->{seconds} ) {
        $self->sec2date() ;
    }
    return 1 ; # OK
}

sub _getJulian {
    my ($year, $month, $day) = @_ ;

    &_checkFebruary($year) ;
    my $julian = 0 ;
    my $key ;
    for( my $i = 0; $i < $month -1; $i++) {
        $key = "0" x(2 - length($i)).$i ;
        $julian += $month_day[$key] ;
    }
    $julian += $day ;
    return $julian ;
}

sub _getMonthDay {
    my ($year, $julian) = @_ ;
    my $month ;

    _checkFebruary($year) ;
    for(my $i = 0; $i < 12; $i++) {
        if ( $month_day[$i] >= $julian ) {
            $month = $i + 1 ;
            last ;
        }
        $julian -= $month_day[$i] ;
        $month = $i + 1;
    }
    return ("0" x (2 - length($month)).$month,"0" x (2 - length($julian) ).$julian ) ;
}

# This function check if the given string is a correct date.
sub _checkFebruary {
    $month_day[1] = _checkLeapYear(shift) ;
}

sub _checkDateFormat {
    my ( $year,$month,$day ) = @_;

    &_checkFebruary($year) ;
    if ( defined $day ) { # input format is %Y-%m-%d
        $month = '0'. $month if length( $month ) == 1;
        return $day <= $month_day[ $month - 1 ] ? 1 : 0 ;
    } else {
        my $jday = $month ; # month represents julian day
        return _checkLeapYear($year) == 28 ? $jday <= 365 ? 1 : 0 : $jday <= 366 ? 1 : 0 ;
    }
}

sub _checkLeapYear {
    my ( $self, $year ) = @_ ;
    $year = $self if ( ! defined $year ) ;

    return 29 unless $year % 400;
    return 28 unless $year % 100;
    return 29 unless $year % 4;
    return 28;
}

# constructor
sub new {
    my ( $class, $time, $format ) = @_ ;

    $time = date2sec($time) if defined $time ; # convert input to epoch-seconds
    return bless { seconds => $time, format => $format }, __PACKAGE__ ;
}
__END__

Thanks a lot
Luca

Replies are listed 'Best First'.
Re: Wrote my own date parser
by blokhead (Monsignor) on Feb 11, 2006 at 20:04 UTC
    My first impression is that sec2date is nothing more than POSIX::strftime, but with nonstandard string escapes (like %E and your definitions of %S and %s). Because of this, your code is going to be very confusing for anyone familiar with POSIX::strftime.

    If all you wanted to do was get strftime-like behavior while also supporting milliseconds, it would be much easier to simply write a wrapper around strftime, something like this (untested):

    use POSIX 'strftime';

    ## similar to strftime but with milliseconds via "%q"
    sub sec2date {
        my ($fmt, $sec) = @_;

        ## fractional part of $sec, rounded to 3 digits
        my $milli = sprintf "%03.0f", 1000 * ($sec - int $sec);

        ## %q becomes our milliseconds
        $fmt =~ s/%q/$milli/g;

        strftime $fmt, localtime(int $sec);
    }
    The same could be said for your date2sec code. Epoch calculation is not something I think you really want to do. I don't mind you writing your own regexes to extract the appropriate parts from your format, but I would be really suspicious of anyone's hand-rolled epoch second converter. Especially when you have a comment like this:
    # convert data2seconds, should be good until the year 2100
    "Should be good" ?? Yikes! Instead, I would use Time::Local to do the epoch-seconds conversion part. In your regex, extract the fractional seconds to another variable, send everything to timelocal, and then add on the fractional seconds at the end. Here is how you can use timelocal, even only having year+Julian. You see, it handles all the leap year calculation for you.
    use Time::Local 'timelocal_nocheck';

    ## given year+julian date,
    my $epoch = timelocal_nocheck $sec, $min, $hour, $julian, 0, $year-1900;

    ## or given y/m/d (could use normal "timelocal" here)
    my $epoch = timelocal_nocheck $sec, $min, $hour, $day, $mon-1, $year-1900;

    ## add milliseconds on at the end:
    $epoch .= ".$milli";
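The advice above (regex for the pieces, Time::Local for the arithmetic, fraction added at the end) could be put together roughly as follows. This is only a sketch: `parse_to_epoch` is a made-up name, not part of TimeFormat.pm, it assumes the input is GMT (hence `timegm_nocheck` rather than `timelocal_nocheck`), and it passes the four-digit year, which Time::Local takes as the actual year.

```perl
use strict;
use warnings;
use Time::Local qw(timegm_nocheck);

# Hypothetical helper (not part of TimeFormat.pm): accepts "YYYY-MM-DD",
# "YYYY-DDD" or "YYYYDDD", each optionally followed by " HH:MM:SS[.fff]",
# and returns epoch seconds, treating the input as GMT.
# Returns undef when the string matches none of these forms.
sub parse_to_epoch {
    my ($str) = @_;

    # optional time part: hour, minute, second, fraction (with leading dot)
    my $time_re = qr/(?:\s+(\d\d):(\d\d):(\d\d)(\.\d+)?)?/;

    my ($year, $mon, $mday, $hour, $min, $sec, $frac);
    if ( $str =~ /^(\d{4})-(\d\d)-(\d\d)${time_re}\z/ ) {
        ($year, $mon, $mday, $hour, $min, $sec, $frac)
            = ($1, $2 - 1, $3, $4, $5, $6, $7);
    }
    elsif ( $str =~ /^(\d{4})-?(\d{3})${time_re}\z/ ) {
        # day-of-year form: month 0 plus an out-of-range day number,
        # which the _nocheck variant accepts and normalises
        ($year, $mon, $mday, $hour, $min, $sec, $frac)
            = ($1, 0, $2, $3, $4, $5, $6);
    }
    else {
        return;
    }

    # a missing time part means midnight
    defined $_ or $_ = 0 for ($hour, $min, $sec);

    my $epoch = timegm_nocheck($sec, $min, $hour, $mday, $mon, $year);
    return $epoch + ( defined $frac ? $frac : 0 );
}

printf "%.3f\n", parse_to_epoch('2005-10-21 12:09:22.099');   # 1129896562.099
```

Anchoring the patterns with `^` and `\z` keeps a bare epoch number such as 1141128000 from being mistaken for a year plus day-of-year; a caller could handle that case separately before calling the helper.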
    Those are my meta-comments about what you're trying to accomplish. As for your code specifically, there are a few things going on that seemed strange to me..

    You have this kind of construct several times:

    $sec = "0" x (2 - length($time)).$time ;
    $msec = "0" x (3 - length($msec)).$msec ;
    This is written a little more clearly as:
    $sec = sprintf "%02d", $time;
    $msec = sprintf "%03d", $msec;
    You have a zillion lines like this:
    ($format = $format ) =~ s/\%Y/$year/g ; # xxxx
    ($format = $format ) =~ s/\%m/$month/g ; # 1-12
    ($format = $format ) =~ s/\%d/$day/g ; # 01-31
    Why $format = $format ? Looks like you saw a snippet somewhere that said ($new = $old) =~ s///, but that's completely redundant here as you aren't saving the old value. Better would be:
    for ($format) {
        s/%Y/$year/g;
        s/%m/$month/g;
        ...
    }
    But even better yet would be to put all your date information in a hash, and you can replace all those s/// commands with just one:
    my %data = ( Y => $year, m => $month, d => $day, ... );
    $format =~ s/\%([A-Za-z])/ exists $data{$1} ? $data{$1} : "%$1" /ge;
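For anyone who wants to try the hash-based substitution on its own, here is a small self-contained sketch. The wrapper name `expand_format` and the sample values are made up for illustration; only the s///ge line comes from the snippet above.

```perl
use strict;
use warnings;

# Hypothetical wrapper: expand single-letter %X escapes in $format from
# a lookup table, leaving escapes that are not in the table untouched.
sub expand_format {
    my ($format, %data) = @_;
    $format =~ s/%([A-Za-z])/ exists $data{$1} ? $data{$1} : "%$1" /ge;
    return $format;
}

# sample values, already zero-padded as strings
my %data = ( Y => 2005, m => '10', d => '21', H => '12', M => '09', S => '22' );

# %Z is not in the table, so it is passed through as-is
print expand_format('%Y-%m-%d %H:%M:%S %Z', %data), "\n";
# prints: 2005-10-21 12:09:22 %Z
```

Because everything happens in a single pass, text that has just been substituted is never rescanned for further escapes, which a chain of separate s/// statements cannot promise.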
    Update: fixed sprintf format in first code snippet

    blokhead

Re: Wrote my own date parser
by qq (Hermit) on Feb 12, 2006 at 12:45 UTC

    We've all done bits of that, and it's not bad for practice. But you'll be better off in the long run using DateTime. Read up on leap seconds in the docs, ask yourself if you want to really deal with that (and all the other stuff - daylight savings, leap years, time zones) or if you'd rather let someone else do it.

    Perl Advent Calendar on DateTime. perl.com article. DateTime home.

Re: Wrote my own date parser
by deprecated (Priest) on Feb 12, 2006 at 17:00 UTC
    I think that qq is correct in that we've all done a lot of that (certainly look a few years ago and I've got some examples of that), and that it is helpful for learning.

    A couple things bug me about what you've suggested however. First, you say that you're a beginner programmer. Yet, you also say that you weren't particularly impressed with the available date modules. My guess is this stems from some unfamiliarity with their methodology. Date::Calc is pretty unintuitive, but it certainly gets the job done. Additionally, it's pretty fast. Going pure-perl, and going it alone, is almost certainly going to be slower and harder to work with. Date::Calc is also very mature, having gone through many spirals and thousands of use scenarios.

    Anyways, I wanted to suggest that you could take another look at Date::Calc. You'll probably find that it's easier to work with other people's code when you're more familiar with it. The other thing I wanted to suggest is if you work with databases, they generally have very helpful date functions, with casting into all sorts of different types. They're also generally a lot faster than perl (I do try to offload everything I can onto the database, rather than let the host machine spin on perl).

    * http://www.postgresql.org/docs/8.1/interactive/functions-datetime.html

    Toodles, dep

    --
    Tilly is my hero.

Re: Wrote my own date parser
by jeanluca (Deacon) on Feb 13, 2006 at 13:18 UTC
    Sorry, ignore my previous message, the time conversions work OK. I will investigate the suggested modules to see how I will change my module. When I'm done I'll post the result!

    Thanks a lot
    Luca
Re: Wrote my own date parser
by jeanluca (Deacon) on Feb 13, 2006 at 12:08 UTC
    Thanx a lot!
    But I noticed one problem. When I convert times, those suggested modules convert the time to GMT time. So the hour changes (offset +1). I can compute the offset like:
    use DateTime;
    my $now = DateTime->now( time_zone => 'local' );
    print $now->time_zone->offset_for_local_datetime( $now );

    but I assume there is a way to set the offset to zero? For what I need to do, the input time will always be GMT, no matter what the local time of the system is! Another thing I assume you always have to do is to

    Luca
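One way to keep the local offset out of the picture with core modules only, sketched under the assumption that both input and output are meant to be GMT, is to use `timegm` in place of `timelocal` and `gmtime` in place of `localtime`. The date below is just one of the sample inputs from the original post.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use Time::Local qw(timegm);

# 2005-10-21 12:09:22, interpreted as GMT whatever the system time zone is
# (arguments: sec, min, hour, mday, month 0-11, year)
my $epoch = timegm(22, 9, 12, 21, 10 - 1, 2005);

# gmtime (rather than localtime) breaks the epoch back down with no offset
my $text = strftime('%Y-%m-%d %H:%M:%S', gmtime $epoch);

print "$epoch -> $text\n";
# prints: 1129896562 -> 2005-10-21 12:09:22
```

With DateTime, asking for `time_zone => 'UTC'` instead of `'local'` in the constructor should play the same role.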

Node Type: perlmeditation [id://529590]
Approved by Corion
Front-paged by Arunbear