Hello my new favorite friends,

As part of a lexical processing project I'm working on, I'm parsing millions of dates and converting them to epoch time. However, Devel::NYTProf showed me that I was losing massive amounts of time in calls to Date::Parse's str2time. I guess that's the price you pay for something that seemed like the perfect, effortless way to parse the dates.

So, my question is, how can I most efficiently parse these dates, for those of you who have a sense of the benchmarks? Here was the WRONG way (removing it doubled my speed!):

# Dates of form 'Fri, 01 Mar 2013 01:21:14 +0000'
my $created_at = str2time($value);

Update: Solution

Thanks to the discussion between BrowserUK and rjt, a high-speed solution emerged that looks something like this:

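use Inline C => q@
int epoch_sec(char * date) {
    /* Expects exactly 'Fri, 01 Mar 2013 01:21:14 +0000' (31 characters);
       the "+0000" offset starts at character 26. */
    char *tz_str = date + 26;
    struct tm tm;
    int tz;

    if (  strlen(date) != 31                           ||
        strptime(date, "%a, %d %b %Y %T", &tm) == NULL ||
          sscanf(tz_str, "%d", &tz) != 1)
    {
        printf("Invalid date %s\n", date);
        return 0;
    }

    /* timegm() reads tm as UTC, so subtract the +HHMM/-HHMM offset,
       converted to seconds, to get the true epoch time. */
    return timegm(&tm) -
        (tz < 0 ? -1 : 1)*(abs(tz)/100*3600 + abs(tz)%100*60);
}
@;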

use feature 'say';  # say() needs this unless 'use v5.10' or later is in effect

my $date = "Fri, 01 Mar 2013 01:21:14 +0200";
my $newDate = epoch_sec($date);
say $newDate;
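For anyone who can't build Inline::C, here is a rough pure-Perl sketch of the same idea: match the fixed format with one regex, let the core Time::Local module do the UTC conversion, then subtract the offset. The name epoch_sec_pp and the %MONTH table are my own inventions for illustration, not something from the thread, and I haven't benchmarked it, so whether it beats str2time on your data is an assumption to check with the profiler.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Month abbreviation => zero-based month index, as timegm() expects.
my %MONTH = (
    Jan => 0, Feb => 1, Mar => 2, Apr => 3, May => 4,  Jun => 5,
    Jul => 6, Aug => 7, Sep => 8, Oct => 9, Nov => 10, Dec => 11,
);

# Hypothetical helper (not from the thread): parse dates of the fixed form
# 'Fri, 01 Mar 2013 01:21:14 +0000' and return epoch seconds.
# Returns undef if the string does not match or holds impossible values.
sub epoch_sec_pp {
    my ($date) = @_;

    my ($day, $mon, $year, $h, $m, $s, $sign, $tzh, $tzm) = $date =~
        /\A\w{3}, (\d\d) (\w{3}) (\d{4}) (\d\d):(\d\d):(\d\d) ([+-])(\d\d)(\d\d)\z/
        or return;
    return unless exists $MONTH{$mon};

    # timegm() croaks on out-of-range fields (e.g. day 32), so trap that.
    my $utc = eval { timegm($s, $m, $h, $day, $MONTH{$mon}, $year) };
    return unless defined $utc;

    # Same arithmetic as the C version: subtract the signed +HHMM offset.
    my $offset = ($tzh * 3600 + $tzm * 60) * ($sign eq '-' ? -1 : 1);
    return $utc - $offset;
}

print epoch_sec_pp('Fri, 01 Mar 2013 01:21:14 +0200'), "\n";
```

Unlike the C version, this one returns undef instead of 0 on bad input, so a failed parse can't be mistaken for the epoch itself.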

Thanks! You guys are incredible.

In reply to *SOLVED* High-speed Date Formatting by Endless
