
Re^2: How to get minimum start date in these start dates ?

by JediWizard (Deacon)
on Jul 10, 2006 at 14:17 UTC (#560146)

in reply to Re: How to get minimum start date in these start dates ?
in thread How to get minimum start date in these start dates ?

cog, while I agree that a Schwartzian transform is a good idea here, I would like to make two small comments.

1. Rather than doing three comparisons, first on year, then month, then day, I believe (though I haven't benchmarked it) that it would be faster to do a single comparison of a string in the form yyyymmdd, which can easily be created with a regex.

2. Depending on your data set, it may be considerably faster to use an Orcish Maneuver, especially if the same date may appear multiple times in the list (and I didn't see anything in the post to indicate that that wouldn't happen).

    my(%date_hash) = ();
    my(@start_date) = qw(01-06-2007 01-08-2006 01-06-2006 01-07-2007 06-01-2007);

    # Orcish Maneuver: cache each date's yyyymmdd key the first time it is seen.
    my @sorted_dates = sort({
        ($date_hash{$a} ||= &trans_date($a))
            <=>
        ($date_hash{$b} ||= &trans_date($b))
    } @start_date);

    print join("\n", @sorted_dates);

    # Turn dd-mm-yyyy into yyyymmdd so one numeric comparison suffices.
    sub trans_date {
        my $date = shift;
        $date =~ s/(\d{2})-(\d{2})-(\d{4})/$3$2$1/;
        return $date;
    }
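
Point 1 can also be tried on its own, without the cache. The following is only a minimal sketch, assuming every date is a well-formed dd-mm-yyyy string; the names to_key and @dates are made up for illustration and do not come from the thread. It is a Schwartzian transform that carries a single yyyymmdd key, so the sort block needs one numeric comparison instead of three.

```perl
use strict;
use warnings;

# Hypothetical sample data, in dd-mm-yyyy form.
my @dates = qw(01-06-2007 01-08-2006 01-06-2006 01-07-2007 06-01-2007);

# Build a yyyymmdd key from a dd-mm-yyyy string.
# Assumption: the input is well formed; anything else is treated as fatal.
sub to_key {
    my ($date) = @_;
    my ($dd, $mm, $yyyy) = $date =~ /^(\d{2})-(\d{2})-(\d{4})$/
        or die "Unexpected date format: $date";
    return "$yyyy$mm$dd";
}

# Decorate each date with its key, sort on the key, then undecorate.
my @sorted =
    map  { $_->[1] }
    sort { $a->[0] <=> $b->[0] }
    map  { [ to_key($_), $_ ] }
    @dates;

print "$sorted[0]\n";    # the earliest date
```

Each key is computed exactly once per list element here, whereas the cached version above computes it once per distinct date, so which one wins probably depends on how many duplicates the data contains.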

They say that time changes things, but you actually have to change them yourself.

—Andy Warhol

Re^3: How to get minimum start date in these start dates ?
by cog (Parson) on Jul 10, 2006 at 15:10 UTC
    While I don't disagree with you, I find the Schwartzian transform easier to understand and memorize than an Orcish Maneuver, from the viewpoint of a newbie.

    Also, the benchmarking would depend largely on the data set (suppose all the years are different, for instance).

    Still, I'm inclined to believe that speed won't be relevant, in this case :-) Just a hunch, you know? :-)

      I agree with you, cog, regarding the ST over the OM, but that's probably because I've never used the OM in anger, so I'm not familiar with it. I think that both JediWizard's solution and yours overcomplicate the transformation of the date into a sortable form. Just reversing the date to sort it and then reversing it again to extract it seems much simpler and quicker to me. I have done some benchmarking which seems to bear this out. I've also corrected a couple of typos (you had missed a closing quote in one of your hash keys, but I've unquoted them all, and JediWizard had doubled his quote words like qw(qw( ... ))). Here is the code:

      use strict;
      use warnings;

      use Benchmark qw(cmpthese);

      # Generate a thousand dates at random.
      #
      my @startDates;
      push @startDates,
          sprintf(q{%02d}, int((rand 28) + 1)) . q{-} .
          sprintf(q{%02d}, int((rand 12) + 1)) . q{-} .
          int((rand 25) + 2000)
          for (1 .. 1000);

      # cog's method.
      #
      my $rcCog = sub
      {
          my @sortedDates =
              map { $_->{date} }
              sort {
                  $a->{year}  <=> $b->{year}  or
                  $a->{month} <=> $b->{month} or
                  $a->{day}   <=> $b->{day}
              }
              map {
                  /(\d\d)-(\d\d)-(\d\d\d\d)/;
                  { date => $_, day => $1, month => $2, year => $3 }
              }
              @startDates;
          return $sortedDates[0];
      };

      # JediWizard's method.
      #
      my $rcJediWizard = sub
      {
          my %dateHash = ();
          my @sortedDates =
              sort {
                  ($dateHash{$a} ||= transDate($a))
                      <=>
                  ($dateHash{$b} ||= transDate($b))
              }
              @startDates;
          return $sortedDates[0];
      };

      # johngg's method.
      #
      my $rcJohnGG = sub
      {
          return
          (
              map {join q{-}, reverse split /-/}
              sort
              map {join q{-}, reverse split /-/}
              @startDates
          )[0];
      };

      # Run all three on data to prove they come up with
      # the same answer.
      #
      print q{$rcCog->()        - }, $rcCog->(), qq{\n};
      print q{$rcJediWizard->() - }, $rcJediWizard->(), qq{\n};
      print q{$rcJohnGG->()     - }, $rcJohnGG->(), qq{\n};

      # Run the benchmark
      #
      cmpthese (50,
          {
              Cog        => $rcCog,
              JediWizard => $rcJediWizard,
              JohnGG     => $rcJohnGG
          });

      # JediWizard's date translation routine.
      #
      sub transDate
      {
          my $date = shift;
          $date =~ s/(\d{2})-(\d{2})-(\d{4})/$3$2$1/;
          return $date;
      }

      And these are the results:

      $rcCog->()        - 13-01-2000
      $rcJediWizard->() - 13-01-2000
      $rcJohnGG->()     - 13-01-2000
                   Rate JediWizard        Cog     JohnGG
      JediWizard 6.00/s         --        -0%       -61%
      Cog        6.00/s         0%         --       -61%
      JohnGG     15.2/s       153%       153%         --

      Looks like your hunch about speed was correct in that you and JediWizard pan out about the same (seems to go either way over several runs but the one I captured here was a dead heat). However, my simpler solution appears to be consistently quicker.

      I hope this is of interest.
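
The reversal trick works because a zero-padded, fixed-width yyyy-mm-dd string compares the same way as the date it represents, so a plain string comparison is enough. Since the original question asks only for the minimum, a full sort may not even be needed. What follows is a sketch under that assumption, not something benchmarked in this thread: minstr is taken from the core List::Util module, and the sample data and variable names are purely illustrative.

```perl
use strict;
use warnings;
use List::Util qw(minstr);

# Hypothetical sample data, in dd-mm-yyyy form.
my @dates = qw(01-06-2007 01-08-2006 01-06-2006 01-07-2007 06-01-2007);

# Reverse the fields: dd-mm-yyyy becomes yyyy-mm-dd.
my @iso = map { join q{-}, reverse split /-/ } @dates;

# Fixed-width, zero-padded yyyy-mm-dd strings order chronologically as
# plain strings, so the smallest string is the earliest date.
my $earliest_iso = minstr @iso;

# Reverse the fields again to get back the original dd-mm-yyyy form.
my $earliest = join q{-}, reverse split /-/, $earliest_iso;

print "$earliest\n";
```

This makes a single pass over the list instead of sorting it, which should matter only for large inputs.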



        "I've never used the OM in anger"

        Neither have I. I'm actually a bit curious about what one might look like if written "in anger".


        They say that time changes things, but you actually have to change them yourself.

        —Andy Warhol
