Update: Seems I'm too slow today...
Here is my quick hack:
#!/usr/bin/perl -w
use strict;
my $entries;
while ( my $line = <DATA> ){
    # skip lines without a group (grN) and a date (YYYY-MM-DD)
    next unless $line =~ /\d?\W*(gr\d)\W*(\d*-\d\d-\d\d)/;
    my $group = $1;
    my $date = $2;
    $date =~ s/-//g;    # 2003-03-02 -> 20030302, so dates compare as numbers
    if ( ! defined( $entries->{$group}) ||
        ( $entries->{$group}->{date} < $date ) ){
        $entries->{$group}->{date} = $date;
        $entries->{$group}->{entry} = $line;
    }
}
foreach (keys( %{$entries} )){
    print "entry: $entries->{$_}->{entry}";
}
__DATA__
item group entry_date
34 gr1 2003-03-02
12 gr1 1990-03-14
39 gr3 2002-04-11
66 gr4 2006-03-16
32 gr3 1998-02-13
90 gr1 2004-06-15
55 gr4 1999-06-15
etc ...
2nd Update: On the other hand, so far my code is the only one that will not get confused by misformatted lines. :-)
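To illustrate the point about misformatted lines, here is a minimal sketch of the same idea. The helper name latest_per_group and the sample lines are made up for this example, and the regex is a slightly stricter variant of mine, so treat it as an assumption rather than the code that was benchmarked. Lines that do not contain a grN group followed by a YYYY-MM-DD date are simply skipped:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): returns a hash ref
# mapping each group to its latest entry, ignoring lines that do not match.
sub latest_per_group {
    my @lines = @_;
    my %latest;
    for my $line (@lines) {
        # Skip anything without a "grN" group followed by a YYYY-MM-DD date.
        next unless $line =~ /\b(gr\d)\s+(\d{4}-\d\d-\d\d)\b/;
        my ( $group, $date ) = ( $1, $2 );
        # Zero-padded YYYY-MM-DD dates sort correctly as plain strings.
        if ( !exists( $latest{$group} ) or $latest{$group}{date} lt $date ) {
            $latest{$group} = { date => $date, entry => $line };
        }
    }
    return \%latest;
}

my $result = latest_per_group(
    "item group entry_date\n",    # header line, skipped
    "34 gr1 2003-03-02\n",
    "garbage line\n",             # skipped
    "12 gr1 1990-03-14\n",
    "39 gr3 2002-04-11\n",
    "77 gr3 not-a-date\n",        # skipped
);
print "entry: $result->{$_}{entry}" for sort keys %$result;
```

With the sample lines above, only "34 gr1 2003-03-02" and "39 gr3 2002-04-11" should survive; the header and the two broken lines never reach the comparison.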
3rd Update:
Seems I'm bored..
I just did some benchmarking..
I created some test data with the code below:
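#!/usr/bin/perl -w
use strict;
open my $fh, '>', 'testdata' or die "Cannot open testdata: $!";
for ( 0..1000000 ){
    # item number, group gr0..gr9, year 1990..2014, month 00..09, day 10..29
    print $fh "$_ gr".int(rand(10))." ". (1990+int(rand(25))) . '-0'. (int(rand(10))) . '-' . (10 + int(rand(20)) )."\n";
}
close $fh or die "Cannot close testdata: $!";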
After that I took some measurements:
my code:
time ./latestentries.pl
entry: 15970 gr5 2014-09-29
entry: 79485 gr8 2014-09-29
entry: 135788 gr7 2014-09-29
entry: 221 gr2 2014-09-29
entry: 18669 gr9 2014-09-29
entry: 46760 gr1 2014-09-29
entry: 4960 gr3 2014-09-29
entry: 9486 gr0 2014-09-29
entry: 19710 gr4 2014-09-29
entry: 56757 gr6 2014-09-29
real 0m8.689s
user 0m8.617s
sys 0m0.060s
-------------------
anno's code:
micha@laptop ~/prog/perl/test $ time perl test-anno.pl
962757, gr0, 2014-09-29
964472, gr1, 2014-09-29
984704, gr2, 2014-09-29
980128, gr3, 2014-09-29
985851, gr4, 2014-09-29
931318, gr5, 2014-09-29
976880, gr6, 2014-09-29
988367, gr7, 2014-09-29
992654, gr8, 2014-09-29
962175, gr9, 2014-09-29
real 0m4.556s
user 0m4.424s
sys 0m0.036s
-------------------
and duff's entry:
micha@laptop ~/prog/perl/test $ time perl test-duff.pl
100154 gr5 1990-00-10
5654 gr8 1990-00-10
2318 gr7 1990-00-10
9789 gr2 1990-00-10
19151 gr9 1990-00-10
91314 gr1 1990-00-10
124846 gr3 1990-00-10
14858 gr0 1990-00-10
175946 gr4 1990-00-10
95691 gr6 1990-00-10
real 0m3.497s
user 0m3.452s
sys 0m0.036s
The winner is duff.. :-)
He's the only one who looks for the oldest entry, AND wrote the fastest code...
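For what it's worth, the difference between "latest" and "oldest" can come down to the direction of a single comparison. The sketch below is not duff's code (I am only guessing at the cause from the 1990 dates in the output); pick_per_group and its $want_oldest flag are hypothetical names made up for this example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: keeps one entry per group. A true $want_oldest
# flips the comparison so the oldest entry wins instead of the latest.
sub pick_per_group {
    my ( $want_oldest, @lines ) = @_;
    my %pick;
    for my $line (@lines) {
        next unless $line =~ /\b(gr\d)\s+(\d{4}-\d\d-\d\d)\b/;
        my ( $group, $date ) = ( $1, $2 );
        my $cur = $pick{$group};
        # Zero-padded dates compare correctly as strings.
        my $better = !defined($cur)
            || ( $want_oldest ? $date lt $cur->{date} : $date gt $cur->{date} );
        $pick{$group} = { date => $date, entry => $line } if $better;
    }
    return \%pick;
}

my @data = ( "34 gr1 2003-03-02\n", "12 gr1 1990-03-14\n", "90 gr1 2004-06-15\n" );
print "latest: ", pick_per_group( 0, @data )->{gr1}{entry};
print "oldest: ", pick_per_group( 1, @data )->{gr1}{entry};
```

For the three gr1 lines from my sample data this should report item 90 as the latest and item 12 as the oldest.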