PerlMonks  

Sort an array which contains date formatted elements

by msk_0984 (Friar)
on Jul 17, 2007 at 12:07 UTC ( [id://627012]=perlquestion )

msk_0984 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Respected Monks,

This one is mine and I am sorry for my last question. Anyway, I am grepping filenames out of a particular directory and displaying the contents of all the files, which are nothing but log files. But my files are created day-wise ... Because of this the filenames are not sorted and the logs are displayed in an incorrect order.

Example: (filenames)

webadmin_jul_10_2007.log
webadmin_jul_11_2007.log
webadmin_jul_12_2007.log
webadmin_jul_13_2007.log
webadmin_jul_14_2007.log
webadmin_jul_7_2007.log
webadmin_jul_8_2007.log
webadmin_jul_9_2007.log

#### upper code ......... still present #####
my $dir = './logs/commonlogs';
opendir my $dh, $dir or die "Can't opendir '$dir': $!\n";
my @files = grep { ! -d "$dir/$_" and ! /user.log/ } readdir $dh;
foreach my $file ( @files ) {
    ## .....Some code written for filtering ..
    open(FH, $file ) or die "Error : $! \n" ;
    while($audit_data=<FH>) {
        @check=split(']\[|]\s+|^\[',$audit_data);
        print " $check[2] $check[3] $check[1] $check[4] \n"; ###
    }
}

Output :

Fri Jul 13 00:25:16 2007 india01 Syslog_Probe Properties file backup taken successfully

Fri Jul 13 00:25:16 2007 india01 ObjectServer Successfully Changed to MSK_P

Sat Jul 7 14:33:02 2007 india01 Stopping the PAD

Sat Jul 7 14:33:10 2007 india01 Starting the PAD

Sat Jul 9 15:17:51 2007 station18 Install login failed

Sat Jul 10 10:17:51 2007 station18 Install Interface entry

Sat Jul 11 00:33:02 2007 india01 Stopping the PAD

The output comes out in an incorrect order because the filenames returned by grep are in an incorrect order. So what I did was sort that array:

@files_sorted = sort { $a <=> $b } @files;
But the output is still not in sorted order; it is taking the elements as strings even though I asked for a numerical sort. So how can I keep that array sorted according to the dates, as below?
Required output -->

webadmin_jul_7_2007.log
webadmin_jul_8_2007.log
webadmin_jul_9_2007.log
webadmin_jul_10_2007.log
webadmin_jul_11_2007.log
webadmin_jul_12_2007.log
webadmin_jul_13_2007.log
webadmin_jul_14_2007.log

Thanks in advance ..
Sushil Kumar

Replies are listed 'Best First'.
Re: Sort an array which contains date formatted elements
by moritz (Cardinal) on Jul 17, 2007 at 12:35 UTC
    You should transform the filenames to something that can easily be compared:

    my %month = ( jan => 1, feb => 2, ... );
    m/webadmin_([^_]+)_(\d+)_(\d{4})\.log/;
    my ($month, $day, $year) = ($1, $2, $3);
    my $new_filename = sprintf "%04d-%02d-%02d", $year, $month{$month}, $day;

    You can combine that with a Schwartzian Transform.
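
    A minimal sketch of how the sortable key above might be combined with a Schwartzian Transform. The helper name date_key, the filled-in month table, and the webadmin_aug_1_2007.log sample are assumptions added for illustration; they are not from the reply. The `//` operator needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Month lookup; the complete table is an assumption (the reply elides it).
my %month;
@month{qw(jan feb mar apr may jun jul aug sep oct nov dec)} = 1 .. 12;

# Hypothetical helper: build a sortable "YYYY-MM-DD" key from a name such as
# webadmin_jul_7_2007.log. Names that do not match get an empty key and
# therefore sort first.
sub date_key {
    my ($name) = @_;
    my ($mon, $day, $year) = $name =~ m/\Awebadmin_([^_]+)_(\d+)_(\d{4})\.log\z/
        or return '';
    return sprintf "%04d-%02d-%02d", $year, $month{$mon} // 0, $day;
}

# Sample names; the aug entry is made up to show a month boundary.
my @files = qw(
    webadmin_jul_10_2007.log
    webadmin_jul_7_2007.log
    webadmin_aug_1_2007.log
    webadmin_jul_9_2007.log
);

# Schwartzian Transform: decorate each name with its key, sort on the key
# as a string, then strip the key again.
my @sorted =
    map  { $_->[1] }
    sort { $a->[0] cmp $b->[0] }
    map  { [ date_key($_), $_ ] } @files;

print "$_\n" for @sorted;
```

    Because the key is zero-padded and ordered year, month, day, a plain string comparison (cmp) is enough, and each filename is parsed only once rather than on every comparison.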

Re: Sort an array which contains date formatted elements
by cdarke (Prior) on Jul 17, 2007 at 13:26 UTC
    Alternatively use a custom sort. It is a little more complicated than normal because of the date format. For example:
    use strict;

    my @files = qw (webadmin_jul_10_2007.log webadmin_jul_11_2007.log
                    webadmin_jul_12_2007.log webadmin_jul_13_2007.log
                    webadmin_jul_14_2007.log webadmin_jul_7_2007.log
                    webadmin_jul_8_2007.log webadmin_jul_9_2007.log);

    sub bydate {
        # Extract non-numeric and date
        # Assumes text after date is the same
        my ($afront, $aday, $ayear) = $a =~ /^([[:alpha:]_]+)(\d+)_(\d+)/;
        my ($bfront, $bday, $byear) = $b =~ /^([[:alpha:]_]+)(\d+)_(\d+)/;

        my $retn = ($afront cmp $bfront);
        $retn = ($ayear <=> $byear) if $retn == 0;
        $retn = ($aday <=> $bday) if $retn == 0;
        return $retn
    }

    my @sorted = sort bydate @files;
    local $" = "\n";
    print "@sorted\n";

Re: Sort an array which contains date formatted elements
by snopal (Pilgrim) on Jul 17, 2007 at 14:45 UTC

    If you can be assured that your files are not altered after the log moves on (big assumption that must be verified), you might just use a 'stat' mtime as your sort key.

    Also, if you want to do a system call using backtics, you can let the file system sort your files for you (assuming UNIX).

    my $files = `/bin/ls -1tr $dir`;
    my @files = grep { ! -d "$dir/$_" and ! /user.log/ } split /\n/, $files;

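    The 'stat' mtime idea can also be sketched in pure Perl, without shelling out to ls. This is only an illustration under the same assumption stated above (files are not modified after the log rolls over). The function name files_by_mtime, the temporary directory, and the timestamps in the demo are all made up for the example; the filter mirrors the one in the question, with the dot escaped.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: return the plain files in $dir, oldest modification
# time first (the same order as `ls -tr`).
sub files_by_mtime {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Can't opendir '$dir': $!\n";
    my @names = grep { !-d "$dir/$_" and !/user\.log/ } readdir $dh;
    closedir $dh;

    # Element 9 of the list returned by stat is the mtime in epoch seconds.
    return
        map  { $_->[1] }
        sort { $a->[0] <=> $b->[0] }
        map  { my $mtime = (stat "$dir/$_")[9]; [ $mtime // 0, $_ ] } @names;
}

# Demo in a throwaway directory with made-up, increasing mtimes.
my $tmp = tempdir( CLEANUP => 1 );
my %mtime_for = (
    'webadmin_jul_10_2007.log' => 1_184_025_600,
    'webadmin_jul_7_2007.log'  => 1_183_766_400,
    'webadmin_jul_9_2007.log'  => 1_183_939_200,
);
for my $name ( keys %mtime_for ) {
    open my $fh, '>', "$tmp/$name" or die "Can't create '$tmp/$name': $!\n";
    close $fh;
    utime $mtime_for{$name}, $mtime_for{$name}, "$tmp/$name"
        or die "utime failed for '$name': $!\n";
}

my @sorted = files_by_mtime($tmp);
print "$_\n" for @sorted;
```

    In the original script, the call would be files_by_mtime('./logs/commonlogs') in place of the readdir/grep line.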
      Respected monks

      Thank you all for your prompt replies; my problem has been sorted out. I really learned a lot from each and every idea that was posted. Here on the PerlMonks site I have evolved and learnt a lot.

      Snopal

      Sorting the files by modification time is also a very good idea, as the latest file is the one most recently modified according to the dates.

      This has also done the trick for me, but can you please explain what the "1" does there? We use -t and -r to sort by modification time and to reverse the sort, but I was not able to find out what the "1" does ...

      Anyway, thank you very much for your solution too.

      Sushil Kumar
Re: Sort an array which contains date formatted elements
by salva (Canon) on Jul 17, 2007 at 16:35 UTC
    You can generate multikey sorters easily with Sort::Key:
    use Sort::Key::Multi 'iiikeysort'; # the 'iii' stands for three integer keys

    my (%month, $month);
    $month{$_} = ++$month for qw(jan feb mar apr ...);

    @sorted = iiikeysort { /^webadmin_(\w+)_(\d+)_(\d+)\.log$/;
                           ($3, $month{$1}, $2) } @filenames;

Re: Sort an array which contains date formatted elements
by dsheroh (Monsignor) on Jul 17, 2007 at 15:39 UTC
    snopal beat me to the suggestion of using filesystem data to sort them correctly.

    The reason that "it is taking the elements as strings even though i gave it to sort in numerical way" is that the filenames start with text, not numbers. If they were formatted as 7_jul_2007_webadmin.log, then your sort idea would have worked (until 1_aug_2007_webadmin.log, at least - any change of month would break it) but, with the current filename formatting, all of the filenames have a numerical value of 0, so sorting them numerically doesn't do much.
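
    A small sketch of that behaviour. The day-first names are the hypothetical format mentioned above, not real files; the 'numeric' warnings are switched off only so the demo runs quietly, since under "use warnings" Perl would otherwise report that these arguments are not numeric.

```perl
use strict;
use warnings;
no warnings 'numeric';    # silence "isn't numeric" warnings for this demo

my @names = qw(webadmin_jul_10_2007.log webadmin_jul_7_2007.log);

# A string with no leading digits has the numeric value 0.
printf "%s => %d\n", $_, $_ + 0 for @names;

# So <=> compares 0 with 0 and reports the two names as equal.
my $cmp = $names[0] <=> $names[1];
print "comparison result: $cmp\n";

# With a leading number, only that number is used; the rest is ignored.
my @day_first = sort { $a <=> $b }
    qw(10_jul_2007_webadmin.log 7_jul_2007_webadmin.log);
print "@day_first\n";
```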

Re: Sort an array which contains date formatted elements
by injunjoel (Priest) on Jul 17, 2007 at 18:29 UTC
    Late to the game I see, but for the sake of TIMTOWTDI here is my suggestion.
    Use a Schwartzian Transform.
    use Time::Local;

    my %months;
    @months{('jan','feb','mar','apr','may','jun','jul','aug','sep','oct','nov','dec')} = 0..11;

    my @lines =
        map{ $_->[1] }
        sort{ $a->[0] <=> $b->[0] }
        map{
            chomp;
            my $val = $_;
            $val =~ s#webadmin_([^\.]+)\.log#my @t=split(/_/,$1);timelocal(0,0,0,$t[1],$months{$t[0]},($t[2]-1900))#e;
            [$val,$_];
        }<DATA>;

    print "$_\n" for(@lines);
    __DATA__
    webadmin_jul_10_2007.log
    webadmin_jul_11_2007.log
    webadmin_jul_12_2007.log
    webadmin_jul_13_2007.log
    webadmin_jul_14_2007.log
    webadmin_jul_7_2007.log
    webadmin_jul_8_2007.log
    webadmin_jul_9_2007.log

    The Output
    webadmin_jul_7_2007.log
    webadmin_jul_8_2007.log
    webadmin_jul_9_2007.log
    webadmin_jul_10_2007.log
    webadmin_jul_11_2007.log
    webadmin_jul_12_2007.log
    webadmin_jul_13_2007.log
    webadmin_jul_14_2007.log


    -InjunJoel
    "I do not feel obliged to believe that the same God who endowed us with sense, reason and intellect has intended us to forego their use." -Galileo
      I agree with your suggestion to use a ST but my preference would be to use a regex to pull out the month, day and year elements and do a three-way sort, grepping only those filenames that match.

      use strict;
      use warnings;

      my %months;
      @months{qw{ jan feb mar apr may jun jul aug sep oct nov dec}} = 0 .. 11;

      my $rxWanted;
      {
          local $" = q{|};
          $rxWanted = qr {(?x)
              \A
              webadmin
              _(@{ [ keys %months ] })
              _(\d\d?)
              _(\d{4})
              \.log
              \z
          };
      }

      print
          map  { qq{$_->[0]\n} }
          sort {
              $a->[3] <=> $b->[3]
                  ||
              $months{$a->[1]} <=> $months{$b->[1]}
                  ||
              $a->[2] <=> $b->[2]
          }
          grep { defined $_->[1] }
          map  { chomp; [ $_, m{$rxWanted} ] }
          <DATA>;

      __END__
      .
      ..
      webadmin_jul_10_2007.log
      webadmin_jul_11_2007.log
      webadmin_jul_12_2007.log
      webadmin_jul_13_2007.log
      webadmin_jul_14_2007.log
      webadmin_jul_7_2007.log
      webadmin_jul_8_2007.log
      webadmin_jul_9_2007.log
      user.log

      The output is as yours.

      Cheers,

      JohnGG
