PerlMonks  

Re: How do I find and delete files based on age?

by scorpio17 (Canon)
on Feb 26, 2007 at 15:11 UTC


in reply to How do I find and delete files based on age?

As you know, "there's more than one way to do it...", but this may help get you started:

#!/usr/bin/perl
use strict;
use File::Find;

if ($ARGV[0] eq "") { $ARGV[0]="."; }

my @file_list;
find ( sub {
    my $file = $File::Find::name;
    if ( -f $file && $file =~ /^DATE_/) {
        push (@file_list, $file)
    }
}, @ARGV);

my $now = time();        # get current time
my $AGE = 60*60*24*14;   # convert 14 days into seconds
for my $file (@file_list) {
    my @stats = stat($file);
    if ($now-$stats[9] > $AGE) {   # file older than 14 days
        print "$file\n";
    }
}

Assuming you name this script cleanup.pl, you would use it like this:
cleanup.pl /var/backups/repository
If you don't specify a directory, it will use whatever the current directory is. Note that the stat function returns a list of info, which I'm saving into the @stats array. Element 9 contains the last modification time, which may differ from the actual creation time (read up on stat so you know which one you want to use).
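
To make the element numbering concrete, here is a minimal sketch of the timestamp fields that stat returns. The helper name file_times is made up for this illustration. Note that on Unix-like systems element 10 (ctime) is the inode change time, not a creation time.

```perl
use strict;
use warnings;

# Hypothetical helper: return the three timestamps stat() reports for a file,
# or an empty list if the file cannot be statted.
sub file_times {
    my ($path) = @_;
    my @stats = stat($path) or return;
    return (
        atime => $stats[8],     # last access
        mtime => $stats[9],     # last modification of the contents
        ctime => $stats[10],    # inode change time on Unix-like systems
    );
}

# Show the timestamps of each file named on the command line.
for my $path (@ARGV) {
    my %t = file_times($path) or next;
    printf "%s %-5s %s\n", $path, $_, scalar localtime($t{$_}) for sort keys %t;
}
```

Run it as, say, `perl times.pl somefile` (the script name is arbitrary) to see which field matches the notion of "age" you have in mind.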

Also, this example just prints out the files starting with DATE_ that are more than 14 days old. Change the print statement to:

unlink $file;
to actually delete them. This may leave you with empty directories, but you can write another script to delete empty directories after running this one.
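
Since unlink returns the number of files it removed, it is worth checking the result, for example `unlink $file or warn "Could not delete $file: $!\n";`. As for the follow-up cleanup, the sketch below is one possible way to remove empty directories, not a tested production tool. It uses finddepth from File::Find so that children are visited before their parents; the sub name remove_empty_dirs is invented for this example, and it relies on rmdir simply failing on directories that still contain something.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: remove empty directories below (not including) $top.
# Returns the list of directories that were actually removed.
sub remove_empty_dirs {
    my ($top) = @_;
    $top =~ s{(?<=.)/+\z}{};    # drop trailing slashes so the comparison below works
    my @removed;
    finddepth(
        {
            no_chdir => 1,      # keep the cwd fixed so full paths stay valid
            wanted   => sub {
                my $path = $File::Find::name;
                return if $path eq $top;              # never remove the starting point
                return unless -d $path && !-l $path;  # real directories only
                # rmdir only succeeds on empty directories
                push @removed, $path if rmdir $path;
            },
        },
        $top
    );
    return @removed;
}

# Usage: report what was removed under each directory given on the command line.
print "removed $_\n" for map { remove_empty_dirs($_) } @ARGV;
```

Because the traversal is depth-first, a chain of nested empty directories is removed from the innermost one outward in a single pass.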

Re^2: How do I find and delete files based on age?
by blazar (Canon) on Feb 28, 2007 at 11:40 UTC
    use strict;

    Why not

    use warnings; # as well?

    if ($ARGV[0] eq "") { $ARGV[0]="."; }

    Later on you say: "If you don't specify a directory, it will use whatever the current directory is." Had you turned warnings on, running the script without arguments would trigger an 'uninitialized' warning. Which is sensible: $ARGV[0] would actually be undefined rather than equal to the empty string. I would use the simpler

    @ARGV = '.' unless @ARGV;

    so that all the directories supplied on the command line would be searched, and a reasonable default would be provided if none is specified. Granted: this is not meant as a harsh critique of your code. I know it is just an example. I only want to expand a little bit on the subject.

    my @file_list;
    find ( sub {
        my $file = $File::Find::name;
        if ( -f $file && $file =~ /^DATE_/) {
            push (@file_list, $file)
        }
    }, @ARGV);

    Two things:

    1. I like to use File::Find's no_chdir mode, so that I don't need $File::Find::name. As it stands your code is actually wrong: find() changes directory as it descends, so $file, which is a path relative to the base dir being searched, will be interpreted relative to the cwd, and -f will most likely fail, except by coincidence;
    2. I used to write such code too: first collect the filenames, then process them later. If huge volumes of files are to be skimmed through, though, this may make the script seemingly "hang" before it says anything interesting. Thus nowadays I avoid doing so, if possible. In this particular case I see no reason why the check on the date couldn't be made in the sub that is supplied to find() in the first place. (OK, the resulting code wouldn't do exactly the same as yours, the difference being a few seconds or at most minutes whereas the threshold is measured in days, so I wouldn't regard it as significant.)
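
Both points can be combined in a short sketch. This is only one possible arrangement, not blazar's actual code: it assumes the DATE_ prefix is meant to match the file's base name, and the names old_files and $MAX_AGE are invented for the example.

```perl
use strict;
use warnings;
use File::Find;
use File::Basename qw(basename);

my $MAX_AGE = 60 * 60 * 24 * 14;    # 14 days in seconds

# Hypothetical helper: call $action->($path) for every plain file under @dirs
# whose base name starts with DATE_ and whose mtime lies more than $max_age
# seconds in the past. Files are handled as soon as find() reaches them.
sub old_files {
    my ($max_age, $action, @dirs) = @_;
    my $now = time;
    find(
        {
            no_chdir => 1,    # $_ holds the full path; the cwd never changes
            wanted   => sub {
                my $path = $_;
                return unless -f $path;
                return unless basename($path) =~ /^DATE_/;
                my $mtime = (stat $path)[9];
                return unless defined $mtime;
                $action->($path) if $now - $mtime > $max_age;
            },
        },
        @dirs
    );
    return;
}

@ARGV = '.' unless @ARGV;    # default to the current directory
old_files($MAX_AGE, sub { print "$_[0]\n" }, @ARGV);
```

Passing the action as a code reference keeps the dry run (print) and the real thing (unlink) one line apart, so the search logic does not need to change between them.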

    my @stats = stat($file);
    if ($now-$stats[9] > $AGE) { # file older than 14 days

    I know you probably know this, and included the intermediate step for clarity and instructive purposes, but it is perhaps worth a reminder that one can take a list slice as well, so that the temporary @stats variable is not needed:

    if ($now-(stat $file)[9] > $AGE) { # file older than 14 days

    BTW: I am the first one to say one shouldn't care about premature optimization, but stats are known to be expensive, and $file is already statted when it's being searched, so one more reason to do the check at find() time.
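
One way to avoid the second stat, sketched here on the assumption that a -f test has just been run on the same file: Perl keeps the result of the last stat or file test in the special filehandle _, so (stat _)[9] reads the cached modification time instead of querying the filesystem again. The sub name is_old_plain_file is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $path is a plain file whose mtime lies
# more than $max_age seconds before $now, and 0 otherwise.
sub is_old_plain_file {
    my ($path, $max_age, $now) = @_;
    return 0 unless -f $path;     # this file test performs the stat
    my $mtime = (stat _)[9];      # reuse the cached stat buffer via _
    return $now - $mtime > $max_age ? 1 : 0;
}

# Print every command-line argument that is a plain file older than 14 days.
my $now = time;
for my $file (@ARGV) {
    print "$file\n" if is_old_plain_file($file, 60 * 60 * 24 * 14, $now);
}
```

A related shortcut is the -M file test, which gives the age of the file in days relative to the script's start time, so `-f $file && -M _ > 14` expresses roughly the same check without any arithmetic on seconds.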
