

in reply to Deleting Files

How about something like:
#!perl -w
use strict;
use File::Find;

my $days = 12 * 7; # twelve weeks, seven days a week.

sub DeleteOldFiles {
    return 0 if -d;             # skip directories themselves
    return 0 unless -M > $days;
    unlink $_ or die $!;
    return 1;
}

find( \&DeleteOldFiles, '.' );
Although I think that sweeps subdirectories, too.
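(If you do want File::Find but not the recursion, one way is the prune variable - a minimal sketch, reusing the same cutoff; the callback name is just an example:)

#!perl -w
use strict;
use File::Find;

my $days = 12 * 7;

sub wanted {
    if ( -d $_ and $_ ne '.' ) {
        $File::Find::prune = 1;   # don't descend into this directory
        return;
    }
    return unless -f $_ and -M _ > $days;
    unlink $_ or die "unlink $File::Find::name: $!";
}

find( \&wanted, '.' );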

Update
Hmmm, File::Find is really for delving into the subdirs, and if you don't want that, then just glob(*) like this:

#!perl -w
use strict;

my $days = 12 * 7; # twelve weeks, seven days a week.

while ( <*> ) {
    next if -d;
    unlink $_ or die $! if -M > $days;
}
By the way, -M returns the age in days since the file was last modified, and -A the days since it was last accessed. I wasn't sure what your purpose was, but the last-accessed test might be the more useful one.
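For instance (just an illustrative snippet), printing both ages for every plain file in the current directory:

#!perl -w
use strict;

# -M and -A both return a floating-point age in days.
for my $file ( glob '*' ) {
    next unless -f $file;
    printf "%-30s modified %6.2f days ago, accessed %6.2f days ago\n",
        $file, -M $file, -A $file;
}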

As for speed issues, you have to loop over every file, no matter what; the only question is the efficiency of the loop. Well, do as little as possible in the loop: calculate the cutoff age beforehand, and short-circuit where you can.
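A rough sketch of that idea (hypothetical, reusing the twelve-week cutoff): compute the cutoff timestamp once, then do a single stat and one numeric comparison per file:

#!perl -w
use strict;

my $days   = 12 * 7;
my $cutoff = time() - $days * 24 * 60 * 60;   # oldest mtime we keep

while ( defined( my $file = <*> ) ) {
    next if -d $file;
    my $mtime = (stat $file)[9];              # field 9 of stat is mtime
    next unless defined $mtime;
    unlink $file or die "unlink $file: $!" if $mtime < $cutoff;
}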

Replies are listed 'Best First'.
Re: Re: Deleting Files
by McD (Chaplain) on Feb 24, 2001 at 02:57 UTC
    Adam writes:

    Hmmm, File::Find is really for delving into the subdirs,

    I know, and it's sometimes difficult to grok how to use it correctly - the whole "prune/not prune" thing is still unclear to me. But besides that...

    and if you don't want that, then just glob(*) like this:

    For a while now I've been pondering which was faster, a glob or a readdir, so I decided to test it on a big directory I've got lying around (36K files).

    The answer?

    glob failed (child exited with status 1) at trial.pl line 12.

    Nuts. Do I recall somewhere that Perl relies on the shell for globbing in some way? If so, going with a readdir on a big directory may be your only choice.
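    In that case, a minimal sketch of the readdir route (reusing the twelve-week cutoff from above; the directory name is just a placeholder):

    #!perl -w
    use strict;

    # readdir avoids glob entirely, which matters on very large
    # directories (and on old perls where glob shelled out to the shell).
    my $days = 12 * 7;
    my $dir  = '.';

    opendir my $dh, $dir or die "opendir $dir: $!";
    while ( defined( my $name = readdir $dh ) ) {
        next if $name eq '.' or $name eq '..';
        my $path = "$dir/$name";
        next if -d $path;
        unlink $path or die "unlink $path: $!" if -M $path > $days;
    }
    closedir $dh;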

    As for speed issues, you have to loop over every file, no matter what,

    And, although I can't speak for NTFS/HPFS/FAT filesystems, I know that a flat directory on Solaris or Linux is going to be a serious dog to scan once it holds more than about 1000 files, no matter what.

    Implement a hierarchical subdirectory scheme - maybe based on the date, which would simplify purging, too.

    That's what I did, faced with a similar problem, anyway. :-)
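    For illustration, a rough sketch of such a date-based layout (purely hypothetical: it assumes faxes land in YYYY/MM/DD subdirectories under some root, so purging means removing whole day-directories):

    #!perl -w
    use strict;
    use File::Path;               # provides rmtree

    my $days = 12 * 7;
    my $root = '/fax/images';     # hypothetical root

    # Each day-directory sits three levels down: year/month/day.
    for my $daydir ( glob "$root/*/*/*" ) {
        next unless -d $daydir;
        rmtree($daydir) if -M $daydir > $days;
    }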

    Peace,
    -McD

      With Perl 5.6 globbing no longer uses the shell.

      As for the filesystem, ReiserFS on Linux is supposed to handle that kind of directory smoothly. However, ext will slow to a crawl, and NTFS appears to as well.

      The problem, of course, is that every mention of a file requires scanning the list of things in the directory, which means that you scan a list of many thousands of files many thousands of times. Unless the filesystem is designed for that, you have a problem.

      Recommended solutions? Hierarchical structures (which is what most filesystems are designed to do), a dbm, a
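      As a rough illustration of the dbm route (the file and index names here are purely hypothetical): keep a tied hash of filename => mtime as the faxes arrive, and let the purge walk the index instead of rescanning the giant directory.

      #!perl -w
      use strict;
      use Fcntl;
      use SDBM_File;

      # Hypothetical on-disk index of filename => mtime; whatever
      # receives the faxes would add an entry as each file lands.
      tie my %mtime, 'SDBM_File', 'fax_index', O_RDWR | O_CREAT, 0644
          or die "tie: $!";

      my $cutoff = time() - 12 * 7 * 24 * 60 * 60;

      my @old = grep { $mtime{$_} < $cutoff } keys %mtime;
      for my $file (@old) {
          unlink $file;          # ignore failure if it is already gone
          delete $mtime{$file};
      }

      untie %mtime;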

Re: Re: Deleting Files
by BatGnat (Scribe) on Feb 23, 2001 at 08:34 UTC
    What it actually is: the 450,000+ files are fax images. The directory used to be cleaned once a week when the system ran on OS/2, but when they converted it to NT, they failed to implement a purge.
    We now need to rectify that.

    BatGnat

    BALLOT: A multiple choice exam, in which all of the answers above are incorrect!
      Get an NT find and:
      find /image_dir -c +7d -exec rm {} \;

      a

        You can use the Unix find command like this:
        find . -mtime +84 -exec rm -f {} \;
        ashok