http://qs321.pair.com?node_id=1009291

nvivek has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

    I need to extract the names of files that match a pattern I give on the command line. To read the command-line arguments, I used getopt in a shell script. I planned to use the find command, building the -iname argument so that it returns the files matching the arguments passed.
   My doubt is whether find is better than ls, or ls is better than find, or whether there is some other command that locates files in the current directory quickly.
   Kindly suggest which one is better.
   Note: The directory contains more than 1000 files, with a total size of approximately 2.5GB.

Re: Finding files in one directory
by LanX (Saint) on Dec 18, 2012 at 07:13 UTC
    see readdir incl. sample code ... or glob.

    Cheers Rolf

      Thanks for your prompt reply, Rolf. Will readdir be better than the find command (performance-wise)?
        If you're on a unix/linux system, running a "find" command in a subshell is okay, while perl's readdir will almost always do pretty much just as well in terms of performance, and I've seen one or two cases where perl does better. The nice thing about readdir is that you don't need to worry about possible artifacts in file names that affect the text output from "find" (e.g. it's possible to have things like line-feeds and carriage-returns embedded in file names).
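If you do shell out to find, the newline problem mentioned above can be sidestepped by asking for NUL-terminated output. The following is only a sketch: `find_names` is a hypothetical helper name, and it assumes a find that supports `-maxdepth` and `-print0` (GNU and BSD find do; older POSIX-only versions may not).

```perl
use strict;
use warnings;

# Hypothetical helper: run find(1) on a single directory and return the
# matching paths. Assumes find supports -maxdepth and -print0.
sub find_names {
    my ($dir, $pattern) = @_;
    # List form of the pipe open: no shell is involved, so $pattern
    # reaches find exactly as given.
    open(my $fh, '-|', 'find', $dir, '-maxdepth', '1', '-type', 'f',
        '-iname', $pattern, '-print0')
        or die "Cannot run find: $!";
    local $/ = "\0";              # one record per NUL-terminated name
    chomp(my @names = <$fh>);     # chomp strips the trailing NUL
    close $fh;
    return @names;
}
```

Something like `print "$_\n" for find_names('.', '*.txt');` would then list the matches, with each returned path prefixed by the directory you passed in.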

        Whenever I've tried to benchmark File::Find against unix "find" and simple (recursive) readdir, File::Find took noticeably longer to finish on relatively large directory structures. If you aren't dealing with nested directories, you don't need recursion, and readdir is definitely the easiest/best way to go.

        BTW, the time needed to scan all the file names in a directory (or traverse a directory tree) is not affected by the quantity of data stored in the files; it's purely a matter of how many files per directory, and how many directories.

        (The one case where a unix "find" command did worse than perl's "readdir" was on a ridiculously large directory - like a million files, all with fairly long names. Apparently, "find" (on a BSD system) was trying to hold the file names in memory, and at a certain point, it had to start using swap space, causing a geometric (exponential?) slow-down. Meanwhile, the run time for a simple perl script with  while($f=readdir(DIR)){...} was linear with the number of files, regardless of directory size.)
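For a single directory, the readdir loop described above might look like the sketch below. `matching_files` is a hypothetical name, and the pattern is assumed to arrive as a compiled regex (for example built with `qr//` from a command-line argument).

```perl
use strict;
use warnings;

# Hypothetical helper: return the plain files in $dir whose names match $regex.
sub matching_files {
    my ($dir, $regex) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @found;
    # Read one entry at a time; defined() keeps a file named "0"
    # from ending the loop early.
    while (defined(my $name = readdir $dh)) {
        next unless $name =~ $regex;
        next unless -f "$dir/$name";   # skip directories, including . and ..
        push @found, $name;
    }
    closedir $dh;
    return @found;
}
```

A call such as `matching_files('.', qr/\.txt$/i)` would give a case-insensitive match similar to find's -iname '*.txt'. Because nothing is written out as text and parsed back, odd characters in file names need no special handling.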

Re: Finding files in one directory
by frozenwithjoy (Priest) on Dec 18, 2012 at 07:19 UTC
    LanX's suggestion will work well for what you need. If you ever do anything more complex where you need to find files in a jumble of directories and subdirectories (and do something with them), take a look at File::Find.
      Or File::Find::Rule for a much better interface to File::Find.

      CountZero

      "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      My blog: Imperial Deltronics
Re: Finding files in one directory
by bart (Canon) on Dec 18, 2012 at 11:58 UTC
    ls (or, in perl, glob) will work fine if you just need files from a single directory level. Note that you can even use glob '*/*.txt' so it doesn't even have to be all in the same subdirectory.
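A minimal sketch of the glob approach, assuming the directory name contains no whitespace or glob metacharacters (core glob splits its argument on whitespace). `glob_files` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a shell-style pattern inside $dir with glob
# and keep only plain files.
# Assumes $dir has no whitespace or glob metacharacters in it.
sub glob_files {
    my ($dir, $pattern) = @_;
    return grep { -f $_ } glob("$dir/$pattern");
}
```

`glob_files('.', '*.txt')` covers the current directory, and `glob_files('.', '*/*.txt')` reaches one level down. Unlike find's -iname, glob matching is case-sensitive on Unix-like systems, and `*` does not match names that start with a dot.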

    If you need files on any level under a root directory, you can indeed call out (on Linux and related) to find, or, in plain perl only, use the standard module File::Find. The latter is probably not as fast, but at least it's cross platform portable.

    p.s. Note that in File::Find, using a sub named "wanted" is a leftover from its perl-4 heritage. You can name your sub anything, or use an anonymous sub, like this:

    use File::Find;

    my @files;
    # collect every plain file under '.' whose name ends in .txt
    find sub { push @files, $File::Find::name if -f and /\.txt$/ }, '.';
    # now you have an array of file names:
    foreach (@files) { ... }
Re: Finding files in one directory
by vagabonding electron (Curate) on Dec 18, 2012 at 14:34 UTC
    I recently "discovered" File::Next for myself, seems to be good for your task too, especially with the option "file_filter" to which you can provide your pattern.
    Just my 1 cent.
Re: Finding files in one directory
by bimleshsharma (Beadle) on Dec 18, 2012 at 10:53 UTC

    The readdir function is a good way to go. Here is some sample code you can use and modify as per your requirements:

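    use strict;
    use warnings;

    # Search $pathofbasedir recursively; stop at the first path matching $searchfile.
    sub findfile {
        my ($searchfile, $pathofbasedir) = @_;
        opendir(my $dh, $pathofbasedir) or return;    # skip unreadable directories
        # drop the . and .. entries
        my @usefulfiles = grep { !/^\.{1,2}$/ } readdir $dh;
        closedir $dh;
        my @fileswithfullpath = map { "$pathofbasedir/$_" } @usefulfiles;
        foreach my $file (@fileswithfullpath) {
            if (-d $file) {
                findfile($searchfile, $file);
            }
            elsif ($file =~ /$searchfile/) {
                print "File exists at $file\n";
                exit 0;    # first match found, so we are done
            }
        }
    }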