http://qs321.pair.com?node_id=427891


in reply to Behavior of File::Find's preprocess and glob

It appears that find_function($dirname), i.e. the "wanted" callback, is called for the directory itself before (or after, depending on your options) it is called for that directory's contents, so you need to filter directories out in find_function.

If you filter out directory names in the "preprocess" function, you'll prevent File::Find from recursing into those directories.

Update: d'oh! mixed up the function names..

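#!/usr/bin/perl -w
use strict;
use Cwd;
use File::Find;

my $filespec = qr/\.(?:txt|pl)$/;
my $dir = $ARGV[0] || getcwd();

find( { wanted => \&find_function, preprocess => \&globber }, $dir );

# "wanted" callback: print every non-directory entry that globber let through.
sub find_function {
    # File::Find has chdir'ed here, so test $_ rather than $File::Find::name.
    return if -d $_;
    print $File::Find::name . $/;
}

# "preprocess" callback: keep matching files, plus directories so that
# File::Find can still recurse into them.
sub globber {
    my @files;
    foreach (@_) {
        push( @files, $_ ) if ( /$filespec/ || ( -d $_ ) );
    }
    return @files;
}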

cheers,

J

Re^2: Behavior of File::Find's preprocess and glob
by hsinclai (Deacon) on Feb 04, 2005 at 02:13 UTC
    If you do filter our directory names in the "wanted" function then you'll prevent File::Find from recursing into those directories.
    Did you mean filter out the directory names in "preprocess" (as opposed to the "wanted" function)? That is why the recursion stopped! I see..

    .. it still doesn't answer the question as to why the "preprocess" glob of *.pl *.txt in my earlier snippet returns the directory name.. but thanks, I do see now how I was blocking File::Find from recursing by removing directories from "preprocess"'s return list ..

    Although your reworked code works perfectly, the same result can be had by just doing the directory and file-extension pattern matching in "wanted", without bothering with a "preprocess" call ... So, for example, what if there were several hundred subdirectories which I knew did not contain *.pl or *.txt? I was trying to find a way for "wanted" to skip the needless processing/recursion..
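A minimal sketch of the "wanted"-only approach described above, under some assumptions: the temporary directory tree, the file names, and the %dirskip hash of directory basenames are all made up for illustration. It uses the documented $File::Find::prune variable, which tells File::Find not to descend into the directory currently being visited (it has no effect when the bydepth option is in use).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Throwaway tree so the sketch runs anywhere; every name here is hypothetical.
my $root = tempdir( CLEANUP => 1 );
make_path( "$root/keep", "$root/skipme/deep" );
for my $file ( "keep/a.txt", "keep/b.pl", "keep/c.log", "skipme/deep/d.txt" ) {
    open my $fh, '>', "$root/$file" or die "cannot create $file: $!";
    close $fh;
}

my $filespec = qr/\.(?:txt|pl)$/;
my %dirskip  = ( skipme => 1 );    # directory basenames we do not want to enter

my @found;
find(
    sub {
        if ( -d $_ ) {
            # Pruning here stops File::Find from descending into this directory.
            $File::Find::prune = 1 if $dirskip{$_};
            return;    # never report directories themselves
        }
        push @found, $File::Find::name if /$filespec/;
    },
    $root
);

print "$_\n" for sort @found;
```

With this layout only the two matching files under keep/ are collected, and nothing below skipme/ is visited at all, so no "preprocess" callback is needed.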

      .. it still doesn't answer the question as to why the "preprocess" glob of *.pl *.txt in my earlier snippet returns the directory name

      Actually it doesn't. Your "wanted" function prints it when it is called with the directory name before "preprocess" is called with the directory contents. The first thing I did with your code was to add 'print "GLOBBING\n";' to the preprocess function. The directory name is printed before GLOBBING.
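The call order described above can be checked with a small sketch like the following. It is only an illustration: the temporary tree and the @events array are assumptions added here, not part of the original snippet. Each callback records when it fires, so the order can be read straight from the output.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Build a small throwaway tree: $root/sub/a.txt (names are illustrative).
my $root = tempdir( CLEANUP => 1 );
make_path("$root/sub");
open my $fh, '>', "$root/sub/a.txt" or die "cannot create file: $!";
close $fh;

my @events;    # records the order in which the callbacks fire

find(
    {
        # "wanted" sees the directory itself first ($_ is "." for the top level).
        wanted => sub { push @events, "wanted: $_" },

        # "preprocess" then receives that directory's entries.
        preprocess => sub {
            push @events, "preprocess: $File::Find::dir";
            return @_;    # pass every entry through unchanged
        },
    },
    $root
);

print "$_\n" for @events;
```

In a default (non-bydepth) run, the "wanted: ." line appears before the first "preprocess:" line, which is why a print in "wanted" shows the directory name before anything printed from "preprocess".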

      Sounds like you want to change the $filespec regex to only exclude directories you don't wish to traverse. Then filter out the rest of the directories in the "wanted" function. You could create a list of directories as keys to a hash and have "preprocess" skip any directories in the list.

      Update: Added code

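      #!/usr/bin/perl -w
      use strict;
      use Cwd;
      use File::Find;

      my $filespec = qr/\.(?:txt|pl)$/;
      my %dirskip = (
          'path/to/dir'         => 1,
          'path/to/another/dir' => 1
      );
      my $dir = $ARGV[0] || getcwd();

      find( { wanted => \&find_function, preprocess => \&globber }, $dir );

      # "wanted" callback: print only non-directory entries matching $filespec.
      sub find_function {
          # File::Find has chdir'ed here, so test $_ rather than the full name.
          return if $File::Find::name !~ /$filespec/ || ( -d $_ );
          print $File::Find::name . $/;
      }

      # "preprocess" callback: $File::Find::dir is the directory whose entries
      # are in @_; drop them all if that directory is listed in %dirskip.
      sub globber {
          my @files;
          foreach (@_) {
              push( @files, $_ ) unless $dirskip{$File::Find::dir};
          }
          return @files;
      }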

      cheers,

      J

        Thank you- that is cool! At one point I had a print statement going in the sub - but totally missed it somehow.. thanks for the clarification!!

        I'm throwing "preprocess" out the window now..