http://qs321.pair.com?node_id=689918

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I have a train wreck of some code. Basically, what this patchwork of code is supposed to do is search through a directory, each file, each folder, looking for those with size == 0, and output the file or folder path name. Surprise, it's not working. I'm thinking it's because it's just "slurping" the entire directory at once and not looping through each individual element of the directory. If there are any merciful monks who could give some advice, I would appreciate it.
$source = $x;
$File::Find::dont_use_nlink = 1;
File::Find::find( sub {
    if (-f $File::Find::name) {
        find(sub { $size1 += -s if -f $_ }, $File::Find::name);
        if ($size1 == 0) { print $File::Find::name; }
    }
    elsif (-d $File::Find::name) {
        find(sub { $size2 += -s if -d $_ }, $File::Find::name);
        if ($size2 == 0) { print $File::Find::name; }
    };
}, $source );

Re: function help
by moritz (Cardinal) on Jun 03, 2008 at 15:57 UTC
    Why do you call find() from within the callback? find() will find all the files for you, just have the patience to wait until they are delivered to your callback ;-)
Re: function help
by starbolin (Hermit) on Jun 03, 2008 at 16:34 UTC

    moritz gives good advice, but I will word it more strongly: don't call find from inside find! There is no need, and you run the risk of clobbering your variables (though this is fixed in newer versions of File::Find).

    This:  $size1 += -s is wrong. Since $size1 was never declared with my, it is a package global rather than a variable local to the subroutine. Nothing ever resets it, so it keeps accumulating across every call to your callback, and once it has seen one non-empty file it will never equal 0 again. There's no real need to set any variable; just do:  print $File::Find::name if not -s; inside your callback.
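    As a minimal sketch of that single-pass approach (not tested against your tree): one callback, no nested find, no shared counters. The helper name empty_files is made up for illustration, and it assumes you only care about plain files here.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper (not from the original post): returns the paths of
# all zero-byte plain files found anywhere under $top.
sub empty_files {
    my ($top) = @_;
    my @empty;
    find(sub {
        # $_ is the current entry's name; by default find() has already
        # chdir'ed into the directory that contains it.
        # -z is true for an existing entry of size zero; the bare _ reuses
        # the stat data that -f just fetched.
        push @empty, $File::Find::name if -f $_ && -z _;
    }, $top);
    return @empty;
}
```

    You would call it as something like: print "$_\n" for empty_files('.'); Because @empty is a lexical created fresh on every call, nothing is left over from one run to the next.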


Re: function help
by pc88mxer (Vicar) on Jun 03, 2008 at 16:51 UTC
    As others have said, don't call find within find - it's not reentrant.

    Your first internal use of find can definitely be eliminated (since you're just calling find on a file) and replaced with the following simpler code:

    if (-f $_) {
        if (-s _ == 0) {
            print $File::Find::name, "\n";
        }
    }

    For directories it appears you want to visit the directory twice - once when it is encountered and later when File::Find is done with all of its entries. To do this, look up the documentation on the postprocess option. It specifically says it is useful for summarizing a directory, such as calculating its disk usage.

    Not tested, but should illustrate the idea:

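    use File::Find;

    my %usage;   # bytes held by the plain files directly inside each directory

    sub wanted {
        if (-f $_) {
            my $size = -s _;
            $usage{$File::Find::dir} += $size;
            if ($size == 0) {
                print $File::Find::name, "\n";
            }
        }
    }

    # Called once per directory, after all of its entries have been visited.
    sub postprocess {
        print "Usage of $File::Find::dir is $usage{$File::Find::dir}\n";
    }

    find({wanted => \&wanted, postprocess => \&postprocess}, ...);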
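    If the goal is to list the directories whose contents add up to nothing, the same postprocess idea can be extended so that each directory's total is rolled up into its parent. This is only a sketch under assumptions: the helper name empty_dirs is made up, and "size of a directory" is taken to mean the sum of the sizes of all plain files beneath it, which may not be exactly what the original poster meant.

```perl
use strict;
use warnings;
use File::Find;
use File::Basename qw(dirname);

# Hypothetical helper (not from the thread): returns every directory under
# $top whose plain files, counted recursively, add up to zero bytes.
sub empty_dirs {
    my ($top) = @_;
    my (%total, @empty);
    find({
        wanted => sub {
            # Credit each plain file's size to the directory containing it.
            $total{$File::Find::dir} += ((-s _) || 0) if -f $_;
        },
        postprocess => sub {
            # Runs once per directory, just before find() leaves it, so all
            # of its subdirectories have already been rolled up.
            my $dir  = $File::Find::dir;
            my $size = $total{$dir} // 0;
            push @empty, $dir if $size == 0;
            # Pass the finished total up to the parent directory.
            $total{ dirname($dir) } += $size unless $dir eq $top;
        },
    }, $top);
    return @empty;
}
```

    Pass $top without a trailing slash so the comparison against $File::Find::dir holds for the top-level directory. A directory that holds only zero-byte files is reported as empty, the same as one that holds nothing at all.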