
File::Find help

by colox (Sexton)
on Dec 02, 2017 at 09:11 UTC

colox has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks, I have a routine that recursively searches a directory for files and moves each file to another folder if it satisfies a condition. Since I need a recursive search, I thought of using File::Find (with $File::Find::name). It works for getting the list of files and for moving them to the other folder. However, my problem now is that the subfolders of the source directory stay behind (empty) and are not deleted. Is there a way to auto-delete them using the same module? If not, any suggestions on how to do it in a "nice way"?

sub Get_Allist{
    @files=();
    my $inputdir = $InP;
    find(sub {push @files,$File::Find::name if (-f $File::Find::name and /\.*$/); }, $inputdir);
}

sub Move_Files{
    foreach $srcfile (@files){ #Process each file in input folder
        move("$srcfile", "$OutP\\.");
    }
}

Replies are listed 'Best First'.
Re: File::Find help
by Corion (Pope) on Dec 02, 2017 at 09:48 UTC

    In your code, you are checking whether you have a file (-f). Maybe you want to do something there if you have a directory (-d)?

    Note that you will likely get the directory before all the files in it, so you might need to remember that you want to rmdir (not unlink) a directory after you've successfully moved all the files.

    Update: Laurent_R noted the difference between unlink and rmdir.

      Hi, Thank you for the reply. My dilemma is that the file (-f) could be way under multiple subdirectories. How can I effectively capture those so I can pass that to a variable to later do the unlink?

        Whenever you find a directory, remember it to be deleted afterwards?

        my @to_be_deleted;
        find( sub {
            if( -d $File::Find::name ) {
                print "'$File::Find::name' should be deleted later\n";
                # push @to_be_deleted, $File::Find::name;
            };
        }, ... );
        # rmdir takes one directory at a time; go deepest first
        rmdir $_ for reverse @to_be_deleted; # boom
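        The "remember it, delete it afterwards" idea can be sketched end to end with core modules only. This is a minimal sketch, not the poster's code: the temporary tree, the file names, and the variables $in and $out are hypothetical stand-ins, and no_chdir is used so that the file tests see full paths.

```perl
use strict;
use warnings;
use File::Find     qw(find);
use File::Copy     qw(move);
use File::Basename qw(basename);
use File::Path     qw(make_path);
use File::Temp     qw(tempdir);

# Hypothetical demo tree; in real use $in and $out would be your own paths.
my $tmp = tempdir( CLEANUP => 1 );
my ( $in, $out ) = ( "$tmp/in", "$tmp/out" );
make_path( "$in/a/b", "$in/c", $out );
for my $f ( "$in/a/one.txt", "$in/a/b/two.txt", "$in/c/three.txt" ) {
    open my $fh, '>', $f or die "Cannot create $f: $!";
    close $fh;
}

my ( @files, @dirs );
find(
    {
        no_chdir => 1,    # $_ is the full path, so -f and -d work from anywhere
        wanted   => sub {
            if    ( -f $_ )                { push @files, $_ }
            elsif ( -d $_ and $_ ne $in )  { push @dirs,  $_ }   # never queue the top directory
        },
    },
    $in
);

# Move first, delete second.
for my $file (@files) {
    move( $file, "$out/" . basename($file) ) or warn "Could not move $file: $!";
}

# find() reports a directory before its contents, so walking the list
# in reverse removes the deepest directories first.
for my $dir ( reverse @dirs ) {
    rmdir $dir or warn "Kept $dir: $!";
}
```

        Because rmdir only removes empty directories, any directory that still holds an unmoved file is left in place with a warning rather than deleted.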
Re: File::Find help
by 1nickt (Abbot) on Dec 02, 2017 at 14:11 UTC

    Hi, You could use the more powerful, more user-friendly Path::Iterator::Rule for iterating through directory trees and Path::Tiny for file handling. Note that you can pass an anonymous subroutine as the value of the visitor option to the iterator call, to handle the matching files as they are found.

    In the following demonstration we want to move into B/ any files with .txt extension and no contents, found in any subdirectory of A/, then delete the subdirectory if empty. (You can of course use more meaningful conditions, e.g. test for file size, or age, or even matching contents. See the doc. Also see SEE ALSO in the doc for a discussion of the various tools that can be used for this task.)

    use strict;
    use warnings;
    use feature 'say';
    use Path::Iterator::Rule;
    use Path::Tiny;

    my $root = './1204707';
    my $in = $root . '/A';
    my $to = $root . '/B';

    my $rule = Path::Iterator::Rule->new;
    $rule->file->name( qr/txt$/ );
    $rule->file->empty;

    my $next = $rule->iter( $in, {
        depthfirst => 1,
        visitor => sub {
            my $path = path( shift );
            my $parent = $path->parent;
            $path->move( $to . '/' . $path->basename );
            rmdir $parent if not $parent->children;
        },
    });

    while ( defined( my $file = $next->() ) ) {
        say "processing $file";
    }
    __END__

    First set up the test files. We expect after running the program that there will be five empty text files in B/, and that A/bb (and subdir) will have been removed.

    $ ls -goR 1204707
               # File size here
               #    |
               #    V
    1204707:
    total 8
    drwxrwxr-x 5 4096 Dec 2 08:56 A
    drwxrwxr-x 2 4096 Dec 2 08:56 B

    1204707/A:
    total 12
    drwxrwxr-x 2 4096 Dec 2 08:56 aa
    drwxrwxr-x 3 4096 Dec 2 08:56 bb
    drwxrwxr-x 2 4096 Dec 2 08:56 cc

    1204707/A/aa:
    total 20
    -rw-rw-r-- 1 8 Dec 2 08:56 file1.txt
    -rw-rw-r-- 1 0 Dec 2 08:56 file2.txt

    1204707/A/bb:
    total 20
    drwxrwxr-x 2 4096 Dec 2 08:56 aaa
    -rw-rw-r-- 1 0 Dec 2 08:56 file3.txt
    -rw-rw-r-- 1 0 Dec 2 08:56 file4.txt

    1204707/A/bb/aaa:
    total 8
    -rw-rw-r-- 1 0 Dec 2 08:56 file7.txt

    1204707/A/cc:
    total 20
    -rw-rw-r-- 1 0 Dec 2 08:56 file5.txt
    -rw-rw-r-- 1 8 Dec 2 08:56 file6.dat

    1204707/B:
    total 0

    Run the program:

    $ perl 1204707.pl
    processing ./1204707/A/aa/file2.txt
    processing ./1204707/A/bb/aaa/file7.txt
    processing ./1204707/A/bb/file3.txt
    processing ./1204707/A/bb/file4.txt
    processing ./1204707/A/cc/file5.txt

    Check the directory tree after running the program:

    $ ls -goR 1204707
    1204707:
    total 8
    drwxrwxr-x 4 4096 Dec 2 08:57 A
    drwxrwxr-x 2 4096 Dec 2 08:57 B

    1204707/A:
    total 8
    drwxrwxr-x 2 4096 Dec 2 08:57 aa
    drwxrwxr-x 2 4096 Dec 2 08:57 cc

    1204707/A/aa:
    total 12
    -rw-rw-r-- 1 8 Dec 2 08:56 file1.txt

    1204707/A/cc:
    total 12
    -rw-rw-r-- 1 8 Dec 2 08:56 file6.dat

    1204707/B:
    total 40
    -rw-rw-r-- 1 0 Dec 2 08:56 file2.txt
    -rw-rw-r-- 1 0 Dec 2 08:56 file3.txt
    -rw-rw-r-- 1 0 Dec 2 08:56 file4.txt
    -rw-rw-r-- 1 0 Dec 2 08:56 file5.txt
    -rw-rw-r-- 1 0 Dec 2 08:56 file7.txt

    Hope this helps!


    The way forward always starts with a minimal test.

      Hi, that is some cool stuff, thanks! I quickly tried it, but some of the behavior is not what I wanted:
      1. The input directory is also deleted; I only wanted to remove the subdirectories.
      2. The input directory is also created inside the output directory; I just need to create the subdirs and the files in them.
      Any other recommendation is much appreciated.

        Hi, You just need to get your rules right and your visitor sub right. Read the Path::Iterator::Rule doc for the rules, and post your code here including the rules and your handler sub and we'll get it straight!


Re: File::Find help
by tybalt89 (Prior) on Dec 02, 2017 at 15:02 UTC
    use File::Find;
    finddepth sub { -d $_ and rmdir $_ }, "yourtopoftree";

    should work to remove empty directories.
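    To see why this is safe, here is a small self-contained sketch of the same finddepth idea. The temporary tree and the names $top, empty, nested, full and keep.dat are made up for the demonstration, and the explicit check that skips the top directory is an addition, not part of the one-liner above.

```perl
use strict;
use warnings;
use File::Find qw(finddepth);
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical stand-in for "yourtopoftree".
my $top = tempdir( CLEANUP => 1 );
make_path( "$top/empty/nested", "$top/full" );
open my $fh, '>', "$top/full/keep.dat" or die "Cannot create file: $!";
close $fh;

finddepth(
    {
        no_chdir => 1,    # $_ holds the full path of each entry
        wanted   => sub {
            return if $_ eq $top;    # leave the top directory itself alone
            rmdir $_ if -d $_;       # fails harmlessly when the directory is not empty
        },
    },
    $top
);

print "$_: ", ( -d "$top/$_" ? "kept" : "removed" ), "\n"
    for qw(empty empty/nested full);
```

    finddepth visits a directory's contents before the directory itself, so nested is already gone by the time empty is tried, while full survives because rmdir refuses to remove a directory that still has a file in it.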

Re: File::Find help
by Anonymous Monk on Dec 02, 2017 at 10:03 UTC
Re: File::Find help
by Anonymous Monk on Dec 02, 2017 at 15:16 UTC
    Your logic appears to me to be sound ... so far. After moving the files, simply loop back through the same list a second time and try to remove each directory in turn. (If the directory is not empty, the operation should fail, and you can at that point quit trying and move on to the next name.)
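    One way to read that suggestion, as a rough sketch with core modules: take the list of files that were moved, and for each one try to rmdir its former directory, climbing towards the input directory until an rmdir fails. The temporary tree, the file names, and the variables $in, $out and @moved are assumptions for illustration only.

```perl
use strict;
use warnings;
use File::Basename qw(basename dirname);
use File::Copy     qw(move);
use File::Path     qw(make_path);
use File::Temp     qw(tempdir);

# Hypothetical demo tree.
my $tmp = tempdir( CLEANUP => 1 );
my ( $in, $out ) = ( "$tmp/in", "$tmp/out" );
make_path( "$in/x/y", "$in/z", $out );
my @files = ( "$in/x/y/deep.txt", "$in/z/gone.txt", "$in/z/stay.dat" );
for my $f (@files) {
    open my $fh, '>', $f or die "Cannot create $f: $!";
    close $fh;
}

# First pass: move the files that satisfy the condition (here: *.txt).
my @moved = grep { /\.txt$/ } @files;
for my $file (@moved) {
    move( $file, "$out/" . basename($file) ) or die "Could not move $file: $!";
}

# Second pass over the same list: try to remove each file's old directory,
# then its parent, and so on, but never the input directory itself.
for my $file (@moved) {
    my $dir = dirname($file);
    while ( $dir ne $in ) {
        rmdir $dir or last;    # not empty or already removed: move on to the next name
        $dir = dirname($dir);
    }
}
```

    In this demo x/y and x end up removed, while z stays because stay.dat was never moved out of it.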
