PerlMonks  

Descending a directory tree, returning a list of files

by Rodster001 (Pilgrim)
on Jun 09, 2015 at 19:42 UTC ( [id://1129693]=perlquestion )

Rodster001 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I have a very simple script here, but it is not working correctly. Once it descends into a child directory, it does not return and continue listing the parent.

    ## cwd
    my $cwd = getcwd;
    ## Get files from cwd
    my $res = _list_files($cwd);
    sub _list_files {
        my $dir = shift;
        my $files = shift;
        opendir(DIR, $dir) || die "Error opening: $dir\n";
        while (my $fn = readdir(DIR)) {
            ## skip hidden
            next if $fn =~ /^\./;
            ## file, fullpath name
            my $file = $dir . "/" . $fn;
            ## add
            push(@{ $files }, $file);
            ## dir, descend
            if (-d $file) {
                ## sub-directory files
                $files = _list_files($file, $files);
            }
        }
        closedir(DIR);
        return $files;
    }

Could you please enlighten the blind one? Thanks!

Replies are listed 'Best First'.
Re: Descending a directory tree, returning a list of files
by toolic (Bishop) on Jun 09, 2015 at 20:16 UTC

    Tip #1 from the Basic debugging checklist: warnings

        readdir() attempted on invalid dirhandle DIR at
        closedir() attempted on invalid dirhandle DIR at

    Using lexical filehandles seems to work for me:

        use warnings;
        use strict;
        use Cwd;
        ## cwd
        my $cwd = getcwd;
        ## Get files from cwd
        my $res = _list_files($cwd);
        sub _list_files {
            my $dir = shift;
            my $files = shift;
            opendir(my $dh, $dir) || die "Error opening: $dir\n";
            while (my $fn = readdir($dh)) {
                ## skip hidden
                next if $fn =~ /^\./;
                ## file, fullpath name
                my $file = $dir . "/" . $fn;
                ## add
                push(@{ $files }, $file);
                ## dir, descend
                if (-d $file) {
                    ## sub-directory files
                    $files = _list_files($file, $files);
                }
            }
            closedir($dh);
            return $files;
        }

    See also:

      Using lexical filehandles seems to work for me

      Yes, for relatively "flat" directory structures. There is an OS-specific limit on the number of open file/directory handles, and if the directory structure is deep enough, you will run out of handles. That will happen even earlier if you also have a lot of files open.

      There is no need to keep the directory handle open while recursing into subdirectories. All you need is a single open handle at a time, even for arbitrarily nested directories:

          #!/usr/bin/perl
          use strict;
          use warnings;
          sub scan {
              my $dirname=shift;
              opendir my $d,$dirname or die "Can't open $dirname: $!";
              my @files=readdir $d;
              closedir $d;
              for my $n (@files) {
                  next if $n=~/^\.{1,2}$/;
                  if (-d "$dirname/$n") {
                      print "DIR: $dirname/$n\n";
                      scan("$dirname/$n");
                  } else {
                      print "NOTDIR: $dirname/$n\n";
                  }
              }
          }
          scan(".");

      There is not even any need to use recursion:

          #!/usr/bin/perl
          use strict;
          use warnings;
          sub scan {
              my $topdir=shift;
              my @todo=($topdir);
              while (@todo) {
                  my $dirname=shift @todo;
                  opendir my $d,$dirname or die "Can't open $dirname: $!";
                  while (my $n=readdir $d) {
                      next if $n=~/^\.{1,2}$/;
                      if (-d "$dirname/$n") {
                          print "DIR: $dirname/$n\n";
                          push @todo,"$dirname/$n";
                      } else {
                          print "NOTDIR: $dirname/$n\n";
                      }
                  }
                  closedir $d;
              }
          }
          scan(".");

      This way, you don't need a (possibly big) array of directory entry names for each recursion level, but only a list of directory names not yet scanned. And because this is not recursive, you won't get the "Deep recursion on subroutine" warning with deeply nested directories (see perldiag).

      Try changing shift @todo to pop @todo for a different scan order.
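
The two scripts above print what they find, while the original question asked for a returned list. As a rough sketch only, the same non-recursive idea could be adapted to return an array reference. The name `list_files`, the `$order` parameter, the symlink guard, and the temporary demo tree are all assumptions made for illustration and do not come from the thread.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Hypothetical helper, not from the thread: collect every non-hidden entry
# below $topdir without recursion. $order is 'queue' (shift) or 'stack' (pop).
sub list_files {
    my ($topdir, $order) = @_;
    $order //= 'queue';
    my @todo = ($topdir);    # directories not yet scanned
    my @found;
    while (@todo) {
        my $dirname = $order eq 'stack' ? pop @todo : shift @todo;
        opendir my $d, $dirname or die "Can't open $dirname: $!";
        while (defined(my $n = readdir $d)) {
            next if $n =~ /^\./;    # skips ".", ".." and hidden entries
            my $path = "$dirname/$n";
            push @found, $path;
            # Assumption: symlinked directories are listed but not followed.
            push @todo, $path if -d $path && !-l $path;
        }
        closedir $d;
    }
    return \@found;
}

# Demo on a throwaway tree so the sketch does not depend on the current directory.
my $demo = tempdir(CLEANUP => 1);
make_path("$demo/a/deep", "$demo/b");
for my $f ("$demo/top.txt", "$demo/a/deep/x.txt", "$demo/b/y.txt") {
    open my $fh, '>', $f or die "Can't create $f: $!";
    close $fh;
}
my $by_queue = list_files($demo, 'queue');
my $by_stack = list_files($demo, 'stack');
print "$_\n" for @{$by_queue};
```

Both orders should return the same set of paths. Only the sequence in which directories are visited differs, and the order of entries within one directory is whatever `readdir` happens to return.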

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re: Descending a directory tree, returning a list of files
by kennethk (Abbot) on Jun 09, 2015 at 20:18 UTC
    Since you've used a bareword directory handle, you clobber your existing handle every time you descend a level. Never use bareword handles for recursive calls. Problem solved if you use a lexical/indirect handle:
        use strict;
        use warnings;
        use Cwd 'getcwd';
        ## cwd
        my $cwd = getcwd;
        ## Get files from cwd
        my $res = _list_files($cwd);
        sub _list_files {
            my $dir = shift;
            my $files = shift;
            opendir(my $dh, $dir) || die "Error opening: $dir\n";
            while (my $fn = readdir($dh)) {
                ## skip hidden
                next if $fn =~ /^\./;
                ## file, fullpath name
                my $file = $dir . "/" . $fn;
                ## add
                push(@{ $files }, $file);
                ## dir, descend
                if (-d $file) {
                    ## sub-directory files
                    $files = _list_files($file, $files);
                }
            }
            #closedir($dh); # Not necessary; auto-closed on end of scope
            return $files;
        }
    See also How do I traverse a directory tree?

    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

      That was it. Bareword directory handle. For some reason I thought that was private to that subroutine. Thank you!
        Forgetting that bareword handles are global is a common issue, which is why (for me at least) Bareword Handles Considered Harmful.
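
To make the "bareword handles are global" point concrete, here is a small side-by-side sketch. It is only an illustration: the sub names `list_bareword` and `list_lexical` and the temporary two-directory tree are invented for the demo. The tree is built so that the result should not depend on the order in which `readdir` returns entries.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Throwaway tree (hypothetical names): two subdirectories, one file in each.
my $top = tempdir(CLEANUP => 1);
for my $sub (qw(sub1 sub2)) {
    make_path("$top/$sub");
    open my $fh, '>', "$top/$sub/file.txt" or die "Can't create file: $!";
    close $fh;
}

# Bareword version: DIR is one package-global handle shared by every call.
sub list_bareword {
    my ($dir, $files) = @_;
    opendir(DIR, $dir) or die "Error opening $dir: $!";
    while (my $fn = readdir(DIR)) {
        next if $fn =~ /^\./;
        my $path = "$dir/$fn";
        push @{$files}, $path;
        # The nested call reopens DIR on the child and then closes it,
        # so the caller's readdir(DIR) has nothing left to read from.
        list_bareword($path, $files) if -d $path;
    }
    closedir(DIR);
    return $files;
}

# Lexical version: every call gets its own private $dh.
sub list_lexical {
    my ($dir, $files) = @_;
    opendir(my $dh, $dir) or die "Error opening $dir: $!";
    while (my $fn = readdir($dh)) {
        next if $fn =~ /^\./;
        my $path = "$dir/$fn";
        push @{$files}, $path;
        list_lexical($path, $files) if -d $path;
    }
    closedir($dh);
    return $files;
}

# Collect warnings instead of printing them straight to STDERR.
my @warnings;
my $bare = do {
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    list_bareword($top, []);
};
my $lex = list_lexical($top, []);

printf "bareword: %d entries, lexical: %d entries\n", scalar @{$bare}, scalar @{$lex};
print "warning: $_" for @warnings;
```

The bareword version stops after the first subdirectory it meets, so it never reaches the second one, and the collected warnings are the same "invalid dirhandle DIR" messages quoted earlier in the thread.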

        #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

Re: Descending a directory tree, returning a list of files
by Discipulus (Canon) on Jun 09, 2015 at 20:20 UTC
    You open a DIR handle, but as soon as you encounter a folder you jump out of the loop and reopen the same handle on another directory, which is not logical. This was my first problem with Perl too (and we are in good company!), and it is also a FAQ. Since then I have always used tachyon's solution.

    Try adding some print statements to see how the flow of your program proceeds.

        my $cwd = getcwd;
        print "CWD:$cwd\n";
        ## Get files from cwd
        my $res = _list_files($cwd);
        sub _list_files {
            my $dir = shift;
            my $files = shift;
            print "GETTING:$dir\n";
            opendir(DIR, $dir) or die "Error opening: $dir\n";
            while (my $fn = readdir(DIR)) {
                ## skip hidden
                next if $fn =~ /^\./;
                ## file, fullpath name
                my $file = $dir . "/" . $fn;
                print "FILE:$file\n";
                ## add
                push(@{ $files }, $file);
                ## dir, descend
                if (-d $file) {
                    print "DIR:$file\n";
                    ## sub-directory files
                    $files = _list_files($file, $files);
                }
            }
            closedir(DIR);
            return $files;
        }

    HtH
    L*
    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Re: Descending a directory tree, returning a list of files
by RonW (Parson) on Jun 09, 2015 at 20:07 UTC

    Your sub, _list_files, is expecting 2 parameters. You are only passing one. Since _list_files treats the second parameter as a reference, you need to pass a reference:

        use strict;
        use warnings;
        use Cwd;
        my $cwd = getcwd;
        my @file_list;
        my $res = _list_files($cwd, \@file_list);
        # definition of _list_files() omitted
      That didn't solve the problem. If I comment out the line where it descends a sub-directory (and calls itself):
          ## dir, descend
          if (-d $file) {
              ## sub-directory files
              #$files = _list_files($file, $files);
          }

      It works. Well, it returns all the files in the cwd as expected. But when that line is not commented out, if it finds a subdirectory, it never returns to finish listing the files in the parent (the one we started in).
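
A likely reason the missing second argument was not the culprit is autovivification: dereferencing an undefined scalar as an array in a modifying context such as `push` creates a fresh array reference, even under `use strict`. The sketch below isolates just that behaviour. The sub name `collect` is a hypothetical stand-in for the accumulator logic in `_list_files`.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the accumulator part of _list_files.
sub collect {
    my ($item, $files) = @_;    # $files is undef when the caller omits it
    push @{ $files }, $item;    # autovivifies a new array ref if $files is undef
    return $files;              # the caller only sees the new ref via the return value
}

my $res = collect('a.txt');          # no second argument
$res    = collect('b.txt', $res);    # pass the accumulated list back in
print join(', ', @{$res}), "\n";     # prints "a.txt, b.txt"
```

Because the new reference exists only in the sub's own `$files`, returning it and reassigning it in the caller, as the original script does, is what keeps the list growing.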
Re: Descending a directory tree, returning a list of files
by TravelAddict (Acolyte) on Jun 09, 2015 at 20:47 UTC
    Hi, why not use File::Find::Rule to get the list of files? Here's how I get a list of XML files in all subdirectories, starting from the current location:

        use File::Find::Rule;
        my $xml_finder = File::Find::Rule->new()->name(qr/(.*?)\.xml$/i)->start(".");
        while (my $file = $xml_finder->match() ) {
            # Do whatever you want with your file name
        }

    I hope this helps!

      Definitely agree ... “file finders” (“directory tree walkers”) are the way to go, and there are quite a few good ones in CPAN.   There’s just no point in doing such a mundane chore “by hand.”
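
For readers who prefer not to install anything, the core module File::Find, which ships with Perl, can do the same job. The following is a minimal sketch rather than a recommendation from the thread: the helper name `list_files_find`, the rule for skipping hidden entries (mirroring the original script), and the temporary demo tree are assumptions made for illustration.

```perl
use strict;
use warnings;
use File::Find;
use File::Basename qw(basename);
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Hypothetical helper: return an array ref of every non-hidden entry below $top.
sub list_files_find {
    my ($top) = @_;
    my @found;
    find({
        no_chdir => 1,    # keep the working directory unchanged while walking
        wanted   => sub {
            my $path = $File::Find::name;
            return if $path eq $top;          # skip the starting directory itself
            if (basename($path) =~ /^\./) {
                $File::Find::prune = 1;       # do not descend into hidden directories
                return;
            }
            push @found, $path;
        },
    }, $top);
    return \@found;
}

# Demo on a throwaway tree with one hidden directory.
my $tree = tempdir(CLEANUP => 1);
make_path("$tree/a", "$tree/.hidden");
for my $f ("$tree/c.txt", "$tree/a/b.txt", "$tree/.hidden/secret.txt") {
    open my $fh, '>', $f or die "Can't create $f: $!";
    close $fh;
}
my $listed = list_files_find($tree);
print "$_\n" for sort @{$listed};
```

The walk itself is handled by the module, so there is no directory handle in user code to clobber.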

        There’s just no point in doing such a mundane chore “by hand.”

        Unless you need to do it "by hand" because the modules on CPAN don't do what you need to do. Perhaps I missed it, but I didn't find any of the "file finders" modules (as you called them) that would handle long file paths (>260 characters) in Windows. Looking at possibly using Win32::LongPath to roll my own code to search through a Windows filesystem and handle the long path names.

        Until I hit this long path issue, I probably would have agreed with what you said.

Node Type: perlquestion [id://1129693]
Approved by toolic