PerlMonks  

To traverse a directory tree and do stuff with some or all of the data files therein, this method is fast, uses very little memory, and makes a relatively easy framework for handling lots of jobs of this ilk. It relies on the standard Unix "find" utility (which has been ported for MS Windows users, of course) to list the directories, and leaves the per-directory work to Perl.
    # assume you have a $toppath, which is where the traversal starts

    chdir $toppath or die "can't cd to $toppath: $!";
    open( FIND, "find . -type d -print0 |" ) or die "can't run find: $!";

    # find will traverse downward from current directory
    # (ie. $toppath), and because of the "-type d" option,
    # will only list the paths of directories contained here;
    # the "-print0" (thanks, etcshadow) sets a null byte as the
    # string terminator for each file name (don't rely on "\n",
    # which could be part of a file name).

    {
        local $/ = "\x0";  # added thanks to etcshadow's reply
        while ( my $dir = <FIND> )
        {
            chomp $dir;
            unless ( opendir( DIR, $dir )) {
                warn "$toppath/$dir: opendir failed: $!\n";
                next;
            }
            while ( my $file = readdir( DIR ))
            {
                next if ( -d "$dir/$file" ); # outer while loop will handle all dirs

                # do what needs to be done with data files
                # (note: $/ is still "\x0" in here; use local $/ = "\n"
                # before reading a data file line by line)
            }
            closedir DIR;

            # anything else we need to do regarding this directory
        }
    }
    close FIND;

Comments:

The nice thing about this approach is that the "find" utility is very good at recursive descent into subdirectories, and that's all it needs to do. Meanwhile, Perl is very good at reading directory contents and manipulating data files, and that is really easy when you're working with the data files of just one directory at a time. Here, Perl can simply skip over any subdirectories it sees, because the output from "find" will bring those up for treatment in due course.
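
For anyone who wants to reuse that division of labor, here is one way it might be wrapped up as a subroutine. This is only a sketch, not the code from the post: the name walk_data_files and the callback argument are made up for illustration, it uses lexical handles and the list form of a piped open (so no shell is involved), and it passes $toppath to "find" instead of doing a chdir. It assumes a "find" that supports -print0.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): calls
# $callback->( $dir, $file ) for every non-directory entry under $toppath.
sub walk_data_files {
    my ( $toppath, $callback ) = @_;

    # List-form piped open: runs find directly, with no shell in between.
    open( my $find, '-|', 'find', $toppath, '-type', 'd', '-print0' )
        or die "can't run find: $!";

    local $/ = "\0";    # -print0 ends each directory path with a null byte
    while ( my $dir = <$find> ) {
        chomp $dir;
        my $dh;
        unless ( opendir( $dh, $dir ) ) {
            warn "$dir: opendir failed: $!\n";
            next;
        }
        while ( defined( my $file = readdir $dh ) ) {
            next if -d "$dir/$file";    # find will list subdirectories itself
            local $/ = "\n";            # give the callback normal line reads
            $callback->( $dir, $file );
        }
        closedir $dh;
    }
    close $find or warn "find did not exit cleanly: $?\n";
    return;
}
```

A call might look like walk_data_files( $toppath, sub { my ( $dir, $file ) = @_; print "$dir/$file\n" } ). Only one directory handle is open at any time, which is what keeps the memory use small.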

(update: made minor adjustments to comments in the code, added "closedir"; also wanted to point out that the loop over files could be narrowed down by filtering the list with "grep ... readdir(DIR)", etc.)
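
The "grep ... readdir" idea from the update could look something like the following. Again just a sketch: matching_files is a made-up name, and the choice to keep only plain files whose names match a pattern is an assumption about what kind of filtering is wanted.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the sorted names of plain files in $dir
# whose names match $pattern (a qr// regex).
sub matching_files {
    my ( $dir, $pattern ) = @_;
    opendir( my $dh, $dir ) or die "$dir: opendir failed: $!";

    # In list context readdir returns every entry at once; grep keeps
    # only the plain files whose names match.
    my @files = sort grep { /$pattern/ && -f "$dir/$_" } readdir($dh);

    closedir $dh;
    return @files;
}
```

Inside the per-directory loop this would replace the inner while with something like "for my $file ( matching_files( $dir, qr/\.txt$/ ) ) { ... }". The trade-off is that the whole listing for one directory is held in memory at once, which only matters for very large directories.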


In reply to An alternative to File::Find by graff
