PerlMonks  

File::Find exit

by sweetblood (Prior)
on Oct 29, 2003 at 15:55 UTC ( [id://303024]=perlquestion )

sweetblood has asked for the wisdom of the Perl Monks concerning the following question:

I hesitate to ask this because it seems like it must be rudimentary; however, I have read the perldoc and the Camel book and searched here, and I still have not found the answer. If I've missed it in my searching, please be gentle. I need a way to exit out of a File::Find traversal. Something like this (which does not do what I want):

    sub foo {
        last if (/^d/i);
        push @nod, $_
    }
    find (\&foo, "/somedir");

The above code pushes all filenames in the /somedir directory except the files beginning with 'd|D'. What I need is just the files that come before the files beginning with 'd|D'.

TIA

Update:

I tried to give an example that simplified what I need, I guess I simplified too much. I'll try to elaborate.

The process I'm creating needs to iterate through a given directory and any subdirectories recursively. As it goes through these files it needs to open each regular file it finds and evaluate the contents of the file to determine the file type (ASCII data, binary data, XML, HTML, empty, etc.). Based on several specific rules, I need to abandon the find and reset, then move to the next iteration of the outer loop (not seen in the above example). The more I think about this, and the more I read from this thread, the more I believe that File::Find may not be the best mechanism for recursively iterating through the files (in this case). But I'm not sure how to do it any other way without risking the pitfalls of recursion.
If anyone has suggestions I'd be grateful.

Thanks Again!

Replies are listed 'Best First'.
Re: File::Find exit
by broquaint (Abbot) on Oct 29, 2003 at 16:01 UTC
    The simplest option will be to use the preprocess argument, e.g.:
    find({
        wanted     => \&foo,
        preprocess => sub { grep !/^d/i, @_ },
    }, '/somedir');
    See the File::Find docs for more info.

    Update: as tye has pointed out, the above doesn't really answer your question. Since File::Find wasn't designed to allow stopping mid-process, File::Find::Rule is probably a better option e.g

    use File::Find::Rule;
    my $seen = 1;
    my @files = find(
        file => exec => sub { $seen = 0 if /^d/i; $seen },
        in   => '/somedir'
    );
    Or, if you really want to stop iterating once you've seen the first file beginning with a 'd':
    my @files;
    my $rule = rule(file => start => '/somedir');
    while(my $file = $rule->match) {
        last if $file =~ /^d/i;
        push @files => $file;
    }
    HTH

    _________
    broquaint

      Personally, I think he needs to explain more about what he meant by "before" in his last paragraph. Without a clear understanding of that, there is no way we can give a correct answer. One of the answers might happen to be right, but that's not the scientific way.

      For example, he might mean all the files that start with [a-cA-C]. But this fails to consider characters other than letters.

      He might mean sorting all filenames in dictionary order and grabbing only those that come before d|D. If this is the case, one has to be careful, because all uppercase characters come before lowercase 'd' when ordered by ASCII ord(). In that case, the code below might be what he wants:

      use File::Find;
      use Data::Dumper;
      use strict;
      use warnings;

      my @nod;
      print ord('d'), "\n";
      print ord('D'), "\n";
      find({ wanted => \&foo }, '.');
      print Dumper(\@nod);

      sub foo {
          push @nod, $_ if (lc($_) lt "d")
      }
Re: File::Find exit
by tachyon (Chancellor) on Oct 29, 2003 at 19:24 UTC

    This code does a breadth-first (width-first) traversal without recursion and returns some results you may find easier to work with. All it does is iterate over a list of dirs. Starting at the root dir, it just keeps pushing the subdirs it finds onto the end of this list, hence the breadth-first traversal.

    # recurses the directory tree with the root as the arg
    # returns paths relative to root -> not absolute paths
    # does not return root dir itself in @dirs list
    # returns references \@dirs and \@files and \%tree
    sub recurse_tree {
        my $root = shift;
        my @dirs = ( $root );
        my @files;
        my %tree;
        for my $dir ( @dirs ) {
            opendir DIR, $dir or die("Can't open $dir\n"); # could just next
            (my $rel_dir = $dir) =~ s!^\Q$root\E/?!!;
            while ( my $file = readdir DIR ) {
                next if $file eq '.' or $file eq '..';
                next if $file =~ m/^_/;   # skip _ prefix files and dirs
                next if -l "$dir/$file";  # don't follow sym links
                push @dirs, "$dir/$file" if -d "$dir/$file";
                push @files, "$dir/$file" if -f "$dir/$file";
                push @{$tree{$rel_dir}}, $file if -f "$dir/$file" and $file =~ m/\.html?$/;
            }
            closedir DIR;
        }
        # make paths relative to $root, comment out for full path
        @dirs = grep { $_ and ! m!^/$! } map{ s!^\Q$root\E/?!!; $_ } @dirs;
        @files = map{ s!^\Q$root\E/?!!; $_ } @files;
        return \@dirs, \@files, \%tree;
    }

    cheers

    tachyon
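
The same queue idea can be bent toward the original question (stop the whole walk early). What follows is only a sketch under assumptions: walk_until and its callback convention are hypothetical names invented here, not part of tachyon's code or of any module. Each directory's entries are sorted so that "before" has a defined meaning, and the walk ends the moment the callback returns false.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): walk $root breadth-first and
# call $visit->($path) for every regular file. The walk ends as soon as
# $visit returns a false value.
# Returns 1 if the whole tree was walked, 0 if it was stopped early.
sub walk_until {
    my ($root, $visit) = @_;
    my @queue = ($root);
    while (defined(my $dir = shift @queue)) {
        opendir(my $dh, $dir) or next;      # skip unreadable directories
        my @names = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
        closedir $dh;
        # Sort so the visiting order within a directory is predictable.
        for my $name (sort @names) {
            my $path = "$dir/$name";
            next if -l $path;               # don't follow symlinks
            if (-d $path) {
                push @queue, $path;         # visit subdirectories later
            }
            elsif (-f $path) {
                return 0 unless $visit->($path);
            }
        }
    }
    return 1;
}
```

Because the stop is an ordinary return from a loop, an outer loop can call walk_until again with a different root or callback without any cleanup. The callback is also a natural place to open the file and inspect its contents.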

Re: File::Find exit
by Anonymous Monk on Oct 29, 2003 at 17:07 UTC

    What is this "before" that you speak of? The order of entries in a directory mayn't be the order that you think. It's best not to rely on a particular ordering, and to impose your own instead.

    That said, it would be nice if there were a mechanism to immediately stop the recursion of File::Find::find(). You can achieve that the poor man's way though:

    my $stop_the_insanity = 0;
    sub wanted {
        return if $stop_the_insanity;
        ...
        # some condition that sets $stop_the_insanity
        ...
    }
    find(\&wanted, @paths);
      If you don't set "bydepth", you can stop the recursion by setting $File::Find::prune. So ...
      my $stop_the_insanity = 0;
      sub wanted {
          $File::Find::prune = $stop_the_insanity;
          return if $stop_the_insanity;
          ...
          # some condition that sets $stop_the_insanity
          ...
      }

      bluto
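
For anyone who wants the two suggestions above as one complete piece, here is a minimal sketch. The helper name first_match and its arguments are made up for illustration. It uses only the documented $File::Find::prune and $File::Find::name variables, and it assumes bydepth is not set. Note that wanted is still called for the remaining entries of the directory currently being read; the flag just makes those calls return immediately.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return the full path of the first regular file under
# $dir whose name matches $re, and stop descending once it has been found.
sub first_match {
    my ($dir, $re) = @_;
    my $found;
    find(sub {
        if (defined $found) {
            $File::Find::prune = 1;   # don't descend into further directories
            return;                   # ignore the rest of this directory
        }
        $found = $File::Find::name if -f $_ && $_ =~ $re;
    }, $dir);
    return $found;                    # undef if nothing matched
}
```

Which match counts as "first" depends on the order in which File::Find hands out entries, so this only suits cases where any match will do or where at most one file can match.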

Re: File::Find exit
by l2kashe (Deacon) on Oct 29, 2003 at 17:17 UTC

    I am assuming you are talking about the files in said dir being sorted alphabetically. With that you could alter the push to something along the lines of

    Update: I've been thinking about my solution, and I think the latter one is more appropriate than the first. I'm sure a better search criterion for File::Find would be a more efficient way to go about doing this.

    push(@nod, $_) if m/^[a-cA-C]/;

    return unless m/^[a-cA-C]/; push(@nod, $_);

    use perl;

Re: File::Find exit
by bart (Canon) on Oct 29, 2003 at 17:25 UTC
    You exit a sub through last? I'm shocked. I'm even more shocked that nobody here even bothers to remark on it.

    You should never ever leave a sub through last or next. I'd prefer it if Perl considered this a fatal error, either at compile time (like an "else" without "if"), or at runtime.

      Valid point, but you failed to provide a correct way of leaving the sub. I'm no expert, but I leave subs using return. Also, I think that his code was pseudocode meant to express his desired outcome rather than actual code in need of work.

      ___________
      Eric Hodges
        The proper way is using return. So you are doing it the proper way.
      This is a runtime error in Perl 5.6.1 at least.
      perl -e "sub x{last} x"
      Can't "last" outside a loop block at -e line 1.
      Update: That was naive. See Bart's followup which corrects/expands this.
        Eh... not always.
        #!/perl/bin/perl -wl
        sub x { last }
        for(1 .. 3) {
            print;
            x();
        }
        print "done";
        Result:
        Exiting subroutine via last at test.pl line 2.
        1
        done
        
        Here, it's a warning, not a fatal error.
      Then you may like to make it a runtime error:
      #!/usr/bin/perl -l
      use warnings;
      use strict;
      use warnings FATAL => qw/exiting/; # make it a runtime error
      sub xxx { next }
      no warnings;
      for ( 1 ... 3 ) {
          print;
          xxx;
      }
      print "done";
      So little time so much perldoc.

      (Thanks bart.)

Re: File::Find exit
by Roger (Parson) on Oct 30, 2003 at 06:57 UTC
    File::Find will short-circuit, there is no need to change your code significantly. What you can do is to wrap the find calls in an evaluation block like below -
    my @nod;
    eval {
        sub foo {
            die if (/^p/i);
            push @nod, $_
        }
        find (\&foo, "./");
    };
    print "$_\n" for @nod;
    Note that I have replaced last in your example with die, which will cause the evaluation block to be short-circuited.

    My directory contains the following files:
    try.pl try3.pl try0.pl try4.pl try0.txt try9.pl try6.pl try7.pl try8.pl try10.pl try11.pl text.txt try12.pl try13.pl try14.pl try15.pl p01.pl p02.pl p03.pl p04.pl p05.pl p09.pl p10.pl p11.pl p12.pl try2.pl try1.pl webrobot.pl links.txt algorithm.pl mainfile.txt
    I want the code to short circuit as soon as it sees a file beginning with letter 'p'. And the output of the code is just as expected -
    try.pl try3.pl try0.pl try4.pl try0.txt try9.pl try6.pl try7.pl try8.pl try10.pl try11.pl text.txt try12.pl try13.pl try14.pl try15.pl
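
A slightly more defensive variant of the same die/eval idea is sketched below. It rests on a few assumptions of its own: the sentinel string and the helper name files_until are invented for illustration, no_chdir is set so that an early die does not leave the process sitting in a subdirectory, and preprocess sorts each directory so the stopping point does not depend on raw directory order. Any exception other than the sentinel is re-thrown so that real errors are not swallowed.

```perl
use strict;
use warnings;
use File::Find;
use File::Basename qw(basename);

# Hypothetical sentinel; the trailing newline keeps Perl from appending
# "at ... line N" to the message, so it can be compared exactly.
my $STOP = "__stop_find__\n";

# Hypothetical helper: collect regular files under $dir until one whose
# basename matches $stop_re is reached, then abandon the traversal.
sub files_until {
    my ($dir, $stop_re) = @_;
    my @seen;
    eval {
        find({
            no_chdir   => 1,                  # $_ is the full path; cwd stays put
            preprocess => sub { sort @_ },    # impose a known order per directory
            wanted     => sub {
                return unless -f $_;
                die $STOP if basename($_) =~ $stop_re;
                push @seen, $_;
            },
        }, $dir);
        1;
    } or do {
        die $@ unless $@ eq $STOP;            # re-throw anything unexpected
    };
    return @seen;
}
```

With a flat directory holding a.txt, b.txt, d.txt and e.txt, calling files_until($dir, qr/^d/i) should collect only the first two. How subdirectories interleave with files is still up to File::Find.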
      This may well do it ... I'll get busy and try it out.
Re: File::Find exit
by u914 (Pilgrim) on Oct 29, 2003 at 22:53 UTC
    Hi there! I asked a pretty similar question here a while ago, and got some great answers.

    In short, I ended up using the preprocess option, as suggested elsewhere in this thread.

Re: File::Find exit
by texmec (Novice) on Oct 30, 2003 at 15:56 UTC
    hi, perhaps you could use File::Find::Iterator :
    use File::Find::Iterator;
    my $find = File::Find::Iterator->create(
        dir   => ["."],
        order => sub { $_[0] cmp $_[1] }
    );
    while (my $f = $find->next) {
        last if $f =~ /^d/i;
        # .... do some thing
    };
    order is not documented but it works !!!
Re: File::Find exit
by duff (Parson) on Oct 29, 2003 at 21:05 UTC

    File::Find looks perfect to me if you need to recurse through subdirectories. But you might consider doing the find() only once if the contents of the directory aren't apt to change.

    Perhaps you should show the actual code in context (including the "outer loop not shown")

Node Type: perlquestion [id://303024]
Front-paged by diotalevi