Re: Finding un-paired files in a directory

by Corion (Patriarch)
on Dec 02, 2003 at 11:42 UTC (#311586)

in reply to Finding un-paired files in a directory

Personally, I would use a different approach by restating the problem: you are interested in all *.mrg files that have no corresponding *.did file:

opendir DIR, $dir or die "Couldn't open directory '$dir' : $!";
my @files = grep { /(.*)\.mrg$/ and not -f "$dir/$1.did" } readdir DIR;
closedir DIR;

My method might be a bit slower: for each .mrg file it makes an additional call to stat (the -f test), which can be very slow on full directories. I think the shorter code makes up for the lost speed. If speed really becomes an issue, I'd readdir the directory's contents into a hash and then check for existence in the hash, much like your example:

opendir DIR, $dir or die "Couldn't read '$dir' : $!";
my @all_files = map { lc $_ } readdir DIR;
closedir DIR;
my %did = map { /(.*)\.did$/ and ($1 => 1) } grep { /\.did$/ } @all_files;
my @files = grep { /(.*)\.mrg$/ and not $did{$1} } @all_files;
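For anyone who wants to try both variants above side by side, here is a minimal sketch. The scratch directory (created with File::Temp), the file names and the sub names unpaired_stat and unpaired_hash are all made up for illustration; lexical directory handles replace the bareword DIR, which is only a stylistic substitution.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical test data: a and c are paired, b has no .did partner.
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw(a.mrg a.did b.mrg c.mrg c.did)) {
    open my $fh, '>', "$dir/$name" or die "Couldn't create '$dir/$name' : $!";
    close $fh;
}

# Variant 1: one -f test (a stat call) per .mrg file.
sub unpaired_stat {
    my ($d) = @_;
    opendir my $dh, $d or die "Couldn't open directory '$d' : $!";
    my @files = grep { /(.*)\.mrg$/ and not -f "$d/$1.did" } readdir $dh;
    closedir $dh;
    return @files;
}

# Variant 2: read the directory once, then look names up in a hash.
# As in the post, names are lowercased, so the result is lowercased too.
sub unpaired_hash {
    my ($d) = @_;
    opendir my $dh, $d or die "Couldn't read '$d' : $!";
    my @all_files = map { lc $_ } readdir $dh;
    closedir $dh;
    my %did = map { /(.*)\.did$/ ? ( $1 => 1 ) : () } @all_files;
    my @files = grep { /(.*)\.mrg$/ and not $did{$1} } @all_files;
    return @files;
}

my @by_stat = sort { $a cmp $b } unpaired_stat($dir);
my @by_hash = sort { $a cmp $b } unpaired_hash($dir);

print "stat: @by_stat\n";    # stat: b.mrg
print "hash: @by_hash\n";    # hash: b.mrg
```

The two variants only agree as long as the file names are already lowercase, because the hash variant compares and returns lowercased names.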

Also, I don't think that production code should contain references to PerlMonks node IDs, but rather an explanation of what happens:

# return a list or a reference to an array, depending
# on what the caller wants:
return wantarray ? @files : \@files;
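To see what that return line does for the caller, a small sketch follows. The sub name unpaired_files and its hard-coded file names are hypothetical stand-ins for the real routine.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the routine under discussion; the file
# names are invented for illustration.
sub unpaired_files {
    my @files = ( 'b.mrg', 'e.mrg' );

    # return a list or a reference to an array, depending
    # on what the caller wants:
    return wantarray ? @files : \@files;
}

my @list = unpaired_files();    # list context: receives the list itself
my $ref  = unpaired_files();    # scalar context: receives an array reference

print scalar(@list), " files via list\n";         # 2 files via list
print scalar(@$ref), " files via reference\n";    # 2 files via reference
```

wantarray is true in list context, false (but defined) in scalar context and undef in void context, so a call in void context also takes the reference branch here.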

Update: Added "faster" alternative

Update 2: Fixed code in response to merlyn's bug-finding
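The bug merlyn points out in his reply comes from readdir returning bare file names without the directory part. A minimal sketch of the difference, assuming a scratch directory from File::Temp and a made-up base name that includes the process ID so that no file of that name is likely to exist in the current directory:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical test data: one .mrg file with its .did partner, so the
# correct answer is "nothing is unpaired".
my $dir  = tempdir( CLEANUP => 1 );
my $base = "pairdemo_$$";    # assumed not to exist in the current directory
for my $name ( "$base.mrg", "$base.did" ) {
    open my $fh, '>', "$dir/$name" or die "Couldn't create '$dir/$name' : $!";
    close $fh;
}

opendir my $dh, $dir or die "Couldn't open directory '$dir' : $!";
my @names = readdir $dh;    # bare names only, no directory part
closedir $dh;

# Buggy: -f looks for the .did file in the current working directory.
my @buggy = grep { /(.*)\.mrg$/ and not -f "$1.did" } @names;

# Fixed: prefix the directory that was actually read.
my @fixed = grep { /(.*)\.mrg$/ and not -f "$dir/$1.did" } @names;

print "buggy reports ", scalar(@buggy), " unpaired file(s)\n";
print "fixed reports ", scalar(@fixed), " unpaired file(s)\n";
```

Under the stated assumption, the buggy version wrongly reports the paired file as unpaired, while the fixed version reports nothing.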

perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The
$d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider
($c = $d->accept())->get_request(); $c->send_response( new #in the
HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web

Replies are listed 'Best First'.
•Re: Re: Finding un-paired files in a directory
by merlyn (Sage) on Dec 02, 2003 at 12:13 UTC
    opendir DIR, $dir or die "Couldn't open directory '$dir' : $!";
    my @files = grep { /(.*)\.mrg$/ and not -f "$1.did" } readdir DIR;
    closedir DIR
    No, that's testing "-f" on a file in the current directory for a name that should be checked in a different directory. A traditional readdir mistake. {grin}

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

      Which is why globbing can be so lovely, as it includes the file path for you :) I'm not sure whether this is any faster than Abigail-II's solution, but at least it checks for the existence of each file only once rather than twice.

      my $dir = '/path/to/dir';
      print join ", ", grep { /(.*)\.mrg\z/ and not -f "$1.did" } <$dir/*\.mrg>;
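A self-contained way to try the glob variant, as a sketch only: the scratch directory and file names are invented, and the built-in glob function is written out in place of the angle-bracket form. It assumes the directory path contains no whitespace or glob metacharacters, since the built-in glob splits its argument on whitespace.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical test data in a scratch directory: a is paired, b is not.
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw(a.mrg a.did b.mrg)) {
    open my $fh, '>', "$dir/$name" or die "Couldn't create '$dir/$name' : $!";
    close $fh;
}

# glob returns each match with the directory prefix attached, so $1
# holds the full path minus the extension and -f needs no "$dir/".
my @unpaired = grep { /(.*)\.mrg\z/ and not -f "$1.did" } glob("$dir/*.mrg");

print "$_\n" for @unpaired;
```

The result holds full paths rather than bare names, which is the main practical difference from the readdir-based versions.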
