PerlMonks
Re: scalable duplicate file remover
by jwkrahn (Abbot) on Mar 03, 2008 at 08:20 UTC ( [id://671599] )
    sub process_file {
        my $dir_configs=$_[0];
        ##optimisation using -d -l -f -s just once for return and also for adding
        #if current "file"(unix terminology) is a directory and the yaml configuration
        #tells us to eliminate directories from the search we do so by returning from the
        #callback
        return if -d $File::Find::name && ! $dir_configs->{dir};

You call stat on the file.

        return if -l $File::Find::name && ! $dir_configs->{link};

You call lstat on the same file.

        return if -f $File::Find::name && ! $dir_configs->{file};

You call stat on the same file again.

        return if -s $File::Find::name < $config->{minsize};

You call stat on the same file again.

        unless($File::Find::name =~ /$dir_configs->{regex}/) {
            if(-d $File::Find::name) {

You call stat on the same file again.

                $File::Find::prune=1;
            }
            return;
        }
        my ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
            $atime,$mtime,$ctime,$blksize,$blocks) = stat($File::Find::name);

You call stat on the same file again. You declare 13 variables but you are only using one.

        my $last_modif_time=DateTime->from_epoch(epoch=>$mtime);
        # printf "%s %s %s %s\n",
        # $File::Find::name,
        # file2sha1($File::Find::name),
        # -s $File::Find::name,

Commented out but if not you call stat on the same file again.

        # $last_modif_time;
        add_to_db(file2sha1($File::Find::name),$last_modif_time,-s $File::Find::name,$File::Find::name);

You call stat on the same file again. You call add_to_db() which calls stat or lstat three more times.

        #print Dumper $dir_configs;
    };

In total you call stat or lstat ten times on the same file (eleven times if you uncomment the printf statement). You also use $File::Find::name in most places where $_ would have the same effect.
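As a rough sketch of what avoiding the repeated calls might look like: one lstat fills Perl's stat cache, and later file tests on the special filehandle `_` reuse that result. The helper name wanted_entry and the minsize key in the passed hash are hypothetical (the quoted code reads minsize from a separate $config). Note also that, because this uses lstat rather than stat, a symlink is classified as a link and not as whatever it points to, which may differ from what the quoted code does.

```perl
use strict;
use warnings;

# Hypothetical helper: decide whether to keep a path using a single lstat.
# The keys dir, link, file mirror the ones in the quoted code; passing
# minsize in the same hash is an assumption made for this sketch.
sub wanted_entry {
    my ($path, $cfg) = @_;

    # One lstat call; returns an empty list (false) if the path is missing.
    my @st = lstat($path) or return;

    # Tests on "_" reuse the cached lstat result, no further system calls.
    return if -d _ && !$cfg->{dir};
    return if -l _ && !$cfg->{link};
    return if -f _ && !$cfg->{file};

    # Take only the fields needed: size is index 7, mtime is index 9.
    my ($size, $mtime) = @st[7, 9];
    return if $size < ($cfg->{minsize} || 0);

    return { size => $size, mtime => $mtime };
}
```

Inside a File::Find callback with the default settings (which chdir into each directory), this could be called as wanted_entry($_, $dir_configs), and the returned size and mtime passed on instead of calling -s or stat again.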