PerlMonks  

Missing files & File::Find.pm

by sbas013 (Initiate)
on Feb 07, 2007 at 12:49 UTC ( [id://598748]=perlquestion )

sbas013 has asked for the wisdom of the Perl Monks concerning the following question:

I have a problem~ette with File::Find.

OS: MS Windows Server 2003 Enterprise x64
Perl: v5.5.8 MSWin32-x64-multi-thread
File System: NTFS (952 GB)

Q. I was using File::Find quite happily until my script (which works perfectly on other servers) started stopping prematurely, for no apparent reason, part way through the drive being examined. I picked this up because Windows Properties for e:\ told me there was 208 GB of used file space and I was only picking up some 30 GB.

NB: For files > 2 GB I get the size information using the DOS "DIR" command; apologies for that.

Some of the file names encountered on the drive are frankly bizarre!

I tried:-

(1) Running the script against the largest folder in the drive and again I was only seeing 178,610 files instead of the 252,115 files that IE told me were present.

(2) A small script that just counts the number of files in the projects folder found 204,257.

Both scripts are consistent in the number of files they find.

I get several warnings that "Can't cd to e:..." The filename displayed is always 293 characters in length.

Hopefully this is one for Sybil Fawlty, whose specialist subject is "the bleeding obvious", but I just can't see it. My Perl Cookbook makes no mention of large filenames/handles.

I have spent some time looking for a solution with no luck.

Any advice would be gratefully received.

Thanks,

Simon

Replies are listed 'Best First'.
Re: Missing files & File::Find.pm
by Thelonius (Priest) on Feb 07, 2007 at 17:16 UTC
    Looking at the source code, it appears that the Windows Platform SDK defines PATH_MAX as 260, and that any path more than 260 bytes long is going to cause problems.

    The chdir() in the Microsoft C run-time always does a GetCurrentDirectory() after a SetCurrentDirectory() and it will fail if the absolute path name is more than 260 bytes, even if no single directory name is that large.
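To make the point about absolute path length concrete, here is a minimal sketch. The helper name `path_too_long`, the constant `ASSUMED_PATH_LIMIT` and the sample directory names are invented for illustration; the 260 figure is simply the one quoted above, and `length` counts characters, which only equals bytes for single-byte names.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Assumption: the 260-byte limit quoted in this reply.
use constant ASSUMED_PATH_LIMIT => 260;

# True if the whole path is longer than the limit.
# length() counts characters; for single-byte names that is also bytes.
sub path_too_long {
    my ($path, $limit) = @_;
    $limit = ASSUMED_PATH_LIMIT unless defined $limit;
    return length($path) > $limit;
}

# Twenty modest, made-up directory names nested inside each other.
my @parts = map { sprintf 'project_dir_%02d', $_ } 1 .. 20;
my $deep  = join '\\', 'e:', @parts;

printf "longest component: %d chars, full path: %d chars, too long: %s\n",
    max(map { length } @parts), length($deep),
    path_too_long($deep) ? 'yes' : 'no';
```

No single component here is longer than 14 characters, yet the joined path comes to 302 characters, which is the situation described above.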

    There might also be a bug in Perl or the C run-time so that a path name that large overwrites a buffer causing the file names to appear very weird (this is just speculation).

    You might be able to get File::Find to work using the "no_chdir" option-- see the File::Find documentation.
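A minimal sketch of what that could look like, assuming the job is just to count files and add up their sizes. The sub name `scan_tree` is made up for this example; `no_chdir` and `wanted` are the documented File::Find options.

```perl
use strict;
use warnings;
use File::Find;

# Walk a tree without chdir()ing into each directory and return
# (file count, total bytes). With no_chdir set, $_ holds the full
# path, the same value as $File::Find::name.
sub scan_tree {
    my ($top) = @_;
    my ($count, $bytes) = (0, 0);
    find({
        no_chdir => 1,
        wanted   => sub {
            return unless -f $_;       # plain files only
            $count++;
            $bytes += (-s _) || 0;     # reuse the stat from -f
        },
    }, $top);
    return ($count, $bytes);
}

# Only run when a starting directory is given on the command line.
if (@ARGV) {
    my ($count, $bytes) = scan_tree($ARGV[0]);
    print "$count files, $bytes bytes\n";
}
```

Because the callback never relies on the current directory, every file test has to use the full path, which is why it tests `$_` rather than a bare file name.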

      Oh no - RTFM! I was a bit sceptical that I would get ANY replies, thinking File::Find was just too boring for anyone to show any interest. That's put me in my place.
      I will try the no_chdir option & see how I get on. If this doesn't work I will be forced to have a look at wsh / vb, what a shame!
      Thanks for the pointer, very much appreciated.
      Cheers,
      Simon

      Note that you can circumvent the PATH_MAX limit by using UNC filenames, as discussed in the CreateFile documentation by Microsoft. Some short testing shows that not changing the directory seems to work:

      perl -MFile::Find -le "for $d (@ARGV) {print qq(Scanning $d);find({no_chdir=>1, wanted=>sub{print $File::Find::name}},$d)}" \\?\Q:\ Q:\

      Without the no_chdir option, scanning the first entry (\\?\Q:\) fails immediately.
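For use inside a script rather than on the command line, the prefix can be added with a small helper. This is only a sketch: `extended_path` is a hypothetical name, it handles drive-letter paths only, and whether a given Perl build passes such paths through to Windows unchanged is an assumption that the short test above supports but does not prove. Forward slashes are converted because the prefix turns off Windows' own path normalisation.

```perl
use strict;
use warnings;

# Hypothetical helper: turn an absolute drive-letter path into the
# "\\?\" form used in the one-liner above.
sub extended_path {
    my ($path) = @_;
    return $path if $path =~ /^\\\\\?\\/;       # already prefixed
    die "need an absolute drive path: $path\n"
        unless $path =~ m{^[A-Za-z]:[\\/]};
    (my $native = $path) =~ tr{/}{\\};          # backslashes only
    return '\\\\?\\' . $native;
}

# Print the prefixed form of each path given on the command line.
print extended_path($_), "\n" for @ARGV;
```

The result would then be passed to find() together with `no_chdir => 1`, as in the one-liner.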

Re: Missing files & File::Find.pm
by MonkE (Hermit) on Feb 07, 2007 at 14:00 UTC
    Maybe if we could see the script, we could offer some insight. Please note that if you do modify your post to show some code, be sure to enclose your code in tags (ie. <code>perl code goes here</code>). The code tags make it look nice.

    Since you're new to PerlMonks, you may want to take a look at these fine nodes that offer insight into posting effectively on PerlMonks:

  • How (Not) To Ask A Question
  • How do I post a question effectively?
      I'm sure it shows that it's my first attempt at posting a question. I will read the advice on asking and posting questions. I certainly don't intend to post too many questions; I have been using Perl for some years and this is only the second time I have been stumped.

      I have spent some hours looking on the internet but I couldn't find anything that even hinted at problems with File::Find. In one respect that's a good thing, as it points to dodgy coding. The sad thing is (A) it's difficult to get wrong, and (B) the script works perfectly on two other servers (non-64-bit).

      The script amounts to 500 lines, a great percentage of which are comments. The bare essentials:-

      use File::stat;
      use File::Find;
      use Getopt::Std;
      use Time::Local;
      use DateTime;   # Home grown
      use Extras;     # Ditto
      ...
      find (\&Process_Directory, $TreeTop);
      ...
      ...
      # Close Report Files etc.
      exit;

      sub Process_Directory {
          $File = $File::Find::name;
          $Last_Modified = $Inode->mtime;
          $Last_Accessed = $Inode->atime;
          $Size = $Inode->size;
          ${@{$Year_Prof[$File_Age_Yrs]}}[1] += 1;
          ${@{$Year_Prof[$File_Age_Yrs]}}[2] += $Size;
          printf F_REP ("%02d/....,...");
          if ( $EXCEL ) { print F_EXC "..."; }
          return;
      }
      From the above info I work out the age of the file in years and store the info in an array, noting the cumulative file sizes and file counts. The script produces a couple of reports, plain text and psv (Pipe'|' separated variable) for Excel. The idea is that we can target files of a certain age for archiving.
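The bookkeeping described here can be sketched on its own, separately from the directory walk. This is not the original script: the names `add_to_profile`, `profile_lines` and `SECONDS_PER_YEAR`, the `[count, bytes]` slot layout, the 365.25-day year and the choice to put future-dated files in year 0 are all assumptions made for the example.

```perl
use strict;
use warnings;

# Assumption: a year is 365.25 days.
use constant SECONDS_PER_YEAR => 365.25 * 24 * 60 * 60;

# Add one file to a per-year profile. Slot N holds [count, bytes]
# for files whose age is N whole years. Returns the slot used.
sub add_to_profile {
    my ($profile, $mtime, $size, $now) = @_;
    $now = time unless defined $now;
    my $age_years = int( ($now - $mtime) / SECONDS_PER_YEAR );
    $age_years = 0 if $age_years <= 0;    # future-dated files: year 0
    $profile->[$age_years] ||= [0, 0];
    $profile->[$age_years][0] += 1;
    $profile->[$age_years][1] += $size;
    return $age_years;
}

# One pipe-separated line per year: age|count|bytes.
sub profile_lines {
    my ($profile) = @_;
    return map {
        my $slot = $profile->[$_] || [0, 0];
        join '|', $_, @$slot;
    } 0 .. $#$profile;
}
```

Inside a File::Find callback this would be called as something like `add_to_profile(\@profile, $st->mtime, $st->size)`, where `$st` is the File::stat object for the current file.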

      I get the feeling that it may be a cumulative type of error (I need to clear something down), OR it's just the bizarre directories and file names. The full path name of some files can run into hundreds of characters, Java-esque...

      Apologies for being a pest; I was hoping it would be something obvious, such as using a different package.

      Thanks for the quick response.

      Regards,

      Simon
      PS Putting the sub beneath instead of before the call probably shows my age...
Re: Missing files & File::Find.pm
by zentara (Archbishop) on Feb 07, 2007 at 14:20 UTC
    Some of the file names encountered on the drive are frankly bizarre!

    Since you are talking Microsoft Windows, are you sure you don't have a virus on that server?


    I'm not really a human, but I play one on earth. Cogito ergo sum a bum
      It's a production server and I/we would be horrified if there were a virus; we update McAfee daily.

      When I use IE I can eventually drill down and find a valid file / document. Some of the files are over 10 years old. I encountered one file where the Perl $Inode->mtime was -1. There are also lots of files where IE itself tells you the file was last touched in the year 2048. Sounds odd now that you have mentioned viruses, scary!

      As far as I am aware, not being a Windows person/specialist, we have no viruses on that server.

      Thanks for the reply.

      Regards,

      Simon
Re: Missing files & File::Find.pm
by graff (Chancellor) on Feb 08, 2007 at 05:29 UTC
    Just a couple shots in the dark here (I'm pretty uninformed about MS-Windows systems)...

    Would you be able to install a "unix-tools-for-windows" package that would be suitable to the setup on that machine? (Might have to compile from sources...) That would include a Windows version of the unix "find" command, which might work better than perl's File::Find module. (Or it might not work at all, given the file quantities and path lengths you're talking about.) It's worth a try, I think, esp. since it's really easy to use "find" from within a perl script:

    my $basepath = "C:/where/to/start";
    open ( FIND, "-|", "find $basepath -print0 ..." )
        or die "can't start find: $!";
    $/ = chr(0);
    while (<FIND>) {
        chomp;
        # $_ is a path name that can be used with stat, etc
    }
    (update: I originally had the mode string wrong in the open() statement -- fixed it to "-|".)

    If you're not familiar with the unix "find" command, it supports a vast range of option flags (to be added where I put "..." above). For instance, one thing that I found to be handy is to use "find" to get all the directories under a given top-level path, and then loop over those, using glob or readdir to do something with all the file names in each directory.
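If installing an external "find" is not an option, the same two-step idea (collect the directories first, then use readdir on each one) can be sketched in plain Perl. This swaps the external command for opendir/readdir and an explicit work list, so it is a variation on the suggestion above rather than the suggestion itself; `list_dirs` and `files_in` are names made up for the example.

```perl
use strict;
use warnings;

# Return every directory at or below $top, using a work list and
# full path names, so the current directory never changes.
sub list_dirs {
    my ($top) = @_;
    my @todo = ($top);
    my @dirs;
    while (defined(my $dir = shift @todo)) {
        push @dirs, $dir;
        my $dh;
        unless (opendir $dh, $dir) {
            warn "can't open $dir: $!\n";
            next;
        }
        for my $entry (readdir $dh) {
            next if $entry eq '.' or $entry eq '..';
            my $path = "$dir/$entry";
            push @todo, $path if -d $path && !-l $path;   # skip symlinks
        }
        closedir $dh;
    }
    return @dirs;
}

# Names of the plain files directly inside one directory.
sub files_in {
    my ($dir) = @_;
    opendir my $dh, $dir or return;
    my @files = grep { -f "$dir/$_" } readdir $dh;
    closedir $dh;
    return @files;
}

# Only run when a starting directory is given on the command line.
if (@ARGV) {
    for my $dir (list_dirs($ARGV[0])) {
        my @files = files_in($dir);
        printf "%6d %s\n", scalar @files, $dir;
    }
}
```

A directory that cannot be opened produces a warning and is skipped, so a single over-long path does not stop the rest of the scan.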

      Nice idea; me being a Unix Sys Admin, it would be right up my street. However, the light side just would not appreciate me loading a Unix tool on their Windows servers. I might run cygwin past them and see what they say. Having said that, I got Perl & gvim past them, so there is hope.
      Cheers,
      Simon

Node Type: perlquestion [id://598748]
Approved by Corion