PerlMonks |
Re: Threads slurping a directory and processing before conclusion
by Clarendon4 (Acolyte) on Aug 22, 2011 at 10:26 UTC [id://921618]
> 3. previous attempts have hit major stability and
> time snags, even at the prototyping stage due to the
> sheer volume of files that make up a comprehensive sample

I notice (based on the "F:/" pathname) that you're on Win32. You have a File::Find::find-like recursive file processing part in your code. This is always going to be slower than necessary on Win32 when coded in Perl. Consider using/writing some C/XS that generates the file list and avoids all the unnecessary stat (-d !) calls by using FindNextFile(). Also consider using forks over threads. They're easier on Win32 than you might think. Take a look at qfind.c and peg in my CPAN directory for ideas: http://cpan.mirrors.uk2.net/authors/id/A/AD/ADAVIES/

Try comparing the time taken for qfind to generate a file list with a pure Perl solution, e.g.

    c:\> perl -e "${^WIN32_SLOPPY_STAT}=1; use Time::HiRes; $start = Time::HiRes::time; open Q, 'qfind.exe |'; while (<Q>) {}; close Q; print 'Took ', (Time::HiRes::time - $start)"

    c:\> perl -e "${^WIN32_SLOPPY_STAT}=1; use Time::HiRes; use File::Find; $start = Time::HiRes::time; File::Find::find(sub { }, '.'); print 'Took ', (Time::HiRes::time - $start)"

On my Perl source directory of ~10_000 files this is <0.3 sec vs 1.7 sec. I suspect on your 1.2 million files this gives a *considerable* speed-up.

Oh, and make sure you BEGIN { ${^WIN32_SLOPPY_STAT} = 1 } at the top of your code!

Good luck.
In Section: Seekers of Perl Wisdom