http://qs321.pair.com?node_id=1146145

GotToBTru has asked for the wisdom of the Perl Monks concerning the following question:

I am looking for either suggestions for improvement, or ways to use existing modules like File::Find::*.

I wrote a utility to search through file archives organized according to a direction/date/topic structure. I usually know which direction and topic to search, but the transaction may have been archived on a range of days. I wrote my own very limited File::Find (code below) in order to implement this search.

For instance, say I am looking for a transaction we sent containing the string "12345678". I know it's for CustomerD, and I'm pretty sure we sent it this week, so it could be in:

outbound/20151027/CustomerD
outbound/20151026/CustomerD
...
outbound/20151021/CustomerD
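To make the date-range idea concrete, here is a minimal sketch that builds such a list of candidate directories using only core modules. The helper name candidate_dirs and its argument order are made up for illustration; my actual script uses Date::Calc instead.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use Time::Local qw(timegm);

# Hypothetical helper: list the date directories for an end date and the
# $days_back days before it, newest first. Working in UTC at noon keeps
# the day arithmetic independent of daylight-saving changes.
sub candidate_dirs {
    my ($mode, $topic, $year, $month, $day, $days_back) = @_;
    my $noon = timegm(0, 0, 12, $day, $month - 1, $year);
    return map {
        my $stamp = strftime('%Y%m%d', gmtime($noon - $_ * 86_400));
        "$mode/$stamp/$topic";
    } 0 .. $days_back;
}

# Seven directories, from 20151027 back to 20151021.
print "$_\n" for candidate_dirs('outbound', 'CustomerD', 2015, 10, 27, 6);
```

Each returned path can then be checked with -d before searching, since not every day will have a directory for every topic.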

outbound/2015nnnn/ will have many subdirectories, and some of them will have hundreds of files. As a result, if I can't supply the topic, I run the search in the background and work on something else. But if I can, the response is quick enough.

So why explore modules if I have a working solution? Learning what's in CPAN, and how to better use it, is to my benefit.
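As a starting point for comparison, this is a rough sketch of how the per-directory search might look with the core File::Find module. The helper name files_matching is hypothetical, and unlike my script (which only reads one directory level) find() descends into any subdirectories it meets, so this is an assumption-laden illustration rather than a drop-in replacement.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return every plain file under @dirs that has at
# least one line matching $regex.
sub files_matching {
    my ($regex, @dirs) = @_;
    my @hits;
    @dirs = grep { -d } @dirs;    # skip start directories that do not exist
    return @hits unless @dirs;
    find({
        no_chdir => 1,            # stay put; use full paths throughout
        wanted   => sub {
            my $path = $File::Find::name;
            return unless -f $path;
            open my $fh, '<', $path
                or do { warn "Cannot open $path: $!\n"; return };
            while (my $line = <$fh>) {
                if ($line =~ $regex) { push @hits, $path; last }
            }
            close $fh;
        },
    }, @dirs);
    return @hits;
}

# Example call, assuming the archive layout described above.
my @dirs = glob('/edi_store/archive/outbound/20151027/*CustomerD*');
print "$_\n" for files_matching(qr/\Q12345678\E/, @dirs);
```

The glob still does the date and topic filtering; File::Find only replaces the opendir/readdir loop, which is why I am unsure how much it buys me here.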

Source code:

#!/home/edi/perl/perl
use strict;
use warnings;
use Getopt::Std;
use Date::Calc qw/Today Add_Delta_Days/;

getopts('ior:s:d:b:');
our ($opt_i, $opt_o, $opt_r, $opt_s, $opt_d, $opt_b);
my ($mode, $search_regex, $days, $business_process);

die "Usage: search_si_archive.pl [-[io]] (-s searchstring | -r regex) [-d daysback] [-b bpname]\n"
    unless ($opt_s || $opt_r);

$mode = 'inbound';
$mode = 'outbound' if ($opt_o);
$search_regex = qr/$opt_r/ if ($opt_r);
$search_regex = qr/\Q$opt_s\E/ if ($opt_s);
$days = defined($opt_d) ? $opt_d : 7;
if ($opt_b) { $business_process = '*' . $opt_b . '*' }
else        { $business_process = '*' }

my ($year, $month, $day) = Today();

# for each day from today back $days days
while ($days >= 0) {
    my ($y, $m, $d) = Add_Delta_Days($year, $month, $day, -$days--);
    my $datestring = sprintf("%d%02d%02d", $y, $m, $d);
    my $directory = sprintf("/edi_store/archive/%s/%s/%s", $mode, $datestring, $business_process);
    my @dirlist = grep { -d } glob($directory);
    foreach my $dir (@dirlist) {
        # skip directories that cannot be read instead of failing silently
        opendir my $dh, $dir or do { warn "Cannot open $dir: $!\n"; next };
        search_file($dir, $_) for (grep { -f $dir . '/' . $_ } readdir $dh);
        closedir $dh;
    }
}

# print the file's name if any line matches, stopping at the first match
sub search_file {
    my $fname = sprintf("%s/%s", @_);
    open my $fh, '<', $fname or do { warn "Cannot open $fname: $!\n"; return };
    while (<$fh>) {
        if (m/$search_regex/) {
            print "$fname\n";
            last;
        }
    }
    close($fh);
}

__END__

=pod

=head1 Search SI Archive

Search through SI archive directories for a string or regex, restricted
by age and/or BP.

=head1 USAGE

    search_si_archive.pl -[io] -[sr STRING] [-d DAYS|7] [-b BPNAME]

=over

=item -i

INBOUND - search will start in /edi_store/archive/inbound/ directory tree.
If neither -i nor -o is indicated, this will be the default.

=item -o

OUTBOUND - search will start in /edi_store/archive/outbound/ directory tree.

=item -s STRING

SEARCH - files will be searched for this literal string.
Either this or -r must be specified.

=item -r STRING

REGEX - files will be searched for this regular expression.
Either this or -s must be specified.

=item -d DAYS

DAYS BACK - search will start in today's tree. If this value is specified,
the search will be repeated this number of times, moving backward in time
one day with each iteration.
If today is Monday, 3 would search today, Sunday, Saturday, and Friday.
If no value is specified, it will search 7 days back.

=item -b NAME

BUSINESS PROCESS - only directories whose name contains this string will
be searched. If no value is specified, all directories will be searched.

=back

=head1 Examples

=over

=item 1.

    search_si_archive.pl -i -s DEPOT -d 0 -b AS2

Files in subdirectories of /edi_store/archive/inbound/YYYYMMDD whose name
contains the string AS2 will be searched for the string DEPOT.

=back

=head1 Author

Howard Parks
Dum Spiro Spero