PerlMonks  


I am looking for either suggestions for improvement, or ways to use existing modules like File::Find::*.

I wrote a utility to search through file archives organized according to a direction/date/topic structure. I usually know which direction and topic to search, but the transaction may have been archived on a range of days. I wrote my own very limited File::Find (code below) in order to implement this search.

For instance, suppose I am looking for a transaction we sent containing the string "12345678". I know it's for CustomerD, and I'm pretty sure we sent it this week, so it could be in any of:

outbound/20151027/CustomerD
outbound/20151026/CustomerD
...
outbound/20151021/CustomerD
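To make the candidate list above concrete, here is a minimal sketch that builds it with core modules only (POSIX and Time::Local). The helper name candidate_dirs and its argument order are hypothetical; only the direction/date/topic layout comes from the description above.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use Time::Local qw(timelocal);

# Hypothetical helper: list direction/YYYYMMDD/topic paths for a start date
# and the $days_back days before it, newest first.
sub candidate_dirs {
    my ($direction, $topic, $days_back, $y, $m, $d) = @_;

    # Anchor at local noon so a one-hour DST shift cannot land on the wrong day.
    my $noon = timelocal(0, 0, 12, $d, $m - 1, $y);

    return map {
        my $stamp = strftime('%Y%m%d', localtime($noon - $_ * 86_400));
        sprintf('%s/%s/%s', $direction, $stamp, $topic);
    } 0 .. $days_back;
}

# Start date 2015-10-27, going back 6 days.
print "$_\n" for candidate_dirs('outbound', 'CustomerD', 6, 2015, 10, 27);
```

This only generates names; whether each directory exists still has to be checked (for example with -d) before searching it.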

outbound/2015nnnn/ will have many subdirectories, and some of them will have hundreds of files. As a result, if I can't supply the topic, I run the search in the background and work on something else. But if I can, the response is quick enough.

So why explore modules if I have a working solution? Learning what's in CPAN, and how to better use it, is to my benefit.

Source code:

#!/home/edi/perl/perl
use strict;
use warnings;
use Getopt::Std;
use Date::Calc qw/Today Add_Delta_Days/;

getopts('ior:s:d:b:');
our ($opt_i, $opt_o, $opt_r, $opt_s, $opt_d, $opt_b);
my ($mode, $search_regex, $days, $business_process);

die "Usage: search_si_archive.pl [-[io]] (-s searchstring | -r regex) [-d daysback] [-b bpname]\n"
    unless ($opt_s || $opt_r);

# inbound is the default direction; -o switches to outbound
$mode = 'inbound';
$mode = 'outbound' if ($opt_o);

# -s is matched literally; if both -r and -s are given, -s wins
$search_regex = qr/$opt_r/ if ($opt_r);
$search_regex = qr/\Q$opt_s\E/ if ($opt_s);

$days = defined($opt_d) ? $opt_d : 7;

if ($opt_b) { $business_process = '*' . $opt_b . '*' }
else        { $business_process = '*' }

my ($year, $month, $day) = Today();

# for each day from today back $days days
while ($days >= 0) {
    my ($y, $m, $d) = Add_Delta_Days($year, $month, $day, -$days--);
    my $datestring = sprintf("%d%02d%02d", $y, $m, $d);
    my $directory = sprintf("/edi_store/archive/%s/%s/%s", $mode, $datestring, $business_process);
    my @dirlist = grep { -d } glob($directory);
    foreach my $dir (@dirlist) {
        # skip directories that cannot be read instead of failing silently
        opendir DIR, $dir or do { warn "Cannot open directory $dir: $!\n"; next };
        search_file($dir, $_) for (grep { -f $dir . '/' . $_ } readdir DIR);
        closedir DIR;
    }
}

# print the file name once if any line matches, then stop reading that file
sub search_file {
    my $fname = sprintf("%s/%s", @_);
    open my $fh, '<', $fname or do { warn "Cannot open $fname: $!\n"; return };
    while (<$fh>) {
        if (m/$search_regex/) {
            print "$fname\n";
            last;
        }
    }
    close($fh);
}

__END__

=pod

=head1 Search SI Archive

Search through SI archive directories for a string or regex, restricted
by age and/or BP.

=head1 USAGE

search_si_archive.pl -[io] -[sr STRING] [-d DAYS|7] [-b BPNAME]

=over

=item -i

INBOUND - search will start in /edi_store/archive/inbound/ directory tree.
If neither -i nor -o is indicated, this will be the default.

=item -o

OUTBOUND - search will start in /edi_store/archive/outbound/ directory tree.

=item -s STRING

SEARCH - files will be searched for this literal string.
Either this or -r must be specified.

=item -r STRING

REGEX - files will be searched for this regular expression.
Either this or -s must be specified.

=item -d DAYS

DAYS BACK - search will start in today's tree. If this value is specified,
the search will be repeated this number of times, moving backward in time
one day with each iteration.
If today is Monday, 3 would search today, Sunday, Saturday, and Friday.
If no value is specified, it will search 7 days back.

=item -b NAME

BUSINESS PROCESS - only directories whose name contains this string will
be searched.
If no value is specified, all directories will be searched.

=back

=head1 Examples

=over

=item 1.

search_si_archive.pl -i -s DEPOT -d 0 -b AS2

Files in subdirectories of /edi_store/archive/inbound/YYYYMMDD whose name
contains the string AS2 will be searched for the string DEPOT.

=back

=head1 Author

Howard Parks
Dum Spiro Spero
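
For comparison, here is a minimal sketch of how the per-file search might look with the core File::Find module. It is one possible shape, not a drop-in replacement: the helper name matching_files is hypothetical, and unlike the script above, find() descends into subdirectories of each starting directory.

```perl
use strict;
use warnings;
use File::Find qw(find);

# Hypothetical helper: return every plain file under @dirs that has at
# least one line matching $regex. Recurses, unlike the readdir version.
sub matching_files {
    my ($regex, @dirs) = @_;
    return unless @dirs;    # nothing to search

    my @hits;
    find({
        no_chdir => 1,      # keep $_ set to the full path of each entry
        wanted   => sub {
            return unless -f $_;
            open my $fh, '<', $_
                or do { warn "Cannot open $_: $!\n"; return };
            while (my $line = <$fh>) {
                if ($line =~ $regex) { push @hits, $_; last }
            }
            close $fh;
        },
    }, @dirs);
    return @hits;
}

# Starting directories are still picked with glob, as in the script above.
my @dirs = grep { -d } glob('/edi_store/archive/outbound/201510*/*CustomerD*');
print "$_\n" for matching_files(qr/\Q12345678\E/, @dirs);
```

The date and topic restriction stays in the glob pattern, so File::Find only walks directories that are already known to be candidates.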

In reply to Searching over multiple directories using unusual logic by GotToBTru
