PerlMonks  

I am looking for either suggestions for improvement, or ways to use existing modules like File::Find::*.

I wrote a utility to search through file archives organized according to a direction/date/topic structure. I usually know which direction and topic to search, but the transaction may have been archived on a range of days. I wrote my own very limited File::Find (code below) in order to implement this search.

For instance, say I am looking for a transaction we sent containing the string "12345678". I know it's for CustomerD, and I'm pretty sure we sent it this week, so it could be in any of:

outbound/20151027/CustomerD
outbound/20151026/CustomerD
...
outbound/20151021/CustomerD
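A list of candidate directories like the one above could be generated with the core Time::Piece module instead of Date::Calc. This is only a sketch: candidate_dirs and its arguments are hypothetical names that do not appear in the script below, and it assumes the date directories are always named YYYYMMDD.

```perl
use strict;
use warnings;
use Time::Piece;

# Hypothetical helper (not part of the original script): list the archive
# directories for $end_date (YYYYMMDD) and the $days_back days before it,
# newest first, following the direction/date/topic layout.
sub candidate_dirs {
    my ($direction, $end_date, $days_back, $topic) = @_;

    # strptime returns a UTC-based object, so subtracting whole days
    # is not affected by daylight-saving shifts.
    my $end = Time::Piece->strptime($end_date, '%Y%m%d');

    return map {
        my $day = $end - $_ * 86_400;    # go back $_ days
        join '/', $direction, $day->strftime('%Y%m%d'), $topic;
    } 0 .. $days_back;
}

print "$_\n" for candidate_dirs('outbound', '20151027', 6, 'CustomerD');
```

With a days-back value of 6 this yields seven directories, 20151027 down to 20151021, because the end date itself is included.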

outbound/2015nnnn/ will have many subdirectories, and some of them will have hundreds of files. As a result, if I can't supply the topic, I run the search in the background and work on something else. But if I can, the response is quick enough.
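The difference between the two cases comes down to the glob pattern used under each date directory. A minimal sketch, assuming a hypothetical helper named topic_dirs and using bsd_glob from the core File::Glob module:

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# Hypothetical helper: list the topic directories under one date directory.
# With a topic, only names containing it are kept; without one, every
# subdirectory is returned, which is the slow case.
sub topic_dirs {
    my ($date_dir, $topic) = @_;
    my $pattern = (defined $topic && length $topic) ? "*$topic*" : '*';

    # bsd_glob does not split its pattern on whitespace the way the
    # built-in glob does, so paths containing spaces still work.
    return grep { -d } bsd_glob("$date_dir/$pattern");
}
```

Glob metacharacters in the topic (such as * or ?) are still interpreted as wildcards here, so this sketch assumes plain topic names.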

So why explore modules if I have a working solution? Learning what's in CPAN, and how to better use it, is to my benefit.
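As one possible starting point for the module question, here is a sketch of the same kind of search using the core File::Find module. The helper name search_archive and its parameters are hypothetical. It assumes the date directories sit directly under the root passed in, and that the files of interest sit directly inside the topic directories.

```perl
use strict;
use warnings;
use File::Find;
use File::Spec;

# Hypothetical helper: walk $root (e.g. an "outbound" tree), descend only
# into date directories listed in @$dates and topic directories whose name
# contains $topic, and return the files that have a line matching $regex.
sub search_archive {
    my ($root, $dates, $topic, $regex) = @_;
    my %wanted_date = map { $_ => 1 } @$dates;
    my @hits;

    find({
        no_chdir => 1,    # keep full paths in $File::Find::name
        wanted   => sub {
            my $path = $File::Find::name;
            my $rel  = File::Spec->abs2rel($path, $root);
            return if $rel eq '.';    # the root itself
            my @parts = File::Spec->splitdir($rel);

            if (-d $path) {
                # Prune unwanted dates and topics, and do not
                # recurse below the topic level.
                $File::Find::prune = 1
                    if (@parts == 1 && !$wanted_date{ $parts[0] })
                    || (@parts == 2 && index($parts[1], $topic) < 0)
                    || @parts > 2;
                return;
            }
            return unless @parts == 3 && -f $path;

            open my $fh, '<', $path or do { warn "$path: $!\n"; return };
            while (my $line = <$fh>) {
                if ($line =~ $regex) { push @hits, $path; last }
            }
            close $fh;
        },
    }, $root);

    my @sorted = sort @hits;
    return @sorted;
}

# Example call (paths are illustrative):
# my @found = search_archive('/edi_store/archive/outbound',
#                            ['20151027', '20151026'], 'CustomerD', qr/12345678/);
```

Setting $File::Find::prune keeps File::Find from descending into date and topic directories that cannot match, which is what keeps the walk cheap when a topic is supplied. Passing an empty string as the topic matches every topic directory.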

Source code:

#!/home/edi/perl/perl

use strict;
use warnings;
use Getopt::Std;
use Date::Calc qw/Today Add_Delta_Days/;

getopts('ior:s:d:b:');
our ($opt_i, $opt_o, $opt_r, $opt_s, $opt_d, $opt_b);
my ($mode, $search_regex, $days, $business_process);

die "Usage: search_si_archive.pl [-[io]] (-s searchstring | -r regex) [-d daysback] [-b bpname]\n"
  unless ($opt_s || $opt_r);

$mode = 'inbound';
$mode = 'outbound' if ($opt_o);
$search_regex = qr/$opt_r/ if ($opt_r);
$search_regex = qr/\Q$opt_s\E/ if ($opt_s);
$days = defined($opt_d) ? $opt_d : 7;
if ($opt_b) { $business_process = '*' . $opt_b . '*' }
else { $business_process = '*' }

my ($year, $month, $day) = Today();

# for each day from $days days ago up to today
while ($days >= 0)
{
  my ($y, $m, $d) = Add_Delta_Days($year, $month, $day, -$days--);
  my $datestring = sprintf("%d%02d%02d", $y, $m, $d);
  my $directory = sprintf("/edi_store/archive/%s/%s/%s",$mode,$datestring,$business_process);
  my @dirlist = grep { -d } glob($directory);

  foreach my $dir (@dirlist)
  {
    # skip directories that cannot be read instead of failing silently
    opendir(my $dh, $dir) or do { warn "Cannot open $dir: $!\n"; next };
    search_file($dir, $_) for (grep { -f $dir . '/' . $_ } readdir $dh);
    closedir $dh;
  }
}

# print the file name if any line matches, then stop reading that file
sub search_file
{
  my $fname = sprintf("%s/%s",@_);
  open my $fh, '<', $fname or do { warn "Cannot open $fname: $!\n"; return };
  while (<$fh>)
  {
    if (m/$search_regex/)
    {
      print "$fname\n";
      last;
    }
  }
  close($fh);
}

__END__

=pod

=head1 Search SI Archive

Search through SI archive directories for a string or regex, restricted
by age and/or BP.

=head1 USAGE

search_si_archive.pl -[io] -[sr STRING] [-d DAYS|7] [-b BPNAME]

=over

=item -i

INBOUND - search will start in /edi_store/archive/inbound/ directory tree.
If neither -i nor -o is indicated, this will be the default.

=item -o

OUTBOUND - search will start in /edi_store/archive/outbound/ directory tree.

=item -s STRING

SEARCH - files will be searched for this literal string.
Either this or -r must be specified.

=item -r STRING

REGEX - files will be searched for this regular expression.
Either this or -s must be specified.

=item -d DAYS

DAYS BACK - today's tree is always searched. If this value is specified,
that many earlier days are searched as well, one day per iteration.
If today is Monday, 3 would search today, Sunday, Saturday, and Friday.
If no value is specified, it will search 7 days back.

=item -b NAME

BUSINESS PROCESS - only directories whose name contains this string will
be searched. If no value is specified, all directories will be searched.

=back

=head1 Examples

=over

=item 1. search_si_archive.pl -i -s DEPOT -d 0 -b AS2

Files in subdirectories of /edi_store/archive/inbound/YYYYMMDD whose name
contains the string AS2 will be searched for the string DEPOT.

=back

=head1 Author

Howard Parks
Dum Spiro Spero

In reply to Searching over multiple directories using unusual logic by GotToBTru
