Decide whether you can afford to miss files that arrive while your program is not running. If you only monitor the directory for changes and your program goes down for some reason, any files added in the meantime will be ignored.
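
One way to avoid missing such files is to rescan the whole directory on every run instead of relying on change notifications. A minimal sketch, assuming the files of interest end in `.xml`; the sub name `list_xml_files` is hypothetical and not part of the original code:

```perl
use strict;
use warnings;

# Hypothetical helper: return the names of all .xml files currently in $dir.
# Because it lists the full directory each time, files that arrived while
# the program was not running are still seen on the next run.
sub list_xml_files {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
    # Keep plain files only, sorted so each run sees a stable order.
    my @xml = sort grep { /\.xml\z/i && -f "$dir/$_" } readdir($dh);
    closedir($dh);
    return @xml;
}
```

It would be called once per run with the directory to watch, for example `my @xml = list_xml_files($incoming_dir);`, where `$incoming_dir` is whatever path your setup uses.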

Also think about what happens if a file is, for some reason, processed twice. Would that be a bigger problem than some wasted resources? If so, consider having the parsing program keep track of the files it has processed and just using rsync to fetch them into your working directory. Alternatively, write a script that runs after the files have been rsynced into your working directory.

Here is a version of Discipulus' code (below) that does not copy files but just keeps track of them and calls the XML processing script. It is meant to be called from a cron job:

## pseudo code:
my @xml;
# Load the record of files handled on earlier runs.
my %cache_of_already_read_files = load_cache_from_somewhere();

if (!%cache_of_already_read_files) {
    # Load failed (or the cache is empty).
    # Make some assumption here to get a starting point, for example:
    @xml = get_xml_files_names_based_on_timestamp();
    # ... or just assume that this is the first run:
    #@xml = get_xml_files_names();
}
else {
    @xml = get_xml_files_names();
}

foreach my $filename (@xml) {
    # Skip anything a previous run has already handled.
    next if exists $cache_of_already_read_files{$filename};
    $cache_of_already_read_files{$filename} = 'found at ' . scalar(localtime(time));
    process_xml_file($filename);
}

# Forget files that are no longer present, then persist the cache.
clean_cache_from_older_filenames(\%cache_of_already_read_files, \@xml);
save_cache_somewhere(\%cache_of_already_read_files);
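
The cache helpers are left undefined in the pseudo code above. One possible way to fill them in is with the core Storable module. This is only a sketch under assumptions: the sub names `load_cache`, `clean_cache` and `save_cache` are hypothetical, the cache is passed around as a hash reference rather than a hash, and the cache file path is up to you.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);

# Return a hash ref of already-seen filenames, or undef if no usable
# cache exists (first run, or the cache file is unreadable or corrupt).
sub load_cache {
    my ($file) = @_;
    return unless -e $file;
    my $cache = eval { retrieve($file) };
    return ref($cache) eq 'HASH' ? $cache : undef;
}

# Drop cache entries for files that are no longer in the directory,
# so the cache does not grow forever.
sub clean_cache {
    my ($cache, $current_files) = @_;
    my %present = map { $_ => 1 } @$current_files;
    delete @{$cache}{ grep { !$present{$_} } keys %$cache };
    return;
}

# Write to a temporary file and rename it over the old cache, so a crash
# in the middle of writing does not leave a truncated cache behind.
sub save_cache {
    my ($cache, $file) = @_;
    store($cache, "$file.tmp") or die "Cannot store cache: $!";
    rename("$file.tmp", $file) or die "Cannot rename cache file: $!";
    return;
}
```

With these, a run would start with something like `my $cache = load_cache($cache_file) || {};` and end with `clean_cache($cache, \@xml); save_cache($cache, $cache_file);`. Saving only at the end means that a run which dies halfway will reprocess its files next time, which is the safer default if a skipped file is worse than a repeated one.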

In reply to Re: Monitoring directory contents by Theodore
in thread Monitoring directory contents by bendir
