Re: Monitoring directory contents

by Theodore (Friar)
on Apr 08, 2014 at 12:12 UTC


in reply to Monitoring directory contents

You should decide whether it matters if files get skipped when they arrive while your program is not running. If you only monitor the directory for changes and your program goes down for some reason, any files added in the meantime will be ignored.

You should also think about what happens if a file for some reason gets processed twice. Would that be a bigger problem than some wasted resources? If so, consider having the parsing program keep track of the files it has already processed, and just use rsync to fetch them into your working directory. Or write a script that runs after rsync has copied the files into your working directory.

Here is a version of Discipulus' code (below) that does not copy files but only keeps track of them and calls the XML processing script. It is expected to be called from a cron job:

## pseudo code:
my %cache_of_already_read_files;
my @xml;

%cache_of_already_read_files = load_cache_from_somewhere();
if (not %cache_of_already_read_files) {
    # Load failed (or the cache is empty).
    # Make some assumptions here to have a starting point, for example:
    @xml = get_xml_files_names_based_on_timestamp();
    # ... or just assume that this is the first run:
    #@xml = get_xml_files_names();
}
else {
    @xml = get_xml_files_names();
}

foreach my $filename (@xml) {
    next if exists $cache_of_already_read_files{$filename};
    $cache_of_already_read_files{$filename} = 'found at ' . scalar(localtime(time));
    process_xml_file($filename);
}

clean_cache_from_older_filenames(\%cache_of_already_read_files, \@xml);
save_cache_somewhere(\%cache_of_already_read_files);
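To make the pseudo code above a bit more concrete, here is a minimal runnable sketch of the same idea. It is only one possible way to fill in the helpers: it assumes the cache is kept on disk with the core Storable module, that "XML files" means names ending in .xml, and that the directory and cache file are passed on the command line. The names run_once, load_cache, get_xml_file_names and clean_cache are made up for this example, and process_xml_file is a placeholder for your real parser.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Storable qw(retrieve nstore);

# Load the cache from disk. An empty hash means "first run or load failed".
sub load_cache {
    my ($file) = @_;
    return {} unless -e $file;
    my $cache = eval { retrieve($file) };
    return ref($cache) eq 'HASH' ? $cache : {};
}

# Names (not paths) of the *.xml files currently in the directory.
sub get_xml_file_names {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @names = sort grep { /\.xml\z/i && -f "$dir/$_" } readdir $dh;
    closedir $dh;
    return @names;
}

# Forget files that are no longer in the directory, so the cache stays small.
sub clean_cache {
    my ($cache, $present) = @_;
    my %still_there = map { $_ => 1 } @$present;
    delete @{$cache}{ grep { !$still_there{$_} } keys %$cache };
    return;
}

# Placeholder (assumption): call your real XML processing script here.
sub process_xml_file {
    my ($path) = @_;
    print "processing $path\n";
    return 1;
}

# One cron-style pass: process unseen files, then save the updated cache.
# Returns the names of the files processed in this pass.
sub run_once {
    my ($dir, $cache_file) = @_;
    my $cache = load_cache($cache_file);
    my @xml   = get_xml_file_names($dir);
    my @done;
    for my $name (@xml) {
        next if exists $cache->{$name};
        process_xml_file("$dir/$name");
        $cache->{$name} = 'found at ' . scalar localtime;
        push @done, $name;
    }
    clean_cache($cache, \@xml);
    nstore($cache, $cache_file);
    return @done;
}

# Usage: script.pl /path/to/xml_dir /path/to/cache_file
run_once(@ARGV) if @ARGV == 2;
```

Note that in this sketch the cache is written only once, at the end of a pass. If the script dies halfway through, the files handled in that pass will be processed again on the next run, so this variant only fits the case where processing a file twice costs nothing more than some wasted resources.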
