PerlMonks  

General perl question. Multiple servers.

by dbmathis (Scribe)
on Oct 06, 2007 at 14:04 UTC ( #643089=perlquestion )

dbmathis has asked for the wisdom of the Perl Monks concerning the following question:

Hi All,

I have a group of about 150 Linux application servers. A process runs on each of them nightly, and a SUCCESS line gets written to a log file on each server when the process completes.

Currently I have to log into each server via ssh and grep each log to see if the process completed.

Using Perl, what would be the most reliable and efficient way to automate and accomplish the same thing? Should I have a process that runs on one machine and reaches out to each server, or should I have a process running on each app server? How would either of these be accomplished?

I was thinking that there must be some kind of standard process that people follow for something like this, so I am just looking for general advice.

Best Regards

After all this is over, all that will really have mattered is how we treated each other.

Replies are listed 'Best First'.
Re: General perl question. Multiple servers.
by shmem (Chancellor) on Oct 06, 2007 at 15:01 UTC
    I'd set up a syslog server and have each process send a UDP packet to this server after process completion.

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
      A syslog server is a very good (and, IME, very underused) solution.

      Alternately, if that's not sexy enough to get management buy-in, you could instead set the processes up to all log to a central database, but that would mostly be just pointless overhead unless you're using a database already (and may still be pointless overhead even if you are).

      ++ Much better than my idea below, but there would need to be a reliable way to identify the cases where any of the 150 processes fail before they get to the point of sending their UDP packet to the log server. Not hard to handle, just easy to forget...

      update: On second thought, if the log data from each host is anything more than a single summary report printed at the end of each job, I would still kinda prefer my approach. If the jobs are printing progress reports at intervals, the entries submitted to a central syslog server will tend to be interleaved, and will need to be sorted out. Not a big deal, obviously, but it might be handier to have the stuff "pre-sorted" by harvesting from each machine.

        there would need to be a reliable way to identify the cases where any of the 150 processes fail

        A 'job started' message could be sent by a wrapper that watches the process and reports its exit status.

        the entries submitted to a central syslog server will tend to be interleaved

        syslog is configurable, and one could send the log messages to different files based on level/facility and host. Anyway, each log line is marked with the host sending it, so sorting things out is as easy as grepping the log file for a host.

        --shmem

        _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                      /\_¯/(q    /
        ----------------------------  \__(m.====·.(_("always off the crowd"))."·
        ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}

      It is interesting, the different backgrounds from which we all come. I'm predominantly used to applying solutions to assets given, while other people come from backgrounds where adding a server here or there is considered trivial.

      It's good to get both perspectives.

Re: General perl question. Multiple servers.
by snopal (Pilgrim) on Oct 06, 2007 at 14:46 UTC

    I recommend these options:

    • P.O.E. - invoke remotely with clear responses
    • Local net delivery - cron response is sent to central machine via scp/rsync
    • NFS - copy response to central common directory space
    • MTA - send responses to a common mail address
    • ... limited by your imagination and resources
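As one concrete instance of the "local net delivery" option, each node's crontab could push its log to the central machine after the nightly run. The schedule, paths, and host name `central` below are invented for illustration:

```
# crontab fragment on each app server (assumed paths/host)
# push the nightly log to the central box shortly after the job window
30 2 * * *  scp -o ConnectTimeout=10 /log/path/my_process.log central:/var/harvest/$(hostname).log
```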

Re: General perl question. Multiple servers.
by graff (Chancellor) on Oct 06, 2007 at 15:04 UTC
    Probably the easiest would be a cron job on one machine that takes a list of the machines that generate logs, harvests the log files from all of them into a daily directory, and then parses each log file in turn to do statistics and report details on "outliers".

    In effect, the perl script does mechanically what you are now doing manually (you can be replaced by a perl script ;). Of course, that assumes that the 150 machines are all doing the same thing, and their results are all stored using the same path/file.name on each machine. Either that, or else the list of machines to scan includes all the details needed to find the log file for each one.

    For doing the harvest, there's nothing inherently wrong about just running an appropriate command in back-ticks like my $log = `ssh $host cat /log/path/my_process.log` (this assumes you have public-key authentication in place, so the userid running this won't need to supply a password for each connection). If the overhead of launching a shell 150 times bothers you, you could do it like this:

    use strict;
    use POSIX;

    my $today = strftime( "%Y%m%d", localtime );
    mkdir "/path/harvest/$today";
    chdir "/path/harvest/$today" or die $!;
    my @remote_machine_list = ... ; # (fill in the ...)
    my $shell_pid = open my $shell, "|-", "/bin/sh" or die $!;
    print $shell "cd /path/harvest/$today\n";
    print $shell "ssh $_ cat /log/path/my_process.log > $_.log 2>> harvest.errlog"
               . " || echo $_ failed >> harvest.errlog\n"
        for ( @remote_machine_list );
    print $shell "exit\n";
    waitpid $shell_pid, 0;
    (updates: added check for success from chdir call, and added first print statement to chdir in the subshell.)

    You might need to add stuff to that, like setting a SIGALRM handler with alarm() in case ssh hangs on a given host. On each iteration, if the "ssh $_ ..." works, its output is stored locally in "$_.log", and any stderr output is appended to the local file "harvest.errlog". But if the ssh fails, a line about that is appended to "harvest.errlog" as well.
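A sketch of that SIGALRM guard: run a command under a timeout so one hung ssh can't stall the whole harvest. The command is passed in as a parameter here purely so the wrapper is easy to exercise; in the harvest loop it would be the `ssh $host cat ...` string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Run a shell command in backticks under a timeout.
# Returns the captured output, or undef if the command timed out.
sub run_with_timeout {
    my ($cmd, $timeout) = @_;
    my $out = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $timeout;
        my $r = `$cmd`;
        alarm 0;
        $r;
    };
    alarm 0;    # cancel any alarm still pending after an abnormal exit
    return $out;
}

# e.g.:
# my $log = run_with_timeout(
#     "ssh -o ConnectTimeout=4 $host cat /log/path/my_process.log", 30 );
```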

    But you also have a variety of CPAN modules in the Net::SSH domain that you might find preferable.

      This may work, I do have public-key authentication in place already. Thanks for the information. I will let you know what I end up doing.

      After all this is over, all that will really have mattered is how we treated each other.
Re: General perl question. Multiple servers.
by mwah (Hermit) on Oct 06, 2007 at 16:05 UTC
    dbmathis: I have a group of about 150 linux application
    servers that a process runs on nightly and then a SUCCESS gets written
    to a logfile of each of the servers when the process completes.
    Currently I have to log into each server via ssh and grep each
    log to see if the process completed.



    YMMV, but I had (and have) to deal with a similar problem
    in a "computational chemistry" environment. The number of servers
    or nodes is about one half of yours.

    What I learned from all that: "keep it dead simple" and try to get it installed out of the box (OOTB) if possible.
    My current solution:

    1. programs & logging
    - One of the (older) boxes poses as server and holds the
       node cluster in a subnet (a private one in my case)
    - The server exposes (NFS,SMB possible) its /usr/local/bin (ro-mode) and
       its /srv/cluster (rw-mode) to the subnet,
    - The nodes load their applications from the central mounted
       /usr/local/bin and write logs with date and ip
       (in filenames) into separate files in /srv/cluster

    2. job overview
    - The server has some perl scripts for job overview;
       if required, the number and respective
       IPs of running nodes are found by "nmapping" the subnet:
    ...
    # $addr is the actual subnet, e.g. "192.168.1.0"
    my $output = qx{nmap -sP ${addr}/24};
    my @nodes  = $output =~ /(?<=\s)c\w+\b/g;
    ...
    This (nmap -sP) runs very fast (at least here, from a non-root account) and can provide "real time" info on running nodes per html page, e.g.:
    ...
    print header('text/html');
    print h1('Local Network: ' . $addr . '/24');
    print map "$_ appears to be up<br />", @hosts;
    ...
    The found nodes might then be rsh'ed (if it's a private subnet, you won't be killed for using rsh/rexec).
    Pseudo:
    ...
    my ($exe, $cmd) = ('/usr/bin/rsh', 'ps -fl r -u username');
    my $cnt = 0;
    for my $node ( sort @nodes ) {
        my @res = grep !(/$cmd/ || /STIME/),
                  split /[\n\r]+/, qx{$exe $node $cmd};
        my $nproc = scalar @res;    # how many processes
        if( $nproc ) {
            print map "Do some formatting of ps -fl output here!", @res;
        }
        ...
        ++$cnt;
    }
    ...
    In the end, you'll have a browser interface to the running processes (build a nice html table in the "map" above) and a central directory full of log files, which might even be exported (smb) to windows machines for coworkers preferring the Explorer ;-)

    The only "complication" (additional work per node) would
    be "installing and enabling the nfs client".
    my €0.02

    regards

    mwah
Re: General perl question. Multiple servers.
by perlfan (Vicar) on Oct 06, 2007 at 15:27 UTC
    Pushing some sort of confirmation out to a single master would be the easiest thing to do (using ssh passwordless auth), but this would actually be a neat situation in which to implement some sort of distributed confirmation algorithm that doesn't have a single point of failure - that being the connection between your master and any of the servers you'd like to keep tabs on.

    If you have some time to think about the solution and a lot more time to code it, POE might be the answer, but in reality a simple master daemon listener solution would not take long to craft from the many Perl daemon examples out there.

    You could use a simple scheme where there is the master listener waiting for "DONE_SUCCESS" from all respective servers. On hearing "DONE_SUCCESS", the master daemon would send an ACK back to the server in question (requires a listener on that end, too). The server trying to check in would continue to send "DONE_SUCCESS" messages until it finally hears the "ACK" from the master.

    If you're interested in a solution like that, let me know and I'll go over it in more detail.
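A minimal sketch of the master side of that scheme: a listener that accepts "DONE_SUCCESS &lt;host&gt;" lines over TCP and answers "ACK". The port, wire format, and sequential accept loop are invented for illustration; a real version would daemonize, log, and handle clients concurrently.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;

# Listen on $port until $expected distinct hosts have checked in.
# Returns a sorted arrayref of the hosts that reported success.
sub run_master {
    my ($port, $expected) = @_;
    my $srv = IO::Socket::INET->new(
        LocalAddr => '127.0.0.1',
        LocalPort => $port,
        Proto     => 'tcp',
        Listen    => 10,
        ReuseAddr => 1,
    ) or die "cannot listen on port $port: $!";

    my %done;
    while ( keys %done < $expected ) {
        my $client = $srv->accept or next;
        my $line   = <$client>;
        if ( defined $line and $line =~ /^DONE_SUCCESS\s+(\S+)/ ) {
            $done{$1} = 1;
            print $client "ACK\n";   # servers retry until they see this
        }
        close $client;
    }
    return [ sort keys %done ];
}
```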
      Hi perlfan,

      The master daemon idea is somewhat attractive because I could apply this same technique to several other projects. I would be delighted to hear more details.

      Best Regards

      After all this is over, all that will really have mattered is how we treated each other.
Re: General perl question. Multiple servers.
by jethro (Monsignor) on Oct 07, 2007 at 01:49 UTC
    If you want reliable, then don't try to program the network code yourself; use established methods.

    So the syslog suggestion from shmem is the simplest, most economic solution you can find, if your applications already use syslog for logging or can be persuaded to do so. Syslog takes care of the network transmission, and all you have to do is parse the local log (the code for that would easily fit in one line).

    If that is not possible, a (perl-)script that uses ssh to poll all the servers is IMHO already a very reliable solution (use ssh parameter "-o ConnectTimeout=4", so that ssh doesn't wait so long for offline servers) and be sure to check for success of the ssh.
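A sketch of that parsing step, assuming the central log accumulates one line per host in the form "&lt;host&gt; nightly-job: SUCCESS" (the line format is an assumption): list the hosts that did not report success.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the hosts from @hosts that have no SUCCESS line in $logfile.
sub missing_hosts {
    my ($logfile, @hosts) = @_;
    open my $fh, '<', $logfile or die "cannot open $logfile: $!";
    my %ok;
    while (<$fh>) {
        $ok{$1} = 1 if /\b(\S+)\s+nightly-job:\s+SUCCESS\b/;
    }
    close $fh;
    return grep { !$ok{$_} } @hosts;
}
```

Run daily from cron against the current log and mail the result if the list is non-empty.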
Re: General perl question. Multiple servers.
by casiano (Pilgrim) on Oct 07, 2007 at 12:12 UTC
    Probably the module GRID::Machine can help if you decide to automate what you are now doing by hand:

    "I have to log into each server via ssh and grep each log to see if the process completed"

Re: General perl question. Multiple servers.
by mattr (Curate) on Oct 09, 2007 at 04:56 UTC
    Syslog seems to be a good idea. It doesn't mean setting up a new hardware box, just a daemon. Both the following links note that you want to set your clocks with NTP if you haven't yet. Note that UDP, mentioned above, does not guarantee delivery, especially if your servers are distant.

    oreilly.com syslog.org

Node Type: perlquestion [id://643089]
Approved by Corion
Front-paged by Corion