
Quick and dirty http server connection status checker

by superfrink (Curate)
on Aug 30, 2012 at 16:18 UTC ( [id://990793]=CUFP )

Nodes in a pool of load-balanced web servers were taking turns going down. They were taking too long to respond to TCP connections.

I didn't have any graphing or monitoring in place, so I quickly wrote something to tell me when a server was taking too long to respond. Then I could log in and debug.

This program prints a warning when a connection does not complete within 1 second (the value of $TIMEOUT_SECONDS). I also added a USR1 signal handler that dumps a summary of the connection statistics.

Update: The code is on GitHub.

Sample output:
$ ./sas-monitoring
./sas-monitoring is starting as pid 27925. From another window you may run:
 kill -USR1 27925
.................................................................................................................................................Could not create socket to 10.3.7.107 : Connection timed out
-- 8< -- output removed here -- 8< --

       10.3.7.62: failures: 103 successes: 1092
      10.3.7.109: failures: 79 successes: 1116
       10.3.7.78: failures: 78 successes: 1117
      10.3.7.105: failures: 29 successes: 1166
      10.3.7.108: failures: 28 successes: 1167
      10.3.7.106: failures: 6 successes: 1189
       10.3.7.79: failures: 3 successes: 1192
       10.3.7.63: failures: 2 successes: 1193
       10.3.7.60: failures: 1 successes: 1194
      10.3.7.107: failures: 1 successes: 1194
       10.3.7.61: failures: 0 successes: 1195

The code:
#!/usr/bin/perl -w
#
# file: sas-monitoring
# purpose: monitor sas web servers

use strict;
use Data::Dumper;
use IO::Socket;

my @WEB_SERVERS = qw(
    10.3.7.105
    10.3.7.106
    10.3.7.107
    10.3.7.108
    10.3.7.109
    10.3.7.62
    10.3.7.63
    10.3.7.78
    10.3.7.79
    10.3.7.60
    10.3.7.61
);
my $TIMEOUT_SECONDS = 1;
my $SERVER_PORT = 80;

# -- main --

my %connection_stats;   # for tracking connection success/failure counts
for (@WEB_SERVERS) {
    $connection_stats{$_} = {};
    $connection_stats{$_}{success} = 0;
    $connection_stats{$_}{failure} = 0;
}

# GOAL : give the user a way to read connect stats via the USR1 signal.
# DOC : this signal will interrupt the socket connect call and count as a
#       failure
$SIG{'USR1'} = sub {
    my @ip_list = keys %connection_stats;
    @ip_list = sort {
        $connection_stats{$b}{failure} <=> $connection_stats{$a}{failure}
    } @ip_list;
    print "\n";
    for my $i (@ip_list) {
        print sprintf("%16s", $i),
            ": failures: ", $connection_stats{$i}{failure},
            " successes: ", $connection_stats{$i}{success},
            "\n";
    }
};

print $0 ," is starting as pid ", $$ ,
    ". From another window you may run:\n kill -USR1 ", $$, "\n";

while (1) {

    # GOAL : try to connect to the server
    SERVER:
    for my $server_ip (@WEB_SERVERS) {
        my $sock = IO::Socket::INET->new(
            PeerAddr => $server_ip,
            PeerPort => $SERVER_PORT,
            Proto    => 'tcp',
            Timeout  => $TIMEOUT_SECONDS,
        );
        unless ($sock) {
            warn "Could not create socket to $server_ip : $!\n";
            $connection_stats{$server_ip}{failure} ++;
            next SERVER;
        }

        $connection_stats{$server_ip}{success} ++;
        close($sock);
        syswrite(STDOUT, '.', 1);
    }

    # wait a second then check again
    #syswrite(STDOUT, "\n", 1);
    sleep(1);
}
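
The USR1 handler above sorts and prints the counters inline. As a rough sketch only, the same report could be pulled out into a standalone function so it can be checked without running the monitor. The name format_stats and the failure-rate column are my own assumptions and are not part of the original script; the sort order (most failures first) and the %16s column width follow the handler above.

```perl
use strict;
use warnings;

# format_stats is a hypothetical helper, not part of the original script.
# It takes a hashref shaped like %connection_stats and returns report
# lines, highest failure count first (ties ordered by IP string).
sub format_stats {
    my ($stats) = @_;
    my @ips = sort {
        $stats->{$b}{failure} <=> $stats->{$a}{failure}
            or $a cmp $b
    } keys %$stats;

    my @lines;
    for my $ip (@ips) {
        my ($fail, $ok) = @{ $stats->{$ip} }{qw(failure success)};
        my $total = $fail + $ok;
        # The failure rate is an added column; guard against dividing by
        # zero when no attempts have been recorded yet.
        my $rate = $total ? 100 * $fail / $total : 0;
        push @lines, sprintf(
            "%16s: failures: %d successes: %d (%.1f%% failed)",
            $ip, $fail, $ok, $rate);
    }
    return @lines;
}

# Example using two rows taken from the sample output above.
my %sample = (
    '10.3.7.61' => { failure => 0,   success => 1195 },
    '10.3.7.62' => { failure => 103, success => 1092 },
);
print "$_\n" for format_stats(\%sample);
```

With something like this in place, the handler body could shrink to a single print of format_stats(\%connection_stats). Note that this does not change the behaviour described in the DOC comment: the signal can still interrupt a connect that is in progress.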

Node Type: CUFP [id://990793]
Approved by ww
Front-paged by Tanktalus