http://qs321.pair.com?node_id=252591

Notromda has asked for the wisdom of the Perl Monks concerning the following question:

I'm getting really frustrated with spammers who are pounding my mail servers from multiple IP addresses and trying to find valid email addresses by brute force. I'd like a program that can detect an abnormally high number of errors from the same IP address and fire off firewall rules to reject connections from the offending addresses for a while. The logfile analysis and firewall parts may not be on the same computer.

Are there any known solutions to this problem? If not, here's what I'm thinking about doing in Perl...

I already have a program that can read the logfile as it is being written: syslog-ng pipes the logs to the Perl program. This is a really good example of where regular expressions shine. :)

I need to be able to calculate statistics about error rates for a multitude of IP addresses. RRDtool can make interesting graphs, but I think this would only be useful as an aggregate measure, correct? I think it would be unwieldy to store RRD data for every IP address I encounter.

So my first question is: what kind of data structure should I use to store the collected data so that I can check error rates per IP address? Right now I'm thinking that I should ban an IP address for half an hour, and that repeat offenders need to be escalated to a supervisor for more drastic measures. That means keeping all that data around for a while.
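To make the question concrete, here is a minimal sketch of one structure I could imagine: a hash keyed by IP address whose values are lists of recent error timestamps, trimmed to a sliding window. The names (%errors, record_error, prune) and the thresholds ($WINDOW, $MAX_ERRORS) are placeholders I made up for illustration, not tested values.

```perl
use strict;
use warnings;

# Hypothetical thresholds; these would need tuning against real traffic.
my $WINDOW     = 300;   # seconds of history kept per IP
my $MAX_ERRORS = 20;    # errors inside the window that count as abuse

my %errors;             # IP address => array ref of error times (epoch seconds)

# Record one error for $ip at time $now.
# Returns true if the IP has reached the limit within the window.
sub record_error {
    my ($ip, $now) = @_;
    my $times = $errors{$ip} ||= [];
    push @$times, $now;
    # Drop timestamps that have fallen out of the window.
    shift @$times while @$times && $times->[0] <= $now - $WINDOW;
    return @$times >= $MAX_ERRORS;
}

# Forget IPs with no recent errors so the hash does not grow without bound.
sub prune {
    my ($now) = @_;
    for my $ip (keys %errors) {
        my $times = $errors{$ip};
        shift @$times while @$times && $times->[0] <= $now - $WINDOW;
        delete $errors{$ip} unless @$times;
    }
}
```

The log reader would call record_error($ip, time) for every error line it matches, and prune(time) every minute or so. Memory use is bounded by the number of distinct IPs seen within one window, but everything is lost on restart, which is part of why I'm wondering about a database instead.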

I have a MySQL database at my disposal, and I'm developing on a Red Hat 8 platform.

One way that I see to do this is to store all errors in MySQL with a timestamp, IP address, type of error, and any other information. Then on the box running the firewall (actually, iptables on the mail server), a daemon would query the database occasionally, calculate stats, and apply the firewall rules. It could also clean out old rows from the database after a specified period of time.
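The firewall side of that daemon might look roughly like the sketch below. It only does the bookkeeping for half-hour bans and builds iptables argument lists; it does not touch the database or run anything. The sub names, the $ESCALATE_AT threshold, and the choice to reject only port 25 are my own assumptions, and the address check is deliberately crude (IPv4 dotted quads only).

```perl
use strict;
use warnings;

my $BAN_SECONDS = 30 * 60;  # half-hour ban
my $ESCALATE_AT = 3;        # hypothetical: bans before a human gets involved

my %banned_until;           # IP => epoch time at which the ban expires
my %ban_count;              # IP => number of bans issued so far

# Build an iptables argument list; $action is '-I' (insert) or '-D' (delete).
sub rule {
    my ($action, $ip) = @_;
    die "bad IP address: $ip\n" unless $ip =~ /\A\d{1,3}(?:\.\d{1,3}){3}\z/;
    return ['iptables', $action, 'INPUT', '-s', $ip,
            '-p', 'tcp', '--dport', '25', '-j', 'REJECT'];
}

# Start a ban. Returns the command to run, or nothing if already banned.
sub ban {
    my ($ip, $now) = @_;
    return () if exists $banned_until{$ip};
    $banned_until{$ip} = $now + $BAN_SECONDS;
    $ban_count{$ip}++;
    return rule('-I', $ip);
}

# True once an IP has been banned often enough to need a supervisor.
sub needs_escalation {
    my ($ip) = @_;
    return ($ban_count{$ip} || 0) >= $ESCALATE_AT;
}

# Lift every ban that has run out. Returns the commands to run.
sub expire_bans {
    my ($now) = @_;
    my @commands;
    for my $ip (sort keys %banned_until) {
        next if $banned_until{$ip} > $now;
        delete $banned_until{$ip};
        push @commands, rule('-D', $ip);
    }
    return @commands;
}
```

Each returned array ref could be handed to system(@$cmd) in list form, which avoids passing the address through a shell. Whether the ban state should live in memory like this or in the database alongside the errors is exactly the part I'm unsure about.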

Are there any other solutions that I'm missing? Is there a more general solution to this problem? If there is, I'll try to give whatever I come up with back to the community... :)