PerlMonks  

Re: PMiltering fun

by rhesa (Vicar)
on Jun 20, 2008 at 13:19 UTC


in reply to PMiltering fun

Wouldn't this be far easier to accomplish with Postfix's mydestination setting? That's how you control which domains you accept mail for.

It's also a good idea to have a local caching resolver running anyway, especially if you have multiple machines on your local network. I use BIND, and have it forward requests it can't answer itself to my ISP's nameservers. The answers are then cached locally.

forwarders { 1.2.3.4; 2.3.4.5; };

I can also heartily recommend implementing DNS blacklists in your SMTP daemon. I'm very, very happy with zen.spamhaus.org, which drops about 75% of incoming spam. I also use some of the rfc-ignorant.org blacklists, but haven't really seen much benefit from them.
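For anyone curious what a DNS blacklist lookup actually involves, here is a minimal sketch in Perl. The function names are mine, not from any module. It assumes IPv4 only and treats any A record as a listing, which is a simplification: real lists encode meaning in the returned 127.x.y.z address, so a production check should inspect it.

```perl
use strict;
use warnings;

# Build the name to query: reverse the octets and append the DNSBL zone.
# 192.0.2.99 checked against zen.spamhaus.org becomes
# 99.2.0.192.zen.spamhaus.org
sub dnsbl_query_name {
    my ($ip, $zone) = @_;
    my @octets = $ip =~ /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\z/
        or return;
    return if grep { $_ > 255 } @octets;
    return join '.', reverse(@octets), $zone;
}

# True if the name resolves (assumption: any A record means "listed").
# Unlisted addresses normally come back as NXDOMAIN.
sub is_listed {
    my ($ip, $zone) = @_;
    my $name = dnsbl_query_name($ip, $zone) or return 0;
    my $addr = gethostbyname($name);
    return defined $addr ? 1 : 0;
}
```

In practice you rarely need to write this yourself: Postfix can do the lookup natively with a reject_rbl_client restriction naming the zone.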

Top it all off with bogofilter or another (Bayesian) spam filter, and email life is good again. I see no spam in my inbox, and only have about 5 to 10 emails per day that the spam filter couldn't classify. I can live with those numbers!

Replies are listed 'Best First'.
Re^2: PMiltering fun
by Tanktalus (Canon) on Jun 20, 2008 at 14:30 UTC

    The mydestination setting is fine if your destination list doesn't change. But, IMO, it suffers from the data-duplicated-multiple-times syndrome. I already have this information in my DNS; duplicating it somewhere else seems like a huge waste of a scarce resource (namely, my ability to remember to update it should I change my network topology).

    I plan on inserting a spam filter, too, but last time I tried, email crawled to a halt because my poor machine couldn't keep up with it. This is kind of the first step in reclaiming that: by eliminating over 90% of the spam based on bad domain names, I will only need to check 10%. Even that will likely bring my P3-550 to a crawling halt, so I'm going to have to set up a distributed spam check (spamd running on another machine) somehow.

    Running a caching bind server on a small machine vs caching my own lookups... hmm... ;-) I suspect that for this machine, it's cheaper in both CPU and RAM to cache inside my milter.
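A rough idea of what caching lookups inside the milter could look like, as a sketch only. The milter code itself isn't shown in this thread, so the names here (domain_ok_cached, the $lookup callback, the one-hour $TTL) are all assumptions for illustration. The real DNS check is passed in as a code ref, so the cache stays independent of whichever resolver module is used.

```perl
use strict;
use warnings;

my %cache;        # domain => { ok => 0|1, expires => epoch seconds }
my $TTL = 3600;   # assumed lifetime of a cached answer, in seconds

# $lookup is a code ref that does the real DNS check for a domain and
# returns true or false. $now is optional and defaults to time().
sub domain_ok_cached {
    my ($domain, $lookup, $now) = @_;
    $now = time unless defined $now;
    $domain = lc $domain;                 # DNS names are case-insensitive

    my $hit = $cache{$domain};
    return $hit->{ok} if $hit && $hit->{expires} > $now;

    # Cache misses and expired entries fall through to a real lookup.
    # Negative answers are cached too, which is where most of the
    # savings come from when the same bad domains keep showing up.
    my $ok = $lookup->($domain) ? 1 : 0;
    $cache{$domain} = { ok => $ok, expires => $now + $TTL };
    return $ok;
}
```

A fixed TTL ignores whatever TTL the DNS records themselves carry, so a caching resolver will be more correct; this only trades accuracy for a smaller footprint.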

    As for an RBL, I hadn't really thought of trying one until now. So thanks :-) (It makes me even more glad I posted this - I never would have imagined such a useful response, but I got it anyway.)

      There are settings for Postfix to only accept mail for domains for which it is the MX record. That would solve that problem. The mydestination setting isn't duplicated data, though, because I can easily set up a non-public email domain for testing purposes. There are provisions in RFC 2821 for delivering to a machine with an A record with no MX record, too.
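To make the MX versus A record rule concrete, here is a small sketch of how a sender picks delivery targets under RFC 2821: MX hosts in order of preference (lowest number first), and only if there are no MX records at all, the domain's own address record. The function name and argument shapes are made up for the example; a real implementation would get this data from a resolver.

```perl
use strict;
use warnings;

# $mx is an array ref of [preference, hostname] pairs.
# $has_a says whether the domain itself has an address record.
# Returns the hostnames to try, in order (empty list if undeliverable).
sub delivery_targets {
    my ($domain, $mx, $has_a) = @_;
    if (@$mx) {
        # MX records exist: use them, lowest preference value first.
        return map { $_->[1] } sort { $a->[0] <=> $b->[0] } @$mx;
    }
    # No MX at all: the domain is treated as its own implicit MX.
    return $has_a ? ($domain) : ();
}
```

For example, a domain with MX records 20 backup.example.org and 10 mail.example.org yields mail.example.org first, then backup.example.org.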

      If you really want robust spam filtering in Perl, you could install amavisd-new as your MX-receiving SMTP server and forward mail that passes to Postfix. I recommend having a spam address and a ham address that amavis uses for Bayesian learning. Configure that anything coming from your Postfix outbound SMTP server to Amavis at those addresses gets processed accordingly, and then training your Bayesian filter is as simple as forwarding mail.

      The most successful anti-spam technique I've ever found, though, is to keep track of the number of invalid recipients from particular blocks of addresses, typically /24 blocks. You can measure the percentage of overall "RCPT TO" requests that fail, or set a threshold of failed recipients per hour/day. Then, you can reject mail at the SMTP level from those blocks or, like I did, reject or drop packets from those blocks with iptables or ipfilter on your MX server. The configuration for either Postfix or iptables is easy to wrap in Perl. (So are amavis, shorewall, and more, of course.) Be sure to have a list of exceptions, though, because you might not want to cut yourself off from AOL, Yahoo, and other public email sites (I couldn't, using this for a commercial ISP). AOL publishes a list of all the ranges their outgoing email servers use, though, so they're pretty easy to exempt.
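The bookkeeping for this can be sketched in a few lines of Perl. Everything specific here is an assumption for illustration: the function names, the thresholds (at least 20 attempts, 50% failures), and the placeholder exception list. It covers the percentage variant only and keeps counts in memory; a per-hour threshold would need timestamps and expiry on top.

```perl
use strict;
use warnings;

my %stats;                # "a.b.c" (/24 prefix) => { total => n, failed => n }
my $MIN_ATTEMPTS = 20;    # assumed: don't judge a block on too few samples
my $MAX_FAIL_PCT = 50;    # assumed: failure percentage that triggers rejection
my %exempt = map { $_ => 1 } qw(192.0.2);   # placeholder exception list

# Reduce an IPv4 address to its /24 prefix, or undef if it doesn't parse.
sub block_of {
    my ($ip) = @_;
    my ($prefix) = $ip =~ /^(\d{1,3}\.\d{1,3}\.\d{1,3})\.\d{1,3}\z/;
    return $prefix;
}

# Call once per RCPT TO, with $valid true if the recipient exists.
sub record_rcpt {
    my ($ip, $valid) = @_;
    my $block = block_of($ip);
    return unless defined $block;
    $stats{$block}{total}++;
    $stats{$block}{failed}++ unless $valid;
    return;
}

# True if the client's /24 has crossed the failure threshold.
sub should_reject {
    my ($ip) = @_;
    my $block = block_of($ip);
    return 0 if !defined $block || $exempt{$block};
    my $s = $stats{$block} or return 0;
    return 0 if $s->{total} < $MIN_ATTEMPTS;
    my $failed = $s->{failed} || 0;
    return 100 * $failed / $s->{total} >= $MAX_FAIL_PCT ? 1 : 0;
}
```

The output of should_reject could feed either an SMTP-level rejection or a generated firewall rule; the exception check runs first so exempted blocks are never rejected, whatever their counts.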

      Dropping at the packet level does break a few RFCs. The one I can recall at the moment is the section of RFC 2821 requiring that each domain and host that accepts or routes mail have a reachable postmaster address despite any filtering (which almost nobody follows anyway, since sending to "postmaster" then just becomes an easy way to spam). The more accepted way to do it, though, is to return a 554 policy error with text like "Your network block has been spamming this server."
