Once you've sorted out what is spam and what isn't, and you have two new mboxes, you'd be better off running sa-learn from the SpamAssassin distribution over them. You'll see that your spamassassin will have far fewer false positives.
From the sa-learn man page:
NAME
sa-learn - train SpamAssassin's Bayesian classifier
SYNOPSIS
sa-learn [options] [file]...
[...]
Options:
--ham Learn messages as ham (non-spam)
--spam Learn messages as spam
[...]
--mbox Input sources are in mbox format
--showdots Show progress using dots
--no-rebuild Skip building databases after scan
[...]
DESCRIPTION
Given a typical selection of your incoming mail classified
as spam or ham (non-spam), this tool will feed each mail
to SpamAssassin, allowing it to 'learn' what signs are
likely to mean spam, and which are likely to mean ham.
Simply run this command once for each of your mail
folders, and it will ''learn'' from the mail therein.
[...]
SpamAssassin remembers which mail messages it's learnt
already, and will not re-learn those messages again,
unless you use the --forget option. Messages learnt as
spam will have SpamAssassin markup removed, on the fly.
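Putting the options quoted above together, here is a minimal sketch of how the two mboxes might be fed to sa-learn. The helper name train_mbox, the SA_LEARN override variable and the file names ham.mbox and spam.mbox are my own placeholders, not anything from the man page; only the --ham, --spam, --mbox and --showdots flags come from the excerpt.

```shell
# Hypothetical helper: train SpamAssassin's Bayesian classifier on one mbox.
#   $1  --ham or --spam (how the messages should be learnt)
#   $2  path to the mbox file
# SA_LEARN is an assumed override so a different binary path can be used;
# it falls back to plain "sa-learn" found on the PATH.
train_mbox() {
    "${SA_LEARN:-sa-learn}" "$1" --mbox --showdots "$2"
}
```

You would then call it once per folder, for example train_mbox --ham ham.mbox followed by train_mbox --spam spam.mbox, substituting the names of your own two mboxes.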
Ciao! --bronto
The very nature of Perl to be like natural language--inconsistent and full of dwim and special cases--makes it impossible to know it all without simply memorizing the documentation (which is not complete or totally correct anyway).
--John M. Dlugosz