Bogofilter Spam Filter Retraining
I tried the technique called training to exhaustion, and it worked great! As far as I can tell, you run a script (included in the bogofilter contrib directory), passing it a set of known non-spam messages and a set of known spam messages. It then runs each message against its wordlists and checks whether the message comes out as spam or non-spam. If the answer is correct, it moves on to the next message. If not, it trains on that message (or re-checks the wordlists, or something along those lines) and keeps going until every message is classified properly. I ran this from my home directory (right out of the docs):
bogominitrain.pl -fn .bogofilter mail/notspam mail/spam '-o 0.8,0.2'
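To make the idea concrete, here is a minimal Python sketch of a train-on-error loop repeated until a full pass produces no mistakes, which is my understanding of what training to exhaustion means. It is not the code in bogominitrain.pl: the ToyFilter class, its word-count scoring, the train_to_exhaustion function, and the sample messages are all made up for illustration, and bogofilter's real scoring is different.

```python
import math
from collections import Counter

# Hypothetical sample corpus: (message text, correct label).
MESSAGES = [
    ("cheap pills buy now", "spam"),
    ("meeting notes attached", "ham"),
    ("buy cheap watches", "spam"),
    ("lunch meeting tomorrow", "ham"),
    ("buy milk on the way home", "ham"),
]


class ToyFilter:
    """Toy stand-in for a wordlist-based filter (not bogofilter's algorithm)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}

    def train(self, text, label):
        # Register every word of the message under the given label.
        self.counts[label].update(text.lower().split())

    def classify(self, text):
        # Sum per-word log ratios of smoothed spam/ham counts.
        score = 0.0
        for word in text.lower().split():
            spam_hits = self.counts["spam"][word] + 1
            ham_hits = self.counts["ham"][word] + 1
            score += math.log(spam_hits / ham_hits)
        return "spam" if score > 0 else "ham"


def train_to_exhaustion(filt, messages, max_passes=50):
    """Train only on misclassified messages; repeat until a clean pass.

    Returns the number of errors seen in each pass.
    """
    history = []
    for _ in range(max_passes):
        errors = 0
        for text, label in messages:
            if filt.classify(text) != label:
                filt.train(text, label)  # train on error only
                errors += 1
        history.append(errors)
        if errors == 0:
            break
    return history


filt = ToyFilter()
history = train_to_exhaustion(filt, MESSAGES)
print("errors per pass:", history)
```

The per-pass error counts play roughly the same role as the "NN false positives, NN false negatives" lines the real script prints: they should shrink until a pass comes back clean. The max_passes cap is there because nothing guarantees a toy scorer like this converges on arbitrary mail.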
I was lucky enough to have some 34,000 messages available to work from, and after a fair amount of time and numerous "NN false positives, NN false negatives" messages, it quit. Since then (noonish Thursday) I've had one message slip through, which, after seeing 3-4 an hour, is pretty damn good.
So if you use bogofilter, I suggest checking this out, in conjunction with the 0.17.5 release.