Advanced Anti-Spam Techniques for MacOS X
This document is based on the fetchmail and procmail configuration described
in "Free Spam Filtering for MacOS X: How To Do It." If you haven't read that
document, go back and read it now. These instructions are based on that specific
MacOS X configuration, but should be useful in any general Unix procmail environment.
Killing Persistent Spammers with Procmail
Most of your spam will come from a few dedicated spammers, and you can cut
the majority of your incoming spam by filtering those spammers directly. Sometimes
one particular spammer continually sends you junk mail, and anything
sent from that address or domain will always be spam. With procmail,
it is easy to send these messages straight to /dev/null, the Unix equivalent
of the trash can. You can insert a short procmail script into your
~/.procmailrc file, right before the SpamBouncer scripts are called, that kills
mail from specific addresses. The mail is discarded before SpamBouncer sees it,
which saves SpamBouncer's processing time. You will have the ultimate level
of control over your incoming mailbox.
A perfect example happened this week. Some idiot spammer sent me a 2MB spam
with a PowerPoint file attached. Procmail took a long time to run the
SpamBouncer script, and meanwhile my computer (a relatively slow G3/400)
ground to a halt. And then they sent it again. And again. I put them
in the filters, and blocked 5 more incoming 2MB spams. Now you can see why
I recommend you check your procmail logs, to keep an eye out for this sort
of stupid spam stunt. It was easy to look at these emails and see they were
all sent from the same address; some spammers make it easy for you to filter
them. Here's a sample script that filters email by the address in the From: field.
VIRUSFOLDER=/Users/username/procmail/spam

:0
* ^From.*(spammer@spam.com)
/dev/null

INCLUDERC=/Users/username/procmail/spambouncer/sb.rc
This example script is shown in context, to show how it is inserted in ~/.procmailrc
right before the final INCLUDERC line that calls the SpamBouncer filter. Remember
that your ~/.procmailrc will have your MacOS X username in place of username.
We'll omit the context from our next examples, and just focus on the three
lines that do all the work.
The script looks at the From field of incoming emails, and if it finds a match,
the mail is immediately sent to /dev/null where it is erased instantly. You
can also put domain names into the parentheses, even a long list of names.
Separate the entries with the vertical bar (|), which means "or" in a procmail
pattern. You can put an almost unlimited number of addresses on the same line.
:0
* ^From.*(spammer@spam.com|junkmail.com|spambag.com)
/dev/null
You can see this script blocks the whole domains junkmail.com and spambag.com.
I want you to stop and think about that for a second. Every single email from
anywhere inside those domains will be deleted instantly. If you put in a name
that is very broad, like hotmail.com or yahoo.com, you will never receive
any mail from those domains. So you better be darn sure you want to do this.
I try not to block huge domains, since SpamBouncer tends to catch spams from free
emailers. But for small domains that do nothing but spam you persistently,
put them in the filters.
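If you are not yet sure about a domain, one cautious option is to file its mail in a holding mailbox for a while instead of deleting it. The sketch below assumes a hypothetical mailbox file named quarantine in the same procmail folder used earlier; that name is an illustration, not part of the original setup. It also escapes each dot with a backslash, because an unescaped dot in a procmail pattern matches any character.

```procmail
# Hypothetical holding mailbox; the file name "quarantine" is an assumption.
QUARANTINE=/Users/username/procmail/quarantine

# The trailing colon in ":0:" asks procmail to lock the mailbox while writing.
:0:
* ^From.*(junkmail\.com|spambag\.com)
$QUARANTINE
```

Once you have looked through the quarantine mailbox and found nothing but spam, you can change the last line to /dev/null and the first line back to :0, since /dev/null needs no lock.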
Some spammers are more clever: they use faked From: addresses, or use dozens
of From addresses but send from the same domain. These take a little more
effort to block. You will have to learn to read email headers. Fortunately,
SpamBouncer tags each rejected email with the characteristics of that spam,
and sometimes it will tell you where to look. Here's a good example email header
from my own spam dump. I removed my real email address, to make it a little
harder for spammers to harvest my address from this page.
From ceicher Sun Jul 14 16:55:58 2002
Return-Path: <perf-errors.3565.65683.5914160.501.0.4@boing.topica.com>
Delivered-To: [removed]
Received: from soli.inav.net [64.6.64.4]
    by localhost with POP3 (fetchmail-5.9.0)
    for ceicher@localhost (single-drop); Sun, 14 Jul 2002 16:55:58 -0500 (CDT)
Received: (qmail 750 invoked by uid 0); 14 Jul 2002 16:50:48 -0500
Received: from out012.tfmb.net (HELO outmta020.topica.com) (66.180.247.32)
    by soli.inav.net with SMTP; 14 Jul 2002 16:50:48 -0500
To: [removed]
From: ContentWatch <emailrewardz@emailrewardz.email-publisher.com>
Subject: Advisory: Hidden file danger
Date: Sun, 14 Jul 2002 14:50:47 -0700
Message-ID: <65683.3565.1769412112-1463747838-1026683447@topica.com>
Errors-To: <perf-errors.3565.65683.5914160.501.0.4@boing.topica.com>
Reply-To: perf-remove.3565.65683.5914160.0.0.4@boing.topica.com
X-Topica-Id: <1026681038.svc001.8316.1000119>
Mime-Version: 1.0
Status: U
X-UIDL: 1026683448.753.soli.inav.net
Content-Type: multipart/alternative;
    boundary="TEP-1545058628.1463793150.1026680402"
X-SpamBouncer: 1.5 (6/13/02)
X-SBRule: Pattern Match (Disclaimer) (Score: 9656)
X-SBRule: Pattern Match (Web Hosting) (Score: 800)
X-SBRule: Pattern Match (Haven Domain) (Score: 0)
X-SBRule: topica.com mailing list
X-SBClass: Blocked
You can see that this mail appears to come from emailrewardz@emailrewardz.email-publisher.com,
but it really doesn't. That name alone should alert you that this is a persistent,
devious spammer. But notice that the Received: line includes the text (HELO
outmta020.topica.com). This is the true source of the spam. The X-SBRule
headers added by SpamBouncer confirm that this email was sent from Topica.com,
a "haven domain" that exists solely to spam. There are other ways to identify
the true sender by reading the headers; you might want to read a FAQ or HowTo
on this topic. Here is a script to block this source:
:0
* (topica.com)
/dev/null
Note that unlike the previous scripts, which search only the From lines, this
script matches anywhere in the message headers. (By default procmail checks
conditions against the headers only; it takes the B flag to search the body.)
This is a little bit risky, since a legitimate email with topica.com anywhere
in its headers, for example a Subject line reading "I sure get a lot of spam
from topica.com", would be deleted too. So be careful. It would be useful to
read some of the fine procmail tutorials to learn how to refine these scripts.
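One way to narrow this recipe is to match only the headers that identified the sender in the sample message above. The following is a sketch under that assumption, not a tested configuration: it checks just the Received: and X-Topica-Id: headers, both taken from the example, so a Subject line that merely mentions the domain passes through.

```procmail
# Match topica.com only where the sample message showed it:
# in a Received: line or in the X-Topica-Id: header.
:0
* ^Received:.*topica\.com
/dev/null

:0
* ^X-Topica-Id:
/dev/null
```

The trade-off is that a spammer who changes relays will slip past the narrower pattern, so it is worth checking the headers of new spam from time to time.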
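The same approach works on other header text. This script discards mail whose
headers name a Chinese (big5, gb2312) or Korean (euc-kr, ks_c_5601-1987)
character set:
:0
* (big5|gb2312|euc-kr|ks_c_5601-1987)
/dev/null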
Now that we have a couple of different scripts that serve different purposes,
we can put several of these short 3-line procmail scripts together in a row.
As long as you insert them just before the final line where sb.rc is called,
the scripts will all run in sequence before SpamBouncer, and each script can
discard spam without further processing.
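Put together, the end of ~/.procmailrc might look like the following sketch. The recipes are the ones shown in this document; the comments and the order of the three recipes are only a suggestion.

```procmail
VIRUSFOLDER=/Users/username/procmail/spam

# 1. Known spammer addresses and small spam-only domains
:0
* ^From.*(spammer@spam.com|junkmail.com|spambag.com)
/dev/null

# 2. Mail sent through a haven domain
:0
* (topica.com)
/dev/null

# 3. Character sets you cannot read
:0
* (big5|gb2312|euc-kr|ks_c_5601-1987)
/dev/null

# SpamBouncer runs last, on whatever survived the recipes above
INCLUDERC=/Users/username/procmail/spambouncer/sb.rc
```

Procmail stops processing a message as soon as one recipe delivers it, so mail caught by the first recipe never reaches the later ones or SpamBouncer.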
There is one problem with this strategy: bloat. I've been using this technique
for only a few months and already I have over 100 domains blocked. Over time,
this file will grow larger and larger, and take longer and longer to process.
Also, spammers tend to throw away old domain names and use new ones, so you
never know how many entries in your filter are obsolete and useless. So don't
throw every spammer's address into the script, just the ones that
send you the most mail.
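One possible way to keep the list manageable is to move it out of ~/.procmailrc into a plain text file and let grep do the matching. This is a sketch of a common procmail idiom rather than part of the original setup: the file name blocklist.txt is hypothetical, and it assumes the formail utility that ships with procmail is on your path.

```procmail
# Hypothetical list file: one address or domain per line.
# It must not contain blank lines, because an empty pattern
# would match every message.
BLOCKLIST=/Users/username/procmail/blocklist.txt

# "?" makes the condition succeed when the command exits with status 0.
# formail -x From: prints the From: header; grep -F -i -q -f looks for
# any listed string in it, ignoring case, and prints nothing.
:0
* ? formail -x From: | grep -F -i -q -f $BLOCKLIST
/dev/null
```

This does not make filtering faster, since procmail starts formail and grep for every message, but pruning obsolete entries becomes a matter of deleting lines from a text file. Unlike the ^From recipes, it looks only at the From: header, not at the envelope "From " line.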
The good news is that this strategy is pretty effective. I used to get about
50 to 60 spams in my block folder each day. After adding new names to my filters
for a few weeks, spam is down to about 1 every few days. In fact, my filters
work so well, I had to turn them off to collect even a single sample spam
so I could write this documentation! I hope this technique works as well for
you as it does for me. This is a classic case study in adapting Open Source
software to the Mac environment. But I suspect this is the end of the road
for this type of spam filtering: Apple has announced that their next MacOS
X release will upgrade Mail.app to include spam filters. These procmail techniques
may be obsolete in a matter of weeks. Or they may continue to be valuable.
Perhaps Apple is even using SpamBouncer. We will know soon.