Free Spam Filtering for MacOS X: How To Do It - Version 1.5
Update August 28, 2002: This document is obsolete; it does not work
with MacOS X 10.2. Apple has removed support for Unix mailboxes from Mail.app.
Update July 14, 2002: SpamBouncer was updated to v1.5 on July 5, 2002.
This updated document reflects the changes to the setup instructions and
includes many improvements and suggestions. If you want the old, obsolete version
of this document, it is available here.
I have also released a new document, Advanced
Anti-Spamming Techniques for MacOS X. Please visit my Home
Page for many other interesting MacOS X experiments, including
live QuickTime video streaming.
Update July 29, 2002: I made some minor but important corrections and
simplified ~/.fetchmailrc.
1. Create Your Unix Mailbox. Open a Terminal window and type the following commands. In all these code examples, wherever you see the placeholder username you should substitute your MacOS X account name.
sudo touch /var/mail/username
sudo chmod 0700 /var/mail/username
sudo chown username /var/mail/username
Enter the Administrator password when prompted.
2. Configure procmail. First, create a procmail directory in your home directory. This is where the filter logs and spam scripts will reside, and where filtered spam will be dumped. Let's set up the spam-dump folders too. Run these commands in the Terminal:
mkdir ~/procmail
mkdir ~/procmail/spam
mkdir ~/procmail/bulk
mkdir ~/procmail/block
Create a .procmailrc file in your home directory. I used BBEdit, but you could use Pico or any other text editor, as long as you save in Unix text file format, which ends each line with a line feed (LF) character. Note that this is ".procmailrc" with a period in front, so the file will be invisible in the Finder once you've created it. I like BBEdit's "Open Hidden" feature to edit dotfiles like this. Copy this text into the file ~/.procmailrc:
SHELL=/bin/sh
DEFAULT=/var/mail/username
LOGFILE=/Users/username/procmail/procmail.log

:0:
$DEFAULT
We really should put the logfile in a more standard Unix location like /var/log/procmail.log but root owns that folder and we're trying to keep this simple. If you keep it in your home directory, you can easily monitor it, but it won't be automatically cleaned out and rotated with the rest of the system logfiles. So remember to delete the logfile once in a while or it might grow rather large over time. If you delete the file, procmail will start a fresh new log.
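Since the log in your home directory is never rotated for you, here is a minimal sketch of doing it by hand. The function name rotate_log and the 1 MB limit are my own illustrative choices, not anything procmail provides; it simply renames the log once it passes a size limit and relies on procmail starting a fresh one.

```shell
#!/bin/sh
# Sketch only: rotate_log is a hypothetical helper, not part of procmail.
# It keeps a single old copy of the log once the file passes a size limit.
rotate_log() {
    log=$1          # path to the log, e.g. $HOME/procmail/procmail.log
    max_bytes=$2    # rotate when the file is larger than this many bytes
    [ -f "$log" ] || return 0
    # wc pads its output with spaces on some systems, so strip them
    size=$(wc -c < "$log" | tr -d ' ')
    if [ "$size" -gt "$max_bytes" ]; then
        # the previous .old copy, if any, is replaced
        mv "$log" "$log.old"
    fi
}

# Example with an assumed 1 MB limit:
# rotate_log "$HOME/procmail/procmail.log" 1048576
```

Run it while fetchmail is idle so procmail is not writing to the log at that moment.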
3. Configure fetchmail. The most basic configuration is for a simple POP email account at an Internet Service Provider. Use a text editor to create the file .fetchmailrc in your home directory, and copy the following two lines into it. Your ISP account info goes in "ISPusername" and "ISPpasswd" and your ISP's mailserver name goes in popmailserver (be sure to use quotation marks exactly like this example)
# set daemon 300 # optional daemon mode - polls every 300 seconds
poll popmailserver with proto pop3 user "ISPusername" password "ISPpasswd" options stripcr
Once you've got this all working you can un-comment the line with "set daemon 300" by removing the # at the start of the line, and fetchmail will run every 5 minutes. You can consult the fetchmail page for details on configuring various types of mail accounts.
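One thing worth checking at this point: because the file holds your password, fetchmail checks its permissions and will complain if other users can read it. The sketch below writes the same two lines with owner-only permissions. The helper name write_fetchmailrc is hypothetical, and the arguments in the example are the placeholders used above.

```shell
#!/bin/sh
# Sketch only: write_fetchmailrc is a hypothetical helper.
# It writes a minimal run-control file that only its owner can read.
write_fetchmailrc() {
    rcfile=$1; server=$2; user=$3; pass=$4
    (
        umask 077   # a newly created file gets no group/other access
        printf '%s\n' \
            '# set daemon 300 # optional daemon mode - polls every 300 seconds' \
            "poll $server with proto pop3 user \"$user\" password \"$pass\" options stripcr" \
            > "$rcfile"
    )
    chmod 600 "$rcfile"   # also tightens a file that already existed
}

# Example, using the placeholder values from this step:
# write_fetchmailrc "$HOME/.fetchmailrc" popmailserver ISPusername ISPpasswd
```

If you created the file in a text editor instead, running chmod 600 ~/.fetchmailrc in the Terminal has the same effect.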
4. Set Up Mail.app. Now we need to set up a mail handling directory. Open a Terminal window and type these commands:
mkdir ~/Library/Mail/maildir
chmod 700 ~/Library/Mail/maildir
Your maildir really could be any name instead of maildir, as long as it has
no spaces in the name. As far as I can tell, Mail.app doesn't store anything
in this folder but it does need to be present to filter the mail.
Now open Mail.app and set up a new account with the type set to Unix Account.
In the Account Information tab, put in your information just like you did
when setting up your ISP's email: enter your regular ISP email address and
outgoing mail info (SMTP host, account info), but leave the middle panel with
Hostname/Username/Password blank. Now click on Account Options. Set the Incoming Mail
Directory to /var/mail and set the Account Directory to ~/Library/Mail/maildir
and make sure the two check boxes are checked. Hit OK and you're configured.
I hope you get it right the first time: Unix mail account prefs can't be edited,
so if you made an error, you'll have to delete the mail account and try again.
5. Test it. Let's see if fetchmail and procmail can download mail correctly before we put the spam filters in place. Go to the Terminal and just type "fetchmail" and you should see a brief message to indicate your mail was downloaded. Or not. If you didn't have any mail to fetch, send yourself a test email and try again. Then run Mail.app, execute the command Get Mail, and your incoming mail should now appear. Manually retrieving your mail is now a two-step process: first run fetchmail in the Terminal, then Get Mail in Mail.app. You can leave Mail.app open and automatically checking for mail, but it will only see new mail after fetchmail has run and retrieved it. If you use a dialup modem you will want to run fetchmail manually, only when you're connected to the net. If you have a full-time connection, you'll want to set fetchmail up to launch and run automatically, as in step 8. Be sure to get your new mail system working and well tested before you try the next step, spam filtering.
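If you want to confirm that fetchmail and procmail really delivered something before you switch to Mail.app, you can count the messages waiting in the Unix mailbox. This is a rough sketch: count_messages is a name I made up, and it assumes the usual mbox convention that every message starts with a line beginning "From " (a body line that starts the same way and was not escaped would be counted too).

```shell
#!/bin/sh
# Sketch only: count_messages is a hypothetical helper.
# It counts the "From " separator lines that start each message in an mbox file.
count_messages() {
    mbox=$1
    [ -f "$mbox" ] || { echo 0; return 0; }
    # grep -c prints 0 but exits non-zero when nothing matches; ignore that
    grep -c '^From ' "$mbox" || true
}

# Example, run after fetchmail and before Get Mail in Mail.app:
# count_messages /var/mail/username
```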
6. Install SpamBouncer. Go to the SpamBouncer site and download the new
version 1.5. It's in tar.z format, but Stuffit should expand it nicely.
Rename the expanded "sb folder" to "spambouncer" and drop
it into your ~/procmail folder. Read the website for SpamBouncer configuration
details; it is important to understand how the system and its filtering
options work. Fortunately, all these configuration options are
contained in simple text files. I'll walk you through a basic configuration.
We'll need to set up a new ~/.procmailrc and make a few decisions about how
we want to filter mail. Here's a sample .procmailrc that should work as we're
configuring things in this document:
LOGFILE=/Users/username/procmail/procmail.log
DEFAULT=/var/mail/username
FORMAIL=/usr/bin/formail
SBDIR=/Users/username/procmail/spambouncer
ADMINFOLDER=${DEFAULT}
ALTFROM=user@domain.com
BLOCKFOLDER=/Users/username/procmail/block
BLOCKREPLY=SILENT
BULKFOLDER=/Users/username/procmail/bulk
BYPASSWD=secretpassword
CHINESE=no
DATE=/bin/date
DEBUG=no
DORKSLCHECK=no
DSBLCHECK=no
DSBLMULTICHECK=no
DULCHECK=no
FILTER=no
FREEMAIL=INTERNAL
FTSGDIALCHECK=no
FTSGIGNORECHECK=no
FTSGMULTICHECK=no
FTSGOPTOUTCHECK=no
FTSGOTHERCHECK=no
FTSGRSSCHECK=no
FTSGSRCCHECK=no
FTSGWEBFORMCHECK=no
GARBLEDCHARSET=yes
GLOBALNOBOUNCE=NONE
GREP=/usr/bin/fgrep
JAPANESE=yes
KOREAN=no
LEAN=no
LEGITLISTS=/Users/username/procmail/spambouncer/legitlists
MONKEYFORMMAILCHECK=no
MONKEYPROXYCHECK=no
MYEMAIL=/Users/username/procmail/myemail
NOBOUNCE=/Users/username/procmail/nobounce
NOLOOP=${ALTFROM}
NSLOOKUP=/usr/bin/nslookup
ORDBCHECK=no
OSDIALCHECK=no
OSHAVENCHECK=no
OSOOLCHECK=no
OSOPSCHECK=no
OSORCHECK=no
OSSHRCHECK=no
OSSPAMCHECK=no
PATTERNMATCHING=SILENT
PROXYSOCKS=no
RBLCHECK=no
RFCIPWHOISCHECK=no
RM=/bin/rm
RSLCHECK=no
RSSCHECK=no
RUSSIAN=no
SENDMAIL=/dev/null
SPAMFOLDER=/Users/username/procmail/spam
SPAMCOPCHECK=no
SPAMHAUSORGCHECK=no
SPAMREPLY=SILENT
SPEWSCHECK=no
TEST=/bin/test
THISISP=yourisp.com
TURKISH=no
TWOMBITCHECK=no
VIRUSFOLDER=/Users/username/procmail/spam
INCLUDERC=/Users/username/procmail/spambouncer/sb.rc
This sample file contains all the necessary preferences to start spam filtering,
as long as you put your MacOS X account name in place of username
and your ISP's domain name in THISISP. I've set a few options. I especially
hate spam in foreign languages I can't read, so I've set it to remove Chinese,
Turkish, etc., but I do get email in Japanese so I set the filters to permit
that encoding. You should also set a unique secret password in BYPASSWD, so that
if someone's mail cannot get through the filters, they can just put the password
in the Subject: field and it will always pass. SpamBouncer has a feature to automatically
send complaint letters about spam, but I think it's better not to complain;
you just get put on spammers' revenge lists. So I've set SPAMREPLY=SILENT
so it will just filter spam and not send any email.
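If you'd rather not edit every username by hand, the substitution can be scripted. This is only a sketch under a couple of assumptions: you have saved the sample above to a file (I call it procmailrc.template here, a made-up name), and your account name contains no characters that are special to sed, such as / or &. The helper name fill_username is also mine.

```shell
#!/bin/sh
# Sketch only: fill_username is a hypothetical helper.
# It prints the template with every "username" replaced by the account name.
fill_username() {
    template=$1   # file holding the sample configuration
    account=$2    # your MacOS X account name (no / or & characters)
    sed "s/username/$account/g" "$template"
}

# Example, assuming the sample was saved as procmailrc.template:
# fill_username procmailrc.template "$(id -un)" > "$HOME/.procmailrc"
```

You still need to set ALTFROM, BYPASSWD and THISISP yourself, since those don't contain the word username.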
Now we have to tune the filters a little. Use a text editor to create the
file ~/procmail/myemail and put in your own email address (remember to save
the text file in Unix format). If you have any domains that you want to always
pass unfiltered, put the domain names in the file ~/procmail/nobounce like
this:
importantdomain.com
letmethrough.com
Do not put your own domain in this file, or spam will slip through the filters.
SpamBouncer will filter almost any message that looks like it was sent automatically,
including any legitimate mailing lists you subscribe to. You will have to
explicitly grant permission for any mailing list you want to pass through
the filters. Fortunately, this is easy. If you subscribe to any email lists,
add the lists' originating email addresses, each address on a new line, in
the file ~/procmail/spambouncer/legitlists. Now we are finished configuring
our filters! Each time fetchmail pulls mail from the remote mail server to
the local machine, procmail automatically processes it, and puts it where
Mail.app will find it.
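All of the text files in this step need Unix line breaks, and an editor set to classic Mac or DOS line endings will quietly produce files that procmail can't read properly. Here's a rough sketch of a converter, in case you need to fix a file after the fact. The name to_unix_newlines is hypothetical, and the source and destination must be different files.

```shell
#!/bin/sh
# Sketch only: to_unix_newlines is a hypothetical helper.
# It copies src to dst with classic Mac (CR) or DOS (CRLF) line endings
# converted to Unix (LF). src and dst must not be the same file.
to_unix_newlines() {
    src=$1; dst=$2
    # Count the LF characters; LC_ALL=C keeps tr byte-oriented.
    lf_count=$(LC_ALL=C tr -dc '\n' < "$src" | wc -c | tr -d ' ')
    if [ "$lf_count" -gt 0 ]; then
        # File already has LFs: treat any CR as the DOS half of CRLF and drop it
        LC_ALL=C tr -d '\r' < "$src" > "$dst"
    else
        # No LF at all: treat every CR as a classic Mac line ending
        LC_ALL=C tr '\r' '\n' < "$src" > "$dst"
    fi
}

# Example:
# to_unix_newlines ~/procmail/myemail ~/procmail/myemail.unix
# mv ~/procmail/myemail.unix ~/procmail/myemail
```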
7. Monitor Spam Filtering. Even with these simple settings, SpamBouncer
is very aggressive. You will need to keep an eye on procmail.log and your spam
dump folders to see if any legitimate email is being improperly filtered,
then tune the filter options. I recommend you use either FileMonitor
or BetterConsole
to monitor your ~/procmail/procmail.log file. I like to set BetterConsole
to pop up for 10 seconds each time procmail.log changes, so I can watch it pop
up when it's killing spam. It is almost more fun watching the spam die than
getting real email!
The real action is going to be in your ~/procmail/spam and ~/procmail/block
folders, where the spam and suspected spam messages are dumped.
Anything in the spam folder is definitely spam, so you can periodically dump
it all in the trash. Or you could delete the spam as it arrives by changing
the SPAMFOLDER line in your ~/.procmailrc file to:
SPAMFOLDER=/dev/null
The block folder contains items that SpamBouncer thinks are probably spam, but it may contain false positives. This configuration is designed to let you check your logs and see if anything important was mistakenly filtered; you can then pull it out of the block folder and send it back to your mailbox. If you want to rescue a message from the block folder, the procedure is fairly simple. As an example, if your message is stored as ~/procmail/block/msg.yoqC you would type at the Terminal:
mv ~/procmail/block/msg.yoqC /var/mail/username
You should be careful to run this command only when fetchmail and procmail are
inactive and not processing incoming mail. You're writing directly to the
incoming mail file without checking the lockfile to see if the file is in
use, and mv replaces the mailbox file rather than adding to it. This is a
bad thing: you could overwrite incoming mail if you run this command at just
the wrong time, or while the mailbox still holds mail that Mail.app has not
yet collected. So be careful. The proper Unix way to do this
would be to kill the fetchmail process and stop incoming mail before issuing
the mv command. Instructions on killing and restarting fetchmail are included
in the next step.
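For the cautious, here is a sketch of a gentler rescue that appends the message to the mailbox instead of replacing it, and takes a lock first. Several things here are assumptions on my part: rescue_message is a made-up name, the saved file is assumed to begin with the usual mbox "From " line, the lock is assumed to be a file named after the mailbox with .lock added, and you need permission to create files next to the mailbox. Treat it as an illustration, not a tested replacement for the mv command.

```shell
#!/bin/sh
# Sketch only: rescue_message is a hypothetical helper.
# It appends one saved message to an mbox file while holding "<mbox>.lock".
rescue_message() {
    msg=$1; mbox=$2; lock="$mbox.lock"
    # Refuse files that do not start with an mbox "From " line.
    head -n 1 "$msg" | grep -q '^From ' || {
        echo "not an mbox message: $msg" >&2
        return 1
    }
    # With noclobber (set -C) the redirect fails if the lock already exists,
    # so creating the lock and checking for it happen in one step.
    tries=0
    until ( set -C; : > "$lock" ) 2>/dev/null; do
        tries=$((tries + 1))
        [ "$tries" -ge 10 ] && { echo "mailbox is locked: $lock" >&2; return 1; }
        sleep 1
    done
    # Append the message plus a blank separator line, then remove the original.
    { cat "$msg"; echo; } >> "$mbox" && rm -f "$msg"
    status=$?
    rm -f "$lock"
    return $status
}

# Example:
# rescue_message ~/procmail/block/msg.yoqC /var/mail/username
```

Because it appends, anything already waiting in the mailbox stays there.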
I've never seen anything appear in the Bulk folder. Periodically you should
empty out these folders. The cool Unixy way to do it would be to set up a
cron job to empty them once a week. You can use the utility Cronnix
to set this up, but be careful not to disrupt the existing system scripts.
I prefer to check the block folder for legitimate email before I dump it,
so I still empty this folder manually.
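If you do want to try the cron approach, a script along these lines could be what the cron job runs. It is a sketch, not something I have scheduled myself: purge_old is a hypothetical name, the 7-day age and the script path in the crontab comment are made-up examples, and it only removes files older than the given number of days so that recent arrivals are left for you to inspect.

```shell
#!/bin/sh
# Sketch only: purge_old is a hypothetical helper.
# It deletes regular files in a folder that were last modified more than
# the given number of days ago.
purge_old() {
    dir=$1; days=$2
    [ -d "$dir" ] || return 0
    find "$dir" -type f -mtime +"$days" -exec rm -f {} \;
}

# Example: clear week-old spam but leave the block folder for manual review.
# purge_old "$HOME/procmail/spam" 7
#
# A matching crontab line (Sundays at 04:00; the script path is hypothetical):
# 0 4 * * 0 /bin/sh /Users/username/procmail/purge.sh
```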
Note that there is a bug in procmail on MacOS X, the notorious HFS+ upper/lower
case filename bug. Procmail stores each spam with a unique filename, but HFS+
treats names that differ only in case as the same file. If "msg.tgPB"
already exists in your block folder, procmail will freak out when it tries
to save the file "msg.tgpb" and it will pass the message to your
mailbox, even though SpamBouncer tried to dump it. The solution is to empty
your block folder often. This problem doesn't happen often, but if a spam
ever gets through the filters, this is usually why.
8. Optional: Set fetchmail to Run as a Daemon at Startup. If you have a cable modem or other permanent network connection, you'll want to set fetchmail to run at login and check mail automatically. Here's a clever trick to run a command line script at startup. Edit the file ~/.fetchmailrc to uncomment the "set daemon 300" line as described in Step 3. Now when fetchmail runs, instead of quitting it will continue to run in the background in daemon mode, checking mail every 300 seconds. Now create a text file (in Unix line-break format again) called "fetchmail.command" with just this single word:
fetchmail
A good place to save this would be in /Applications. Now go to Login in the MacOS
X System Preferences and drag the fetchmail.command file into
the Login Items window; it will automatically run in a Terminal window each
time you log in. Be sure to set your Mail.app preferences to automatically
check email every 5 minutes, and your filtered email will now automatically
appear in your inbox as it arrives. This script would also be a really good
place to put commands to set up SSH tunnelling, so you can use an encrypted
SSH link to check your email. Be aware that fetchmail sends your password in
plaintext, just like a regular unencrypted mail
program does. Fetchmail can grab mail through an SSH tunnel, so that your passwords and
mail are sent through the net encrypted, which is a very good thing. Unfortunately
SSH tunnelling is beyond the scope of this document (in other words, I haven't
got it working yet).
If you ever want to check if fetchmail is still running in daemon mode, just
type "fetchmail" in a Terminal window. This will make it grab your mail
immediately, or restart it if it is not running. Once in a rare while, fetchmail
will continue to run but stop retrieving mail, due to a problem with the
lockfile. If this happens, you could just reboot MacOS X, but there is a proper
Unix way to handle problems like this. Run this command in the Terminal:
ps -x | grep fetchmail
You'll get a response something like:
851 ?? Ss 0:02.88 fetchmail
2545 std R+ 0:00.00 grep fetchmail
This shows us the Process ID of fetchmail (ignore that second line, it's the grep command we just made, searching for the word "fetchmail"). That first number is the PID, so take that number (it will be different each time) then kill that process and restart fetchmail:
kill 851
fetchmail
You may get a message about removing a stale lockfile, and this is a good thing. Fetchmail is now running in the background again.
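If you find yourself doing this often, picking the PID out of the ps listing can be scripted. The sketch below assumes the ps -x column layout shown above, where the command name is the fifth field; pids_of is a name I made up. Matching on the fifth field also skips the "grep fetchmail" line automatically, because its command is grep.

```shell
#!/bin/sh
# Sketch only: pids_of is a hypothetical helper.
# It reads a "ps -x" style listing on standard input and prints the PID of
# every process whose command (fifth column) is exactly the given name.
pids_of() {
    awk -v name="$1" '$5 == name { print $1 }'
}

# Example: kill every running fetchmail, then start a fresh one.
# for pid in $(ps -x | pids_of fetchmail); do kill "$pid"; done
# fetchmail
```

Fetchmail also has a --quit option that asks a running daemon to stop, which may be simpler if your version supports it.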
This is everything you need to know to set up basic spam filtration and keep it running. For further information, please consult my new document, Advanced Anti-Spamming Techniques for MacOS X.
Disclaimer: Please don't email asking me for Unix support or help
if you can't get this to work. Unix configuration can be tricky. I am presenting
this example as a guideline, but if you're going to do this, you will have
to Do It Yourself, in the true Unix spirit. It isn't that hard; I patched
this all together just by Reading The Fine Manuals for SpamBouncer and procmail.
This document has been successfully tested and I believe it's the simplest,
most correct procedure for implementing spam filtration. If you can offer
any corrections or improvements, please contact me and I will update this
page.
A Note of Thanks: There are too many people to thank individually for
their contributions, testing, and improvements to this document, but you know
who you are and you have my gratitude. On behalf of myself and my readers,
I have delivered special thanks to Catherine Hampton for her continued efforts
programming and maintaining SpamBouncer, and for giving it to us as Free Software.