Free Spam Filtering for MacOS X: How To Do It - Version 1.5

Update August 28, 2002: This document is obsolete. It does not work with MacOS X 10.2, because Apple has removed support for Unix mailboxes from Mail.app.
Update July 14, 2002: SpamBouncer was updated to v1.5 on July 5, 2002. This updated document reflects the changes to the setup instructions and includes many improvements and suggestions. If you want the old obsolete version of this document, it is available here. I have also released a new document, Advanced Anti-Spamming Techniques for MacOS X. Please visit my Home Page for many other interesting MacOS X experiments, including live QuickTime video streaming.
Update July 29, 2002: I made some minor but important corrections and simplified ~/.fetchmailrc.



Spam is reaching unbearable levels and people are desperate for a solution. MacOS X includes traditional Unix mail handling software that is perfect for filtering spam. It's already installed and merely needs to be configured and activated. Because Mail.app can access Unix internal mailboxes, we can bring the power of Unix mail processing and filtering to this standard MacOS X application.
Unix mail utilities like fetchmail and procmail are intended for single-user workstations that don't need to run a full mail exchanger like Sendmail. You can configure fetchmail to grab your email from one or more ISP accounts, procmail will filter it with scripts from SpamBouncer, and then you can use MacOS X Mail.app to grab your mail from a filtered Unix mailbox. This works especially well if you have a persistent network connection like a cable modem: you can let the Unix system download and filter your mail every 5 minutes, and Mail.app will announce new spam-filtered email as it arrives. No email is ever deleted. Spam is merely diverted into a separate folder, so you can always retrieve any legitimate email that was filtered out by accident.
Full credit goes to this anonymous tip from MacOS X Hints, but it has no instructions for spam filtering, so I'll put this all together in one How-To, with a few corrections and improvements.

1. Create Your Unix Mailbox. Open a Terminal window and type the following commands. In all these code examples, wherever you see the text username, substitute your MacOS X account name.

sudo touch /var/mail/username
sudo chmod 0700 /var/mail/username
sudo chown username /var/mail/username

Enter the Administrator password when prompted.
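
Before moving on, it may be worth confirming that the mailbox really exists and that your account can read and write it. The following is only a sketch: the function name check_mailbox is made up for illustration and is not part of any tool mentioned here.

```shell
# check_mailbox FILE: succeed only if FILE is a regular file that the
# current user can both read and write. Prints a short reason either way.
# (check_mailbox is a hypothetical helper, not a standard command.)
check_mailbox() {
    if [ ! -f "$1" ]; then
        echo "missing: $1"
        return 1
    fi
    if [ ! -r "$1" ] || [ ! -w "$1" ]; then
        echo "not readable and writable by you: $1"
        return 1
    fi
    echo "ok: $1"
}

# Typical use (substitute your MacOS X account name):
#   check_mailbox /var/mail/username
```

If it reports anything other than "ok", repeat the chmod and chown commands above.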

2. Configure procmail. First, create a procmail directory in your home directory. This is where the filter logs and spam scripts will reside, and where filtered spam will be dumped. Let's set up the spam-dump folders too. Run these commands in the Terminal:

mkdir ~/procmail
mkdir ~/procmail/spam
mkdir ~/procmail/bulk
mkdir ~/procmail/block

Create a .procmailrc file in your home directory. I used BBEdit, but you could use Pico or any other text editor, as long as you save in Unix text file format, which ends each line with a linefeed character rather than the classic Mac carriage return. Note that this is ".procmailrc" with a period in front, so the file will be invisible in the Finder once you've created it. I like BBEdit's "Open Hidden" feature to edit dotfiles like this. Copy this text into the file ~/.procmailrc

SHELL=/bin/sh
DEFAULT=/var/mail/username
LOGFILE=/Users/username/procmail/procmail.log
:0:
$DEFAULT

We really should put the logfile in a more standard Unix location like /var/log/procmail.log but root owns that folder and we're trying to keep this simple. If you keep it in your home directory, you can easily monitor it, but it won't be automatically cleaned out and rotated with the rest of the system logfiles. So remember to delete the logfile once in a while or it might grow rather large over time. If you delete the file, procmail will start a fresh new log.
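
If you would rather not remember to delete the log by hand, a small helper along these lines could do it. This is a sketch under assumptions: the name trim_log and the size limit are invented for illustration, and it keeps one old copy instead of deleting outright.

```shell
# trim_log FILE MAX_BYTES: if FILE is larger than MAX_BYTES, move it aside
# as FILE.old (replacing any earlier FILE.old). procmail then starts a
# fresh log the next time it runs. (trim_log is a hypothetical helper.)
trim_log() {
    [ -f "$1" ] || return 0
    # The arithmetic expansion strips the padding some wc versions print.
    size=$(( $(wc -c < "$1") ))
    if [ "$size" -gt "$2" ]; then
        mv -f "$1" "$1.old"
    fi
}

# Typical use, with an arbitrary 1 MB limit:
#   trim_log /Users/username/procmail/procmail.log 1000000
```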

3. Configure fetchmail. The most basic configuration is for a simple POP email account at an Internet Service Provider. Use a text editor to create the file .fetchmailrc in your home directory, and copy the following two lines into it. Your ISP account info goes in "ISPusername" and "ISPpasswd" and your ISP's mailserver name goes in popmailserver (be sure to use quotation marks exactly like this example).

# set daemon 300        # optional daemon mode - polls every 300 seconds
poll popmailserver with proto pop3 user "ISPusername" password "ISPpasswd" options stripcr

Once you've got this all working, you can un-comment the line with "set daemon 300" by removing the # at the start of the line, and fetchmail will then check for mail every 5 minutes. You can consult the fetchmail page for details on configuring various types of mail accounts.
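
One detail worth checking, since the file holds your ISP password: fetchmail normally refuses to run if ~/.fetchmailrc has permissions looser than 0710. The sketch below tightens the file to owner-only access; the function names are made up for illustration.

```shell
# secure_rcfile FILE: restrict FILE to owner read/write only (mode 0600).
# (secure_rcfile and rcfile_mode are hypothetical helper names.)
secure_rcfile() {
    chmod 600 "$1"
}

# rcfile_mode FILE: print the ten-character permission string from ls.
rcfile_mode() {
    ls -ld "$1" | cut -c1-10
}

# Typical use:
#   secure_rcfile ~/.fetchmailrc
#   rcfile_mode ~/.fetchmailrc
```

After running secure_rcfile, the mode string should show read and write for the owner and nothing for anyone else.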

4. Set Up Mail.app. Now we need to set up a mail handling directory. Open a Terminal window and type these commands:

mkdir ~/Library/Mail/maildir
chmod 700 ~/Library/Mail/maildir

Your maildir could have any name instead of maildir, as long as the name contains no spaces. As far as I can tell, Mail.app doesn't store anything in this folder, but it does need to be present to filter the mail.
Now open Mail.app and set up a new account, with the type set to Unix Account. In the Account Information tab, put in your information just like you did when setting up your ISP's email: set up your regular ISP email address and outgoing mail info (SMTP host, account info), but leave the middle panel with Hostname/Username/Password blank. Now click on Account Options. Set the Incoming Mail Directory to /var/mail, set the Account Directory to ~/Library/Mail/maildir, and make sure the two check boxes are checked. Hit OK and you're configured. I hope you get it right the first time, because Unix mail account prefs can't be edited. If you made an error, you'll have to delete the mail account and try again.

5. Test it. Let's see if fetchmail and procmail can download mail correctly before we put the spam filters in place. Go to the Terminal and just type "fetchmail" and you should see a brief message to indicate your mail was downloaded. Or not. If you didn't have any mail to fetch, send yourself a test email and try again. Then run Mail.app, execute the command Get Mail, and your incoming mail should now appear. Manually retrieving your mail is now a two step process: first run fetchmail in the Terminal, then get mail in Mail.app. You can leave Mail.app open and automatically checking for mail, but it will only see new mail when you run fetchmail and new mail is retrieved. If you use a dialup modem, you will want to manually run fetchmail only when you're connected to the net. If you have a full-time connection to the net, you'll want to set it up to launch and run automatically, as in step 8. Be sure to get your new mail system working and well tested before you try the next step, spam filtering.
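
To see whether fetchmail actually delivered anything before you switch to Mail.app, you can count the messages waiting in the Unix mailbox. This is a sketch that assumes the usual mbox convention, where every message starts with a line beginning "From " followed by a space; the function name is invented for illustration.

```shell
# count_messages FILE: print how many messages an mbox-style FILE holds,
# by counting the "From " separator lines that start each message.
# Body lines that begin with "From " are normally stored as ">From ",
# so they are not counted. (count_messages is a hypothetical helper.)
count_messages() {
    grep -c '^From ' "$1"
}

# Typical use:
#   count_messages /var/mail/username
```

Once Mail.app has collected the mail, the count should drop again.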

6. Install SpamBouncer. Go to the SpamBouncer site and download the new version 1.5. It's in tar.z format, but StuffIt should expand it nicely. Rename the expanded "sb" folder to "spambouncer" and drop it into your ~/procmail folder. Read the website for SpamBouncer configuration details. It is important to understand how the system and its filtering options work. Fortunately, all these configuration options are contained in simple text files. I'll walk you through a basic configuration.
We'll need to set up a new ~/.procmailrc and make a few decisions about how we want to filter mail. Here's a sample .procmailrc that should work as we're configuring things in this document:

LOGFILE=/Users/username/procmail/procmail.log
DEFAULT=/var/mail/username
FORMAIL=/usr/bin/formail
SBDIR=/Users/username/procmail/spambouncer
ADMINFOLDER=${DEFAULT}
ALTFROM=user@domain.com
BLOCKFOLDER=/Users/username/procmail/block
BLOCKREPLY=SILENT
BULKFOLDER=/Users/username/procmail/bulk
BYPASSWD=secretpassword
CHINESE=no
DATE=/bin/date
DEBUG=no
DORKSLCHECK=no
DSBLCHECK=no
DSBLMULTICHECK=no
DULCHECK=no
FILTER=no
FREEMAIL=INTERNAL
FTSGDIALCHECK=no
FTSGIGNORECHECK=no
FTSGMULTICHECK=no
FTSGOPTOUTCHECK=no
FTSGOTHERCHECK=no
FTSGRSSCHECK=no
FTSGSRCCHECK=no
FTSGWEBFORMCHECK=no
GARBLEDCHARSET=yes
GLOBALNOBOUNCE=NONE
GREP=/usr/bin/fgrep
JAPANESE=yes
KOREAN=no
LEAN=no
LEGITLISTS=/Users/username/procmail/spambouncer/legitlists
MONKEYFORMMAILCHECK=no
MONKEYPROXYCHECK=no
MYEMAIL=/Users/username/procmail/myemail
NOBOUNCE=/Users/username/procmail/nobounce
NOLOOP=${ALTFROM}
NSLOOKUP=/usr/bin/nslookup
ORDBCHECK=no
OSDIALCHECK=no
OSHAVENCHECK=no
OSOOLCHECK=no
OSOPSCHECK=no
OSORCHECK=no
OSSHRCHECK=no
OSSPAMCHECK=no
PATTERNMATCHING=SILENT
PROXYSOCKS=no
RBLCHECK=no
RFCIPWHOISCHECK=no
RM=/bin/rm
RSLCHECK=no
RSSCHECK=no
RUSSIAN=no
SENDMAIL=/dev/null
SPAMFOLDER=/Users/username/procmail/spam
SPAMCOPCHECK=no
SPAMHAUSORGCHECK=no
SPAMREPLY=SILENT
SPEWSCHECK=no
TEST=/bin/test
THISISP=yourisp.com
TURKISH=no
TWOMBITCHECK=no
VIRUSFOLDER=/Users/username/procmail/spam
INCLUDERC=/Users/username/procmail/spambouncer/sb.rc

This sample file contains all the necessary preferences to start spam filtering, as long as you put your MacOS X account name in place of username and your ISP's domain name in THISISP. I've set a few options. I especially hate spam in foreign languages I can't read, so I've set it to filter Chinese, Turkish, etc., but I do get email in Japanese, so I set the filters to permit that encoding. You should also set a unique secret password in BYPASSWD, so if someone cannot get through the filters, they can just put the password in the Subject: field and their mail will always pass. SpamBouncer has a feature to automatically send complaint letters about spam, but I think it's better not to complain, since you just get put on spammers' revenge lists. So I've set SPAMREPLY=SILENT, which just filters spam and does not send any email.
Now we have to tune the filters a little. Use a text editor to create the file ~/procmail/myemail and put in your own email address (remember to save the text file in Unix format). If you have any domains that you want to always pass unfiltered, put the domain names in the file ~/procmail/nobounce like this:

importantdomain.com
letmethrough.com

Do not put your own domain in this file, or spam will slip through the filters.
SpamBouncer will filter almost any message that looks like it was sent automatically, including any legitimate mailing lists you subscribe to. You will have to explicitly grant permission for any mailing list you want to pass through the filters. Fortunately, this is easy. If you subscribe to any email lists, add the lists' originating email addresses, each address on a new line, in the file ~/procmail/spambouncer/legitlists. Now we are finished configuring our filters! Each time fetchmail pulls mail from the remote mail server to the local machine, procmail automatically processes it, and puts it where Mail.app will find it.
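
Since nobounce and legitlists are plain text files with one entry per line, adding entries from the Terminal is easy to script. The helper below is a sketch, not part of SpamBouncer: the name add_entry and the example address are made up for illustration. It skips entries that are already present, so the files don't fill up with duplicates.

```shell
# add_entry FILE ENTRY: append ENTRY to FILE on its own line, unless an
# identical line is already there. Creates FILE if it does not exist.
# (add_entry is a hypothetical helper, not part of SpamBouncer.)
add_entry() {
    touch "$1"
    # -x matches whole lines only, -F treats ENTRY as plain text.
    if ! grep -qxF -- "$2" "$1"; then
        printf '%s\n' "$2" >> "$1"
    fi
}

# Typical use (the addresses are placeholders):
#   add_entry ~/procmail/nobounce importantdomain.com
#   add_entry ~/procmail/spambouncer/legitlists somelist@lists.example.com
```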

7. Monitor Spam Filtering. Even with these simple settings, SpamBouncer is very aggressive. You will need to keep an eye on procmail.log and your spam dump folders to see if any legitimate email is being improperly filtered, then tune the filter options. I recommend you use either FileMonitor or BetterConsole to monitor your ~/procmail/procmail.log file. I like to set BetterConsole to pop up for 10 seconds each time procmail.log changes, so I can watch it pop up when it's killing spam. It is almost more fun watching the spam die than getting real email!
The real action is going to be in your ~/procmail/spam and ~/procmail/block folders, where the spam and suspected spam messages are dumped. Anything in the spam folder is definitely spam, so you can periodically dump it all in the trash. Or you could delete the spam as it arrives by changing the SPAMFOLDER line in your ~/.procmailrc file to:

SPAMFOLDER=/dev/null

The block folder contains items that SpamBouncer thinks are probably spam, but it may contain false positives. This configuration is designed to let you check your logs and see if anything important was mistakenly filtered, then pull it out of the block folder and send it back to your mailbox. The procedure is fairly simple. As an example, if your message is stored as ~/procmail/block/msg.yoqC you would type at the Terminal:

cat ~/procmail/block/msg.yoqC >> /var/mail/username && rm ~/procmail/block/msg.yoqC

This appends the message to the end of your mailbox and then removes it from the block folder. Note the >> which appends: /var/mail/username is a single file holding all your waiting mail, so a plain mv onto it would replace the whole mailbox with the one rescued message. You should be careful to run this command when fetchmail and procmail are inactive and not processing incoming mail. You're writing directly to the incoming mail file without checking the lockfile to see if the file is in use. This is a bad thing, because you could corrupt incoming mail if you run this command at just the wrong time. So be careful. The proper Unix way to do this would be to kill the fetchmail process and stop incoming mail before issuing the command. Instructions on killing and restarting fetchmail are included in the next step.
I've never seen anything appear in the bulk folder. Periodically you should empty out these folders. The cool Unixy way to do it would be to set up a cron job to empty these folders once a week. You can use the utility CronniX to set this up, but be careful not to disrupt the existing system scripts. I prefer to check the block folder for legitimate email before I dump it, so I still empty this folder manually.
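
For the cron approach, a cautious variant is to delete only messages that have sat in a folder for a while, so there is still time to check for false positives. This is a sketch under assumptions: the function name purge_old, the 7-day age, and the schedule in the comment are all illustrative choices.

```shell
# purge_old DIR DAYS: delete procmail message files (msg.*) in DIR that
# were last modified more than DAYS days ago. find counts age in whole
# days, so "7" removes files at least 8 days old.
# (purge_old is a hypothetical helper.)
purge_old() {
    find "$1" -type f -name 'msg.*' -mtime +"$2" -exec rm -f {} \;
}

# Typical use:
#   purge_old /Users/username/procmail/spam 7
#
# A weekly crontab entry doing the same thing directly (Sundays at 3 am;
# the schedule is only an example):
#   0 3 * * 0 find /Users/username/procmail/spam -type f -name 'msg.*' -mtime +7 -exec rm -f {} \;
```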
Note that procmail runs into the notorious MacOS X HFS+ upper/lower case filename problem. Procmail stores each spam with a unique filename, but HFS+ treats names that differ only in case as the same file. If "msg.tgPB" already exists in your block folder, procmail will freak out when it tries to save the file "msg.tgpb" and it will pass the message to your mailbox, even though SpamBouncer tried to dump it. The solution is to empty your block folder often. This problem doesn't happen often, but if a spam ever gets through the filters, this is usually why.
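
To make the collision concrete, here is a sketch of the check that effectively fails on HFS+: does any existing file in the folder match a new name when case is ignored? The function name would_collide is invented for illustration and is not something procmail provides.

```shell
# would_collide DIR NAME: succeed if DIR already holds a file whose name
# equals NAME when upper and lower case are treated as the same.
# (would_collide is a hypothetical helper.)
would_collide() {
    # -i ignores case, -x matches the whole name, -F treats NAME as text.
    ls "$1" | grep -qixF -- "$2"
}

# Typical use:
#   would_collide ~/procmail/block msg.tgpb && echo "name clash"
```

The fewer files sit in the folder, the less likely a clash becomes, which is why emptying it often helps.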

8. Optional: Set fetchmail to Run as a Daemon at Startup. If you have a cable modem or other permanent network connection, you'll want to set fetchmail to run at login and check mail automatically. Here's a clever trick to run a command line script at startup. Edit the file ~/.fetchmailrc to uncomment the "set daemon 300" line as described in Step 3. Now fetchmail will run and instead of quitting, it will continue to run in the background in daemon mode, checking mail every 300 seconds. Now create a text file (in Unix line-break format again) called "fetchmail.command" with just this single word:

fetchmail

A good place to save this would be in /Applications. Now go to the Login pane of the MacOS X System Preferences and drag the fetchmail.command file into the Login Items window. It will automatically run in a Terminal window each time you log in. Be sure to set your Mail.app preferences to automatically check email every 5 minutes, and your filtered email will now automatically appear in your inbox as it arrives. This script would also be a really good place to put commands to set up SSH tunnelling, so you can use an encrypted SSH link to check your email. Be aware that fetchmail sends your password in plaintext, just like a regular unencrypted mail program does. Fetchmail can grab mail through an SSH tunnel, so that your passwords and mail are sent through the net encrypted, which is a very good thing. Unfortunately SSH tunnelling is beyond the scope of this document (in other words, I haven't got it working yet).
If you ever want to check if fetchmail is still running in daemon mode, just type "fetchmail" in a Terminal window. It will go grab your mail immediately, or restart the daemon if it is not running. Once in a rare while, fetchmail will continue to run but stop retrieving mail, due to a problem with the lockfile. If this happens, you could just reboot MacOS X, but there is a proper Unix way to handle problems like this. Run this command in the Terminal:

ps -x | grep fetchmail

You'll get a response something like:

  851  ??  Ss     0:02.88 fetchmail
 2545 std  R+     0:00.00 grep fetchmail

This shows us the Process ID of fetchmail (ignore that second line, it's the grep command we just made, searching for the word "fetchmail"). That first number is the PID, so take that number (it will be different each time) then kill that process and restart fetchmail:

kill 851
fetchmail

You may get a message about removing a stale lockfile, and this is a good thing. Fetchmail is now running in the background again.
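
If you do this often, picking the PID out by eye gets tedious. The sketch below pulls it out of the ps listing automatically and skips the grep line. It assumes the five-column layout shown above (PID, terminal, state, time, command), and the function name is made up for illustration.

```shell
# fetchmail_pids: read "ps -x" output on standard input and print the PID
# of every line whose command (fifth column) is fetchmail itself, with or
# without a leading path. Lines such as "grep fetchmail" are skipped
# because their fifth column is "grep".
# (fetchmail_pids is a hypothetical helper.)
fetchmail_pids() {
    awk '$5 ~ /(^|\/)fetchmail$/ { print $1 }'
}

# Typical use:
#   ps -x | fetchmail_pids
#   kill $(ps -x | fetchmail_pids) && fetchmail
#
# fetchmail also has a --quit option that stops a running background
# daemon, which avoids looking up the PID at all:
#   fetchmail --quit
```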

This is everything you need to know to set up basic spam filtration and keep it running. For further information, please consult my new document, Advanced Anti-Spamming Techniques for MacOS X.


Disclaimer: Please don't email asking me for Unix support or help if you can't get this to work. Unix configuration can be tricky. I am presenting this example as a guideline, but if you're going to do this, you will have to Do It Yourself, in the true Unix spirit. It isn't that hard: I patched this all together just by Reading The Fine Manuals for SpamBouncer and procmail. This document has been successfully tested and I believe it's the simplest, most correct procedure for implementing spam filtration. If you can offer any corrections or improvements, please contact me and I will update this page.

A Note of Thanks: There are too many people to thank individually for their contributions, testing, and improvements to this document, but you know who you are and you have my gratitude. On behalf of myself and my readers, I have delivered special thanks to Catherine Hampton for her continued efforts programming and maintaining SpamBouncer, and for giving it to us as Free Software.