This article describes a simple and effective technique for countering spam -- ham passwords. A ham password is a special password you ask strangers, or senders in general, to include in email they send to you. In particular, a ham password can be easily added to a subject line. If a new sender wants you to receive an email, then the sender should include one of your ham passwords to prove that they are authorized to send the email to you. This technique works especially well in conjunction with other techniques, because it shores up their weaknesses. In particular, ham passwords can counter the major weakness of heuristic spam filters, which sometimes incorrectly label ham as spam. This approach is inexpensive, requires no changes to any software code, and is simple to understand. It is also under the control of the user, rather than ceding control over email to some external entity -- a critical requirement. It works especially well when combined with other techniques that handle non-strangers.
Spam (unsolicited bulk email) is an extremely serious problem, as I discuss in my essay on stopping spam and my paper on guarded email. Manual solutions (“delete by hand the spam you get”) simply don’t work, because many people (including me) get too much spam to review every spam message we get, and as spam increases ever more people will have that problem. As a result, if my spam filter thinks a message is spam, then that message must be automatically deleted without review; and many other people have (or will have) this problem. Automatically deleting messages has the risk of unintentionally deleting important messages from strangers, so I’ve had to think of ways to compensate for this problem. Although there’s much to like about challenge-response systems (such as my guarded email approach), they can be somewhat difficult to set up and maintain on many systems (since few systems directly support them), many people object to challenges, and so much email is forged today that the challenges themselves are often sent to people who didn’t send the email in the first place.
Some extreme anti-spam measures can work, but are completely inappropriate for me and for many other people. Some people are willing to accept only email from people they’ve talked with before, or are able to keep their email addresses completely secret. Neither option is appropriate for me; for my circumstances, complete strangers must be able to contact me. Any business will have this problem; you want to get new customers! My email address is too widespread and well-known at this point to hide it, and I do not want to lose my email address. Basically, spammers are trying to steal my email address (by making it impossible to use), and I don’t want to lose use of my email address due to the actions of these thieves.
I currently use an anti-spam approach called “ham passwords” that is often not discussed in the anti-spam literature (though since I wrote this piece, that’s changing). Although I briefly describe my approach on my website (and have for a while), I thought it’d be useful to describe it in more detail so that others could use it too. I suspect that others use this approach, since it’s an obvious one, but I haven’t seen it described anywhere else before. Pointers to previous descriptions would be welcome! It’s similar to challenge-response systems, but there’s no challenge. While it’s not perfect, it definitely helps, and it really works well when combined with other anti-spam techniques. There are similar approaches (such as including password-like information in an email address), but ham passwords have some advantages as I discuss below.
The primary approach I use for countering spam is something I term a “ham password”. A ham password is a special password you ask senders (at least strangers) to include in email they send to you. In particular, you ask strangers to include this information; once you start conversing with someone, there are other ways to determine that the messages aren’t spam. Basically, think of the act of sending email as a privileged operation not permitted to just anyone: If a sender wants you to receive an email, then the sender should send a ham password to prove that they are authorized to send the email to you.
I ask strangers who want to send email to me to include the ham password in the subject line of any email message. It’s simple, it’s straightforward, and people will do it. If a ham password is included in the subject line, it’s scored in my spam filters as much less likely to be spam. Current spam filtering tools have the serious risk of labelling good mail (“ham”) as spam; ham passwords are an effective way of eliminating this risk of spam filters. Note that the ham password has to be in the subject line; a spam message that tries to put a dictionary in the subject line could be considered as an attack to start with, and looking only at the subject line means there is practically no performance impact.
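To make the subject-line check concrete, here is a minimal sketch in Python. It is not taken from any particular spam filter: the function name, the size of the score bonus, and the case-insensitive matching are all assumptions made for illustration, and "george" is just a placeholder password.

```python
def adjust_spam_score(subject, base_score, ham_passwords, bonus=10.0):
    """Return a lower spam score if the subject contains a ham password.

    Only the subject line is examined, so the check is cheap.
    The bonus of 10.0 is an arbitrary illustrative value.
    """
    subject_lower = subject.lower()
    for password in ham_passwords:
        if password.lower() in subject_lower:
            return base_score - bonus
    return base_score


if __name__ == "__main__":
    passwords = ["george"]  # hypothetical ham password
    print(adjust_spam_score("Question about your essay (george)", 6.5, passwords))
    print(adjust_spam_score("Cheap pills!!!", 6.5, passwords))
```

A message whose adjusted score ends up below your filter's spam threshold is kept; everything else is handled by the filter as usual.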
If you have a ham password, you need to make sure that legitimate senders will be able to find it. There are many ways to do this; most people will probably want to do what I do, which is to use a single webpage that gives the email address and the ham password I want them to use. Generally, you shouldn’t just make either piece of information easy to read by machine. I give my email address as a set of instructions, and I make my ham password available as a shrouded graphical image (a picture with intentional distortions). I also provide human-readable textual instructions on how to determine the ham password for the sight impaired. I use textual instructions on how to create the ham password (like “The password is the letter g, followed by ‘eorge’”), but you could use a question of fact instead (“What is the first name of the first U.S. president?”), especially if you only want certain kinds of people to be able to contact you.
It’s very important to include a textual version, so that sight-impaired people can still get the ham password. You also shouldn’t include the ham password as simple text or simply make a picture of the ham password, because automated systems run by spammers might be able to figure that out easily.
Distributing ham password information via a web page is really easy to do. Most ISPs give people a small website space as part of their package, and many sites let people create a small web page for free. Even if you don’t know how to create a website, you probably can find someone who can do it for you. A website is an especially easy way to distribute the ham password, but it isn’t strictly necessary. I give the ham password to people I meet in person, and other distribution methods can work too. On a business card, you can just say “(include ‘X’ in subject)” after the email address.
In general, try to make sure that both pieces of information (address and ham password) are on only one web page. That will make it easy to change the ham password -- the only way people will get your email address will be by going to a single location, which you can later edit to change the ham password. You can see an example in my contact information.
I augment this approach with a variety of other anti-spam techniques. No single technique does everything; what I’ve found is that ham passwords make other techniques much more effective.
First, it’s useful to support more than one ham password. This way, when you switch from one ham password to another, you can offer a transition time (particularly important if you put the ham password on business cards). It’s even more useful for mailing lists, as I discuss below.
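One way to support a transition period is to give each ham password an optional last-accepted date. The sketch below assumes a small hand-maintained table; the passwords, the dates, and the function names are hypothetical.

```python
from datetime import date

# Hypothetical table: ham password -> last day it is accepted (None = no expiry).
HAM_PASSWORDS = {
    "oldword": date(2005, 3, 31),  # printed on old business cards
    "newword": None,               # current password
}


def accepted_passwords(today, table=HAM_PASSWORDS):
    """Return the set of ham passwords that are still valid on the given day."""
    return {
        password
        for password, expires in table.items()
        if expires is None or today <= expires
    }


def subject_has_ham_password(subject, today, table=HAM_PASSWORDS):
    """True if the subject contains any password that is valid today."""
    subject_lower = subject.lower()
    return any(pw in subject_lower for pw in accepted_passwords(today, table))
```

During the overlap both passwords work; after the expiry date only the new one does, without any further editing of the table.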
Another anti-spam technique I use is a “reply indicator” which is something like a ham password in reverse. When I reply to email messages, I include a small sequence of text that appears reasonable and is likely to be included in any replies. It’s not malicious text, or a tracking device; it’s just a special sequence of text that others are unlikely to create exactly by accident. I don’t particularly note “this is a reply indicator” in my messages, so that messages that eventually get widely circulated don’t automatically tip off a spammer to what I use as the reply indicator. When I receive a message, I have filters look in its body for the reply indicator. If the message has the reply indicator, it’s extremely likely to be a reply to one of my previous messages, and I adjust its spam score to show that it’s unlikely to be spam. This isn’t needed for a ham password technique, but you may find it a helpful addition.
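A reply indicator can be sketched in a few lines of Python. The phrase below is made up, and appending it as a closing line is only one assumed placement; any text that survives quoting in a reply would do.

```python
# Hypothetical reply indicator: reads like an ordinary closing line, but is
# unlikely to be typed by anyone else by accident.
REPLY_INDICATOR = "Kind regards from the blue desk"


def sign_outgoing(body):
    """Append the reply indicator to a message I am about to send."""
    return body.rstrip("\n") + "\n\n" + REPLY_INDICATOR + "\n"


def looks_like_reply(body):
    """True if an incoming body quotes the reply indicator.

    A plain substring test still matches when the line is quoted with "> ",
    but it fails if the sender's mail client re-wraps the line, so a short
    indicator is safer.
    """
    return REPLY_INDICATOR in body
```

An incoming message for which `looks_like_reply` is true would have its spam score lowered, in the same way as a message carrying a ham password.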
I also use traditional approaches to countering spam, in particular a whitelist, spam content filtering (including Bayesian filtering), a blacklist, and a “Junk mail” folder where apparently-spam messages go. The whitelist means that people who I do talk to are unlikely to be caught by any of this stuff, so they don’t need to include ham passwords or anything like that. Once I reply to someone they are usually put on my whitelist (so I will automatically receive emails from that individual). By using traditional spam content filtering and blacklisting (such as implemented by SpamAssassin), there’s a fair chance that people who I’ve never contacted before, and don’t use my ham password, will still reach me... but at least those legitimate people who really want to contact me have a way of increasing their chances.
Sometimes companies send you confirmation messages after you interact with their website, for example, airlines may send flight information, or confirm that you have a new account. Sadly, most of them don’t let you control what’s in the subject line. That’s okay. In that case, I’m expecting the email, so if I don’t see it I’ll open up my Junk mail folder and search for the message I was supposed to get. Note that I don’t have to read every message subject line in the Junk mail folder... I just have to search for the organization whose message I was expecting. Often I expect the message immediately, so I don’t even have to formally search, I just open up my Inbox folder immediately after I’ve paid/registered, and if it’s not there, open up my Junk mail folder... it should be among the ten most recent messages!
Domain authentication techniques (like SPF and DomainKeys) and user authentication techniques (like digital signatures from S/MIME and OpenPGP) can make the whitelisting far more effective. Then things can work quite cleanly: imagine that strangers send you a ham password to get started, and any email you send or reply to has that email address added to the whitelist; spammers can’t exploit that easily, because anyone in the whitelist has their domain authenticated. I intend to check SPF information at some point, though SPF has serious problems with forwarding (and I do use a forwarder), so I’m not sure how long it’ll be before I can really use SPF to filter incoming email. I do provide SPF information about me to others, so that people who use SPF can throw away some of the email that’s forged to look like it came from me.
Currently I automatically drop nearly all email error messages (basically the “email failed to be delivered” messages) unless they have my reply indicator or ham password. This definitely has disadvantages; this means that if an email I send fails to be received for some reason (say because of a bad email address), and the error reply doesn’t include the ham password or reply indicator, I have no way to know about the failure. But I must do something, because spammers forge my email address so often that I would probably receive tens of thousands of error messages a day if I didn’t do it. I typically include my ham password in messages I send to people for the first time; when I do that, since error messages typically quote the message that caused the error, I’ll typically get the error reply. One complication is that some error messages will reply with the original message in the body and not the subject line; this means that a good approach for getting valid error messages is to examine the body (as well as the subject line) for the ham password. That would work, though it’d take more CPU time, and you’d better choose a ham password that a spammer is unlikely to be able to automatically guess (say using a dictionary in the email). You could even choose a different ham password for this purpose than your “regular” ham password(s). A more sophisticated setup could keep track of emails that were sent, and then compare them to error messages received, and accept only error messages related to a message I’ve actually sent. That presumes I only send email in a way that a single system can track, which isn’t true. I haven’t bothered to get more sophisticated; this approach works well enough as it is. Others might not want to do this, or do it differently; it’s certainly not fundamental to the ham password scheme.
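The error-message rule could look roughly like the sketch below, which uses Python's standard email parser. How to recognize an error message is my assumption here: the sketch treats an empty return path or a "multipart/report" content type as a bounce, which catches common cases but not all. Searching the whole raw message covers both the subject line and a quoted original in the body, though it will miss quoted text that has been base64-encoded.

```python
from email import message_from_string


def is_bounce(msg):
    """Heuristic (an assumption of this sketch): bounces usually have an
    empty return path or are structured delivery reports."""
    return_path = (msg.get("Return-Path") or "").strip()
    return return_path == "<>" or msg.get_content_type() == "multipart/report"


def keep_message(raw_message, ham_passwords, reply_indicator):
    """Drop error messages unless they carry a ham password or reply indicator.

    Messages that are not bounces are always kept here; other filters
    decide what happens to them.
    """
    msg = message_from_string(raw_message)
    if not is_bounce(msg):
        return True
    haystack = raw_message.lower()
    markers = [reply_indicator, *ham_passwords]
    return any(marker.lower() in haystack for marker in markers)
```

Because the body is searched as well as the subject, the password used for this purpose should be one that a spammer cannot hit by stuffing a dictionary into the message.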
If you can control the ruleset of your spam content filtering system, then implementing ham passwords is easy: just add a rule for the ham passwords, and maybe another rule for the reply indicator.
But if you can’t manipulate your spam content filtering system, you can still implement ham passwords. Most mail systems have a simple rule system that can let you search subjects (and bodies) for specific phrases, and then if those phrases are included, directly move that message immediately into the Inbox or some other folder. That’s all you need to implement ham passwords. People already use these mechanisms so that, for example, they can automatically get messages that mention a topic they’re keenly interested in, or presort their messages into different folders, or trash certain spam, so the necessary mechanism is already widely implemented and understood by many. Many systems (like Runbox) even let you prioritize these rules, so it’s easy to make searching for the ham passwords a high-priority rule that supersedes many other rules. Here are a few examples:
I read messages with the ham password first. This is easy to do; just search for subject lines that contain the ham password. If you raise the priority of email that has the ham password, the reply indicator, or is from someone you know (e.g., someone in your address book), then it’s much easier to quickly review the other messages that were merely accepted by the spam filter as non-spam. In my situation, most of them are spam too, but it’s at least possible to review those quickly and bulk-delete them as spam once you’ve reviewed them. I actually don’t instantly delete messages that are determined to be spam; I hold them in a separate folder in case there’s an important message that someone mentions to me soon after. But I must delete the spam soon after; there’s just not enough space in the world to store an endless stream of spam.
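The reading order described here can be sketched as a simple sort. Messages are represented as plain dictionaries for illustration, and the relative ranking of the three trusted groups (ham password, then reply indicator, then address book) is my assumption; only "ham password first" is fixed.

```python
def priority(message, ham_passwords, reply_indicator, address_book):
    """Smaller number = read sooner. The ranking of groups is an assumption."""
    subject = message["subject"].lower()
    if any(pw.lower() in subject for pw in ham_passwords):
        return 0  # stranger who used the ham password
    if reply_indicator in message["body"]:
        return 1  # reply to something I sent
    if message["sender"].lower() in address_book:
        return 2  # someone I already know
    return 3      # merely passed the spam filter


def reading_order(messages, ham_passwords, reply_indicator, address_book):
    """Return messages sorted by priority; ties keep their arrival order."""
    return sorted(
        messages,
        key=lambda m: priority(m, ham_passwords, reply_indicator, address_book),
    )
```

Everything that ends up in the last group can then be skimmed and bulk-deleted in one pass.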
Another variation is to automatically add to your address book any sender who includes the ham password, and then allow anyone in your address book to be automatically accepted. I don’t do that, because there are so many spoofers in the world. Once SPF becomes more common, I may go ahead and do that; many other people seem happy doing that.
In theory, you could actually require strangers to include the ham password. But that’s really extreme, and I haven’t found that necessary; simply prioritizing messages has been sufficient for me, since it enables me to let the spam content filtering system automatically delete messages with less worry that important messages will be dropped. If you’re a stranger, I expect you to use my ham password; if you don’t, I’m likely to see the email but you run a much greater risk of getting your email thrown out, and you may not be seen as quickly (since I read ham passworded email first).
You don’t need to make it easy to get the ham password at all. Make it a complicated riddle, or only put it on your business card. But if you make it hard, people may not send you email you do want.
I can certainly imagine that it’d be very useful to create a new email header “Ham-Password:” with a ham password value, and look at it for a ham password. Mail clients could then fill in ham password values for each email address they send to. For this to work, ham password values would need to be harder to guess; spammers could include dictionaries of ham password values unless there was some bound to the number of ham passwords. But there are no clients today that support a ham password header, and I don’t have time to start a campaign to do so. I just ask people to put ham passwords in the subject line. While using the subject line is low-tech, it works so well that it’s hard to worry about making the approach fancier. If you automatically accept email from people you’ve had email with before (automatic whitelisting), there’s little need to do so, especially if you use SPF or DomainKeys to authenticate the domain of the sender.
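For illustration only, here is what a "Ham-Password:" header might look like with Python's standard email library. No mail client supports such a header today, so everything here is hypothetical: the per-recipient password table, the comma-separated format for multiple values, and the limit of five values used as the "bound" against dictionary stuffing.

```python
from email.message import EmailMessage


def build_message(sender, recipient, subject, body, known_passwords):
    """Build a message, filling in the recipient's ham password if known.

    known_passwords is a hypothetical client-side table: address -> password.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    if recipient in known_passwords:
        msg["Ham-Password"] = known_passwords[recipient]  # proposed header
    msg.set_content(body)
    return msg


def header_has_ham_password(msg, accepted, max_values=5):
    """Check the proposed header on the receiving side.

    A header carrying more than max_values entries is rejected outright,
    so a spammer cannot simply include a dictionary of guesses.
    """
    raw = str(msg.get("Ham-Password", ""))
    values = [value.strip() for value in raw.split(",") if value.strip()]
    if len(values) > max_values:
        return False
    return any(value in accepted for value in values)
```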
Many anti-spam systems don’t handle mailing lists well, but ham passwords can handle them quite nicely. (An earlier version of this article said that ham passwords didn’t handle mailing lists, but on further reflection I’ve realized I was wrong; ham passwords actually work quite nicely.) One interesting factor with mailing lists is that there are actually two email receptions that need to be controlled:
A mailing list can continue to use whatever mechanisms it uses to prevent spam from being sent to its members. A simple and common approach is to require that only members can send to the mailing list, for example.
However, a mailing list is a recipient, and thus could require the use of a ham password before it accepts a message to be sent on to its members. I’ll term this the “posting ham password” to distinguish it from other possible uses of ham passwords by mailing list recipients. As with all ham passwords, this could be in the subject, another header, or the message body, though the subject seems reasonable enough. Information about the posting ham password could be distributed in the same way it’s distributed for individuals, e.g., as a picture and text description on a website. The mailing list administrator would need to initially set up the posting ham password, but would only need to change it when a spammer abuses it.
It would be wise for the mailing list to strip away the posting ham password before sending the message on to recipients. This does mean that any digital signatures from the individual sender that cover the data will fail; one solution is to include the posting ham password in a header (such as the subject), and attach the actual signed message so that only the attachment is digitally signed.
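Stripping a posting ham password from the subject could be done along these lines. This is a sketch, assuming the password appears as a plain word, optionally wrapped in parentheses or brackets; the function name and "listpass" are made up. It does no word-boundary checking, so the password should not be a fragment of ordinary words.

```python
import re


def strip_posting_password(subject, posting_password):
    """Remove the posting ham password from a subject line.

    Returns (cleaned_subject, found), where found says whether the
    password was present, so the list can decide whether to accept the post.
    """
    pattern = re.compile(
        r"\s*[\(\[]?" + re.escape(posting_password) + r"[\)\]]?",
        re.IGNORECASE,
    )
    cleaned, count = pattern.subn("", subject)
    return cleaned.strip(), count > 0
```

The list software would call this before forwarding, reject or hold the post when `found` is false, and send on only the cleaned subject.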
Many combinations and variations are possible, which solve many typical mailing list problems. Each mailing list can set up the posting policy that’s best for it. Some mailing lists want strangers to be able to send to them without becoming a member, or allow subscribers to send from email accounts that are not the accounts they receive email on.
Here’s a plausible posting policy that allows posting by non-subscribers, yet protects members from spam; as an example of what you could do:
If you make ham passwords optional (as I do), then you don’t need to do anything to receive email from a mailing list. Hopefully, the information from the mailing list you’re subscribing to won’t look so much like spam that your filters will trap it. Frankly, if the mailing list allows anyone to send to it, that’s probably the best solution, since there probably is spam in it.
However, if the mailing list doesn’t normally include spam, there are other options. There are at least three ways that a ham password user can directly handle mailing lists to make sure they get their mailing list email: whitelisting, a list-defined reverse ham password, and a member ham password (having the mailing list include individual ham passwords with each message it sends). Each option is described below.
A “whitelist” is the simplest and most traditional approach. Here, before you add yourself to the mailing list, you modify the list of emails you’ll always accept email from (the “whitelist”) to include the mailing list’s address. If the mailing list always sends mail from a specific email address (say in the “From”, “Resent-From”, and/or “Reply-To” headers) this works well enough. Combining this with a simple authentication system (like SPF or DomainKeys) would be especially effective. If the mailing list address changes, then you’ll need to edit your whitelist, but mailing lists rarely change their address. Some mailing lists don’t tell you ahead-of-time what address they’ll send email from; this is a failure of documentation, and a simple request to the administrator should fix that for everyone. A few rare mailing lists simply forward emails without giving you useful information on what mailing list it came from; in that case, you’ll need to convince them to add that information or use a different approach.
Again, there are many ways that the member ham password could be provided in the message, such as via the subject line, a special Ham-Password header, or in the body. Using the subject is by far the simplest approach, while including information in the body could increase the workload of the recipients.
There’s also the question of whether or not the messages sent by the mailing list should be shared. If the mailing list sends a separate message to each recipient, then each recipient need not see the ham password of anyone else, a good thing. But using separate messages for each recipient will greatly increase the workload of the mailing list, a problem for very large mailing lists. The alternative is to include a long list of ham passwords in each message; this is probably impractical in the subject line, and annoying in the body, so this cries out for a special ‘Ham-Password’ header. Intermediate approaches are possible, where a single message is sent to a number (say up to 100) of the mailing list members, and multiple messages are sent out until the entire mailing list is covered.
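The intermediate approach (one message per group of up to 100 members, each carrying that group's ham passwords) can be sketched as follows. The data layout and function names are assumptions for illustration; a real list manager would attach the password list to a header such as the proposed "Ham-Password".

```python
def batches(items, size=100):
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_batches(member_passwords, size=100):
    """Group members into batches, pairing each batch with its ham passwords.

    member_passwords is a hypothetical table: member address -> member
    ham password. Each returned batch corresponds to one outgoing copy
    of the message.
    """
    addresses = sorted(member_passwords)
    return [
        {
            "recipients": chunk,
            "ham_passwords": [member_passwords[address] for address in chunk],
        }
        for chunk in batches(addresses, size)
    ]
```

With 250 members and the default size, this produces three outgoing copies, and each member's password is exposed to at most 99 other members rather than to the whole list.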
There are two variations for member ham passwords that I see: receiver-selected, and mailing-list-selected.
Member ham passwords require changes in the mailing list software, and more administration work to control them. These are pretty significant disadvantages, and their pain doesn’t currently seem worth the gain. Sending email with individual ham passwords is certainly possible, but currently the other approaches seem sufficient. But if in the future they aren’t sufficient, then these seem to be helpful for countering those future potential threats.
Suppliers sometimes want to add their customers to general mailing lists, and those can be handled as described above. But sometimes suppliers want to send automated notifications specifically to just you at a later date using an automated system. A common example is a shipping confirmation from a pre-existing order. This can be handled just like any other interaction; if you send an email to them first, your system could whitelist them. Alternatively, the supplier should tell you what address they’ll send emails from, and you could whitelist that.
For most folks, these are sufficient, and you don’t need to use ham passwords. But are there ways ham passwords could be used here too? Sure.
A small supplier could tell you their single (constant) reverse ham password. For example, a supplier could say “We always include [SUPPLIER-NAME] in the subject line when we send you notifications.” If you add that marker to your list of acceptable ham passwords, then you’ll get the email. But that won’t work for large suppliers; if Amazon.com had a single reverse ham password that everyone accepted, then spammers would start using it too.
A better solution for large suppliers is for the supplier to have a long unguessable secret key. Then the supplier can combine the key and the receiver’s email address using a cryptographic hash function, and tell each receiver “when we send a message to you, we’ll include this marker in the subject line”. As long as the key is kept secret, this foils spammers; the reverse ham password will be different for each combination of sender and receiver.
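A keyed hash of this kind is easy to sketch with Python's standard library. The text only calls for "a cryptographic hash function"; using HMAC-SHA256, lowercasing the address first, and truncating the result to 12 hexadecimal characters are my assumptions, chosen to keep the marker short enough for a subject line.

```python
import hashlib
import hmac


def reverse_ham_password(secret_key, receiver_address, length=12):
    """Derive a per-receiver marker from the supplier's secret key.

    secret_key must be bytes and should be long and unguessable.
    The same key and address always give the same marker, so the
    supplier does not need to store one marker per customer.
    """
    normalized = receiver_address.strip().lower().encode("utf-8")
    digest = hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()
    return digest[:length]
```

The supplier tells each customer their marker once (for example on the order confirmation page), and the customer adds it to their list of accepted ham passwords.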
You could also maintain a list of ham passwords, one for each recipient, just like the member ham password selected by the recipient for mailing lists. This would mean that receivers wouldn’t need to keep adding new ham passwords that they accept. The disadvantage is that this would require much more administrative work to store and permit administration over all those ham passwords, and a spammer might be able to break into the sender’s system and get all those general-use ham passwords.
Ham passwords can’t be kept secret in the absolute sense, because you want strangers to get them. But once the spammer gets a ham password, you can just switch to a new ham password (and a new reply indicator, if you use one). You can even have a transition time where you accept both the new and old ham passwords, and later on drop support for the old one. The goal here isn’t to make it impossible to get the ham password -- just to make it too expensive for a spammer to send messages to you. Typically spammers use automated techniques to acquire email account names, and this approach raises the bar by making those automated techniques less effective.
Basically, the goal is to make it so much work for the spammer that they’ll stop sending spam to you.
It’s not foolproof, but it does really help, and I’ve been using it (and mentioning it on my website) for a long time.
Ham passwords can be used in isolation, but I think they’re best used in conjunction with other techniques. In particular, ham passwords are great at countering the biggest problem with spam filters: incorrectly labelling good email as spam. Ham passwords make it possible for me to rely much more heavily on my filters, because if a stranger really wanted to contact me, he or she would use my ham password... so I don’t need to review the stuff labelled spam at all.
A great advantage of ham passwords is their simplicity. They require no changes in the software of the sender, receiver, or infrastructure.
This approach is similar to disposable addresses, passworded addresses, and “watchword” email addresses, where password-like information is encoded into the email address itself. See Spamgourmet’s advanced system for an example. However, once someone starts a dialogue with you, you often want to continue that dialogue. If the address has a limit on the number of receipts, then eventually that number is exceeded -- which you didn’t want. If it doesn’t have a limit on the number of receipts, eventually that email address will leak out to others, and now you’re getting spam. A trusted whitelist helps existing dialogues continue, but it doesn’t help all that much when strangers want to send you ham. Worse, because these systems must be easy for humans to use, it’s usually pretty trivial for a spammer to guess what your “real” email address is (or try all likely combinations). Fundamentally, email addresses were designed to be public and they’re hard to keep private. It’s a lot like logging into a computer that has usernames but no passwords -- it’s better than nothing, but usernames are hard to keep secret from automated systems that search for them. And finally, none of this works all that well with typical infrastructure -- you have to modify your systems in many cases for these to be effective. These are actually reasonable approaches, and many people are happy with them, but I think ham passwords are a reasonable alternative.
This approach is also similar to Hashcash. In hashcash, the sending computer (at least for a stranger) must do extra work to compute a value that the recipient can easily confirm. The difference is that in hashcash, a tiny amount of extra work is done by the sender’s computer, while in ham passwords, a tiny amount of extra work is done by the human sender (who has to type in the ham password). As long as this extra information is only needed for strangers to start a conversation, both approaches are quite reasonable. Hashcash’s advantage is that you don’t need to know anything about the recipient other than the email address, while for ham passwords, you now need to know the ham password. But unlike hashcash, ham passwords are easy to implement without changes to the sender, receiver, or infrastructure software. Ham passwords don’t require excessive computing resources, either; hashcash’s computing cost is a problem if senders have low-performance machines and spammers have tens of thousands of 0wned machines doing their computing for them. I actually like hashcash, but I get most of the benefits of hashcash using ham passwords without any of the headaches of an approach that isn’t yet widely adopted.
Unlike economic models (that charge for “first class postage”), this approach doesn’t require complex infrastructure changes, or a tollbooth, or allow large organizations to take control over everyone’s Internet email.
Habeas had a neat idea: create a copyrighted work, and if it was included, it’s ham. Problem is, many spammers are happy to do illegal activities, and as a result, most messages with the Habeas mark are actually spam. Habeas plans to create a whitelist of organizations, but really, it’s just another whitelist at that point. The problem is that the Habeas text is a constant known by all (including spammers). With the ham password approach, every single person has a different ham password, controlled by the recipient, so just including a single text string won’t do any good.
Challenge-response systems have their advantages. But some senders don’t like them, and there are so many spoofers that you’ll easily end up sending unwanted messages with one. They’re also a challenge to set up. Ham passwords, in contrast, are easy to set up.
In many ways, ham passwords aren’t really a contrast to other approaches as much as a technique that augments other approaches. When combined with spam filters, ham passwords counter their weakness -- incorrectly labelling ham as spam. When combined with domain authentication (where the ham password lets you receive email anyway), they deal with their problem -- that sometimes you want to allow anonymous email such as “tips” from whistleblowers, yet not allow spammers to get through with a simple forged email address.
Ham passwords aren’t perfect. Not everyone will include a ham password; thus, I don’t require them from strangers (though I could). But because I don’t require it, spam still gets through my spam filters, which is terrible. Nevertheless, because of ham passwords, I’m willing to make my filters much more rigorous; I can make my filters unusually strict, and I simply never even review the email that my initial filter claims is spam. Basically, ham passwords have made it possible for me to use email without worrying that I’m losing critical email from strangers. If it was critical to the stranger, they’d use the ham password.
Originally I called this approach an “email password”, since it was the first name I could think of. But accessing email often involves many passwords, so it wasn’t a good name; I even acknowledged that it wasn’t a good name in my original online paper. I considered using the term “hamword” instead, but a Google search revealed that the term “hamword” is often used in technical anti-spam literature with a slightly different meaning. And “hampass” is easily mispronounced in an undesirable way.
On November 16, 2004, I settled on using the term “ham password”. Ham passwords are little passwords created by the receiver that senders can use to prove that their content is “ham” (as opposed to spam), so combining “ham” and “password” seemed reasonable.
Since anyone can throw away any email message they receive, there’s absolutely no law against using ham passwords. And since reply indicators are simply a string of text that others can include (or not), I see no reason for them to cause legal troubles either.
In the long term, I believe that governments must step in and make spamming itself illegal (instead of just fraudulent headers and spamming after a “please stop” message), with serious legal teeth (including possible prison terms and financial bankruptcy) for the spammer and the people who fund them. Spamming is a massive denial-of-service attack against people’s email accounts, and it’s basically a theft of service. After all, it doesn’t matter if you own an email account if you can’t use it. But the history of computer crime law shows that this takes time; it took a long time for laws to be written to criminalize computer crime, and it’s really still in process. Current laws are completely ineffective; that’s especially true for the U.S. law, since it requires opt-out of a mailing list (an approach already shown to be worthless) instead of an opt-in approach to mailing lists. But that’s not surprising; lawmakers often try to make small steps in the hopes that they will incrementally ratchet things up until they solve the problem. Eventually laws with real teeth will have to be written, or email will be useless. Since legislatures like to use email, and have constituents who will throw them out if their email is made useless, in the long term I think the spam epidemic will be addressed.
But in the short term, we need to use email in spite of inadequate laws. Approaches like ham passwords will hopefully make it possible to keep using our email until the legislatures around the world catch up to the technology.
If you’re curious, you can also see my essay on stopping spam and my paper on guarded email; they both include many references to other anti-spam techniques. There’s a long list of ideas for countering spam, and I think a combination will probably be best. Besides SPF, one approach that looks very promising is Hash Cash, which uses a computational “stamp” to give evidence to others that you’re not a spammer.
I’ve also written several articles on other topics you may find interesting, including writing secure programs and quantitative reasons why you should consider open source software / Free Software when acquiring software. Many in the information technology community find my paper on key software innovations to be interesting. And of course, you can always visit my home page.