2.4. Greylisting

The greylisting concept is presented by Evan Harris in a whitepaper at: http://projects.puremagic.com/greylisting/.

2.4.1. How it works

Like SMTP transaction delays, greylisting is a simple but highly effective mechanism to weed out messages that are being delivered via Ratware. The idea is to establish whether a prior relationship exists between the sender and the receiver of a message. For most legitimate mail it does, and the delivery proceeds normally.

On the other hand, if no prior relationship exists, the delivery is temporarily rejected (with a 451 SMTP response). Legitimate MTAs will treat this response accordingly, and retry the delivery in a little while[1]. In contrast, ratware will make repeated delivery attempts right away, simply give up and move on to the next target in its address list, or both.

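Three pieces of information from a delivery attempt, referred to as a triplet, are used to uniquely identify the relationship between a sender and a receiver: the IP address of the sending host, the envelope sender address, and the envelope recipient address.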

If a delivery attempt was temporarily rejected, this triplet is cached. It remains greylisted for a given amount of time (nominally 1 hour), after which it is whitelisted, and new delivery attempts would succeed. If no new delivery attempts occur prior to a given timeout (nominally 4 hours), then the triplet expires from the cache.

If a whitelisted triplet has not been seen for an extended duration (at minimum one month, to account for monthly billing statements and the like), it is expired. This prevents unlimited growth of the list.
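The life cycle described above can be sketched in a few lines of Python. This is only an illustration, not the code of any particular greylisting implementation: the names Greylist, Entry and check are hypothetical, the cache is a plain in-memory dictionary, and the whitelist lifetime of 31 days is an assumed value chosen to satisfy the "at minimum one month" guideline.

```python
from dataclasses import dataclass

GREYLIST_DELAY = 60 * 60                 # 1 hour before a retry is accepted
RETRY_WINDOW = 4 * 60 * 60               # greylisted triplet expires after 4 hours
WHITELIST_LIFETIME = 31 * 24 * 60 * 60   # assumed value: "at minimum one month"


@dataclass
class Entry:
    first_seen: float        # time of the attempt that created the entry
    last_seen: float         # time of the most recent accepted attempt
    whitelisted: bool = False


class Greylist:
    """Hypothetical in-memory triplet cache; times are seconds."""

    def __init__(self):
        self.entries = {}

    def check(self, ip, sender, recipient, now):
        """Return the SMTP reply code (250 or 451) for one delivery attempt."""
        triplet = (ip, sender, recipient)
        entry = self.entries.get(triplet)

        # Drop entries that have outlived their timeout.
        if entry is not None and self._expired(entry, now):
            del self.entries[triplet]
            entry = None

        if entry is None:
            # No prior relationship: cache the triplet and defer.
            self.entries[triplet] = Entry(first_seen=now, last_seen=now)
            return 451

        if entry.whitelisted or now - entry.first_seen >= GREYLIST_DELAY:
            entry.whitelisted = True
            entry.last_seen = now
            return 250

        return 451  # retried before the greylisting delay has passed

    @staticmethod
    def _expired(entry, now):
        if entry.whitelisted:
            return now - entry.last_seen > WHITELIST_LIFETIME
        return now - entry.first_seen > RETRY_WINDOW


gl = Greylist()
triplet = ("192.0.2.1", "alice@example.org", "bob@example.net")
print(gl.check(*triplet, now=0))       # 451: first contact
print(gl.check(*triplet, now=600))     # 451: retried too soon
print(gl.check(*triplet, now=3600))    # 250: retried after one hour
```

A real implementation would keep the cache in a database and expire old rows periodically rather than on lookup.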

These timeouts are taken from Evan Harris' original greylisting whitepaper (or should we say, ahem, "greypaper"?). Some people have found that a larger timeout may be needed before greylisted triplets expire, because certain ISPs (such as earthlink.net) retry deliveries only every 6 hours or similar[2].
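To see why a slow retry schedule is a problem, consider a small simulation. It is a sketch under simplifying assumptions: the sender retries at a fixed interval, all times are in hours, and the function name first_accepted_attempt is hypothetical.

```python
def first_accepted_attempt(retry_interval, delay=1.0, window=4.0, max_attempts=20):
    """Return the time (in hours) of the first accepted attempt, or None.

    Attempts are made at 0, retry_interval, 2 * retry_interval, ...
    """
    first_seen = None
    for n in range(max_attempts):
        now = n * retry_interval
        if first_seen is None or now - first_seen > window:
            # Unknown or expired triplet: greylist it (again).
            first_seen = now
            continue
        if now - first_seen >= delay:
            return now  # the greylisting delay has passed: accept
    return None


print(first_accepted_attempt(0.5))              # 1.0
print(first_accepted_attempt(6.0))              # None
print(first_accepted_attempt(6.0, window=8.0))  # 6.0
```

With the nominal 4-hour timeout, a sender that retries every 6 hours always finds its triplet expired and is greylisted afresh each time. Raising the timeout above the retry interval lets the second attempt through.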

2.4.2. Greylisting in Multiple Mail Exchangers

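If you operate more than one incoming mail exchanger, and each exchanger maintains its own greylisting cache, then a retried delivery may reach a different exchanger than the one that issued the first 451 response and be deferred again, and a triplet that has already been whitelisted on one exchanger is still unknown to the others.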

For these reasons, you may want to implement a solution where the database of greylist triplets is shared between your incoming mail exchangers. However, since the machine that hosts this database would become a single point of failure, you would have to take a sensible action if that machine is down (e.g. accept all deliveries). Or you could use database replication techniques and have the SMTP server fall back to one of the replicating servers for lookups.
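The fallback behaviour might look like the following sketch. It assumes each database (primary or replica) is wrapped in a callable that returns 250 or 451 for a triplet and raises an exception when it cannot be reached. The names LookupUnavailable and greylist_reply, and the two stub backends, are hypothetical and stand in for whatever database client you actually use.

```python
class LookupUnavailable(Exception):
    """Raised by a backend that cannot be reached (hypothetical)."""


def greylist_reply(triplet, now, backends):
    """Ask each backend in turn; accept the message if none can answer."""
    for backend in backends:
        try:
            return backend(triplet, now)
        except LookupUnavailable:
            continue  # this server is down, try the next replica
    return 250  # fail open: accept all deliveries while the database is down


# Stub backends for illustration only.
def primary_down(triplet, now):
    raise LookupUnavailable("primary database unreachable")


def replica_defers(triplet, now):
    return 451


triplet = ("192.0.2.1", "alice@example.org", "bob@example.net")
print(greylist_reply(triplet, 0, [primary_down, replica_defers]))  # 451
print(greylist_reply(triplet, 0, [primary_down]))                  # 250
```

Accepting mail when no database answers trades a brief loss of filtering for the guarantee that legitimate mail is never deferred indefinitely.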

2.4.3. Results

In my own experience, greylisting gets rid of about 90% of unique junk mail deliveries, after most of the SMTP checks previously described are applied! If you used greylisting as a first defense, it would likely catch an even higher percentage of incoming junk mail.

Conversely, there are virtually zero False Positives resulting from this technique. All major Mail Transport Agents perform delivery retries after a temporary failure, in a manner that will eventually result in a successful delivery.

The downside to greylisting is that legitimate mail from people who have not e-mailed a particular recipient in the past is subject to a one-hour delay (or maybe several hours, if you operate several MX hosts).

See also What happens when spammers adapt....

Notes

[1]

Although rare, some "legitimate" bulk mail senders, such as groups.yahoo.com, will not retry temporarily failed deliveries. Evan Harris has compiled a list of such senders, suitable for whitelisting purposes: http://cvs.puremagic.com/viewcvs/greylisting/schema/whitelist_ip.txt?view=markup.

[2]

Large sites often use multiple servers to handle outgoing mail. For instance, one server or pool of servers may be used for immediate delivery. If the first delivery attempt fails, the mail is handed off to a fallback server which has been tuned for large queues. Hence, from such sites, the first two delivery attempts will fail.