Why Is Email So Complicated? Part 150: We Don't Know Who It's From
A bell rings, or perhaps an annoying voice says "You've Got Mail." More often than not, the first question you ask is, "Who's it from?" How (or even whether) you react to a message depends tremendously on who you think sent it. Unfortunately, it's almost impossible to know for sure.
One of the reasons for the Internet's success -- and one of the things that governments and intellectual property owners find most annoying about it -- is its openness. No one owns the net; people just agree to interconnect their machines over communications networks, and to send and receive information, mostly but not necessarily using standard formats and protocols.
Those who disapprove of openness may yet triumph, but even then, the legacy of openness will complicate any attempt to make it easier to be sure of the source of an email message. Because email is sent between independent consenting peers, with no ability to enforce anything, our email protocols have been designed as if everyone were equally trustworthy, which has led to an almost universal lack of trust.
So when you get a message claiming to be from me, and it says
From: Nathaniel Borenstein <[email protected]>
what do you actually know? The two simplest answers ("nothing" and "it's really from Nathaniel") are demonstrably wrong. In fact, you have nothing like certainty, but you may have dozens of clues, and you may be able to deduce some intermediate conclusions with high confidence.
Most of the clues are in the hidden part of the message header that you rarely see. Even the most basic "unauthenticated" email will have some clues, because modern mail transfer agents record, in the header lines they add, the IP address from which they received the message. So if you're a colleague of mine at Mimecast, but the message headers say it came from an external IP address, you have reason to be more skeptical than usual. On the other hand, if the headers show that the message never left your organization, that's something you can usually rely on very strongly. But there are always exceptions. I could legitimately send mail from my corporate email address while using a BlackBerry, Gmail, a web site, etc., and this valid mail would look suspicious.
So this means you have to start asking questions about those external IP addresses. Some of them are plausible sources of mail from me, and most are not. Unfortunately, you're still not done; there are various ways that the logged address could be faked, or that legitimate sites like BlackBerry might be tricked into sending out illegitimate messages in my name. Certainty recedes ever further into the background.
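To make the header clue concrete, here is a minimal Python sketch that pulls the IP addresses out of a message's Received header lines and asks whether every recorded hop was inside the organization. The internal range (10.0.0.0/8), the sample message, and the function names are all assumptions made up for illustration, and the sketch handles only IPv4 addresses written in square brackets. As noted above, earlier hops can forge these lines, so the result is a clue and not a proof.

```python
import ipaddress
import re
from email import message_from_string

# Hypothetical internal network range, chosen only for illustration.
INTERNAL_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]

# Matches a bracketed IPv4 address such as [203.0.113.7].
IP_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def received_ips(raw_message):
    """Return IPv4 addresses found in Received headers, newest hop first."""
    msg = message_from_string(raw_message)
    ips = []
    for header in msg.get_all("Received", []):
        for match in IP_IN_BRACKETS.findall(header):
            try:
                ips.append(ipaddress.ip_address(match))
            except ValueError:
                pass  # looked like an address but was not valid (e.g. 999.1.1.1)
    return ips


def stayed_internal(raw_message):
    """True only if at least one hop was recorded and all hops were internal."""
    ips = received_ips(raw_message)
    return bool(ips) and all(
        any(ip in net for net in INTERNAL_NETWORKS) for ip in ips
    )


# Made-up sample message using documentation-only names and addresses.
SAMPLE = (
    "Received: from mail.example.net (mail.example.net [203.0.113.7])\n"
    "\tby mx.example.com with ESMTP; Mon, 1 Jan 2024 10:00:00 +0000\n"
    "Received: from laptop (laptop.internal [10.1.2.3])\n"
    "\tby mail.example.net with ESMTP; Mon, 1 Jan 2024 09:59:58 +0000\n"
    "From: Someone <someone@example.com>\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)

if __name__ == "__main__":
    print([str(ip) for ip in received_ips(SAMPLE)])
    print("stayed internal:", stayed_internal(SAMPLE))
```

A real filter would treat an external hop as one input among many, since legitimate mail sent from a phone or a webmail account would fail this check too.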
The leading approach to breaking the cycle is cryptographic authentication. Cryptography can be used to authenticate an individual email sender (PGP, S/MIME) or the sending domain (DKIM). The former has been around far longer, but the latter is seeing better adoption because end users don't have to do anything. With the help of such authentication, we can deduce new and useful facts about messages, e.g. "the message definitely came from mimecast.com."
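As a rough illustration of where the domain-level claim lives, the Python sketch below reads the d= tag (the signing domain) from any DKIM-Signature header in a message. The function name and the sample header values are invented for this example. Note what it does not do: it does not verify anything. Actual DKIM verification means fetching the domain's public key from DNS and checking the signature over the signed headers and body, which this sketch leaves out entirely.

```python
from email import message_from_string


def dkim_claimed_domains(raw_message):
    """Return the d= value from each DKIM-Signature header.

    This reads the claimed signing domain only; it performs no
    cryptographic verification.
    """
    msg = message_from_string(raw_message)
    domains = []
    for header in msg.get_all("DKIM-Signature", []):
        # The header is a semicolon-separated list of tag=value pairs.
        for part in header.split(";"):
            name, sep, value = part.partition("=")
            if sep and name.strip() == "d":
                # Drop folding whitespace; domain names are case-insensitive.
                domains.append("".join(value.split()).lower())
    return domains


# Made-up header with placeholder hash and signature values.
SAMPLE_DKIM = (
    "DKIM-Signature: v=1; a=rsa-sha256; d=example.com; s=selector1;\n"
    "\th=from:subject; bh=AAAA=; b=BBBB=\n"
    "From: Someone <someone@example.com>\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)

if __name__ == "__main__":
    print(dkim_claimed_domains(SAMPLE_DKIM))
```

Until the signature has been checked, the d= value is just one more unverified claim, no better than the From line itself.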
But what if sleuthing reveals the relatively definitive information that a message came from hitherto-unknown AbsolutelyTrustworthy.com? Does that mean you should jump at the opportunity to buy a magic potion, or wire money to Nigeria? Probably not, but it might mean that your cousin has opened up a new web store. There is no clear line separating the most subtle scam from the crudest marketing.
What's needed next is a way to ask trusted authorities -- whoever they are -- whether or not a given domain should be trusted. Domain reputation standards don't exist yet, but we're working on them at the IETF. This will help, eventually, but plenty of challenges and complexities will still remain.
So, when next you check to see who your new mail is from, remember that you can't expect more than a best guess. It will be right the vast majority of the time, but that doesn't mean it's simple. It means that an amazing amount of work is being done to analyze every single message you receive.
Beyond sender and domain verification, authentication mechanisms in the future may be used to verify a sender's role or other secondary factors ("is over 18," or "AARP member"), or even the payment of money associated with the message. I'll have a lot more to say about authentication and trust in future installments in this series.
Of course, that's still oversimplified....
Nathaniel Borenstein <[email protected]>