Proper encryption ensures confidentiality, but what about integrity?
Authentication is how we address it.
Alice and Bob - the ubiquitous crypto-couple. It's no surprise that they've seen their fair share of scenarios; after all, they are the most publicized fictitious duo this side of mathematics. Once upon a time, they decided to establish a channel of communication, and with their pre-shared symmetric key, using a relatively secure block cipher in a decent confidentiality mode, they began to converse. They did so under the assumption that Eve, the eavesdropper, would discover that they were communicating, but be limited in what she could learn about the conversation. Under their assumption of confidentiality, Eve can learn only that they are communicating, who is communicating, how much is being communicated, and when.
A MAC - what it is and what it does
Alice and Bob feel relatively safe knowing that their level of confidentiality frustrates Eve's attempts to read their communication, but what they don't know is that Eve is still trivially capable of a devastating act that confidentiality provides no protection against - manipulating their conversation. In their allegedly secure channel, they've failed to realize that Eve's ability to alter their conversation, or breach its integrity, is likely much more detrimental than her merely being able to passively read it, or breach its confidentiality. Fortunately, their friend, Charley the Cryptanalyst, agreed to take a look at their secure channel parameters and tell them whether or not it passed muster. Once he notices they have no authentication mechanism, he explains the vulnerability to Alice and Bob and recommends a Message Authentication Code, or MAC, as the solution for integrity.
Now that we've established our scenario, let's get down to defining the general construct of a MAC and its purpose.
A MAC, or Message Authentication Code, is founded on the concept of rendering your conversation tamper-evident; it's a method for providing message integrity. Where confidentiality prevents Eve from reading our conversation, integrity ensures that Eve can't alter our conversation without being detected, which is much more important to us. After all, keeping a secret is great, but only if we know who we're keeping a secret with and if that secret remains intact between Alice and Bob. This is what message integrity intends to provide assurance for. A MAC, just like a symmetric cipher, uses a secret key, denoted by K, which both Alice and Bob know, but Eve does not. When communicating a message, denoted by m, Alice uses the secret key, K, to compute a code with the general function MAC(K, m), and she sends this computed code along with the message. On receipt, Bob recomputes MAC(K, m) over the message he received and accepts it only if the result matches the code that came with it. The code, or "tag," and key are of a fixed length, while the message is of arbitrary length.
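To make MAC(K, m) concrete, here is a minimal sketch in Python. It assumes HMAC-SHA256 as the MAC; the article doesn't name a particular construction, so treat that choice, and the function names mac and verify, as illustrative only.

```python
import hashlib
import hmac
import secrets


def mac(key: bytes, message: bytes) -> bytes:
    """Compute MAC(K, m). HMAC-SHA256 is an assumed instantiation."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag over the received message and compare in constant time."""
    return hmac.compare_digest(mac(key, message), tag)


K = secrets.token_bytes(32)   # the pre-shared secret key
m = b"Meet me at noon."
tag = mac(K, m)               # Alice sends (m, tag)

print(verify(K, m, tag))                      # Bob accepts the untouched message
print(verify(K, b"Meet me at nine.", tag))    # Bob rejects Eve's altered message
```

The tag is always 32 bytes here, no matter how long m is, which matches the fixed-length tag and arbitrary-length message described above.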
The Horton Principle and Partiality Towards Authentication Over Encryption
However, Eve can still mount replay attacks at will: a recorded message and its tag remain perfectly valid if she sends them again. So, we authenticate some extra data along with the message, denoted as d, which contains a message identification number and other information, such as where the data is being sent from and where it's intended to go. This is just one sign that applying a MAC isn't as easy as you might think; there are certain intricacies to address. Now seems like an appropriate time to introduce the simple concept behind the Horton Principle. The principle simply suggests, "Hey, how about we authenticate what we mean, instead of what we say?"
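One way the extra data d can stop a replay is sketched below. It assumes d is nothing more than an 8-byte message number and that Bob rejects any number he has already passed; the names tag_for and Receiver are hypothetical, and HMAC-SHA256 is again an assumed choice of MAC.

```python
import hashlib
import hmac
import struct


def tag_for(key: bytes, msg_id: int, message: bytes) -> bytes:
    # d is a fixed-width 8-byte message number, so d || m splits unambiguously.
    d = struct.pack(">Q", msg_id)
    return hmac.new(key, d + message, hashlib.sha256).digest()


class Receiver:
    """Bob's side: checks the tag, then checks that the message number is new."""

    def __init__(self, key: bytes):
        self.key = key
        self.last_id = 0  # highest message number accepted so far

    def accept(self, msg_id: int, message: bytes, tag: bytes) -> bool:
        expected = tag_for(self.key, msg_id, message)
        if not hmac.compare_digest(expected, tag):
            return False          # forged or altered
        if msg_id <= self.last_id:
            return False          # valid tag, but already seen: a replay
        self.last_id = msg_id
        return True


key = b"\x07" * 32
bob = Receiver(key)
t = tag_for(key, 1, b"transfer 10 to Charley")

print(bob.accept(1, b"transfer 10 to Charley", t))  # first delivery is accepted
print(bob.accept(1, b"transfer 10 to Charley", t))  # Eve's replay is rejected
```

Because the message number is covered by the tag, Eve can't simply bump the number on a recorded message; doing so invalidates the tag.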
There may be a situation where Alice authenticates a message that is a concatenation of multiple data fields. Bob must know the parsing information necessary to decompose that concatenation into its separate data fields; otherwise, it's possible for illegitimate data to be authenticated, by Bob's accident or Eve's attack. So, to address this with the Horton Principle, we want to authenticate not only the message, m, but also the parsing information for m. View it as a means of providing semantics for interpreting data in its intended context, rather than just sending and receiving generic bytes of data. This aims to ensure that what we mean is authenticated, rather than what we say.
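The sketch below illustrates the parsing problem. It assumes a simple length-prefixed framing as the "parsing information"; that framing, and the names naive_encode and framed_encode, are illustrative choices rather than anything the Horton Principle prescribes.

```python
import hashlib
import hmac
import struct


def mac(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()


def naive_encode(fields):
    # Plain concatenation: the field boundaries are NOT part of what gets authenticated.
    return b"".join(fields)


def framed_encode(fields):
    # Field count, then each field preceded by its length: the boundaries are authenticated too.
    out = struct.pack(">I", len(fields))
    for f in fields:
        out += struct.pack(">I", len(f)) + f
    return out


K = b"\x01" * 32
meant = [b"pay", b"bob", b"100"]
twisted = [b"pay", b"bob1", b"00"]   # same bytes, different field boundaries

naive_collides = mac(K, naive_encode(meant)) == mac(K, naive_encode(twisted))
framed_collides = mac(K, framed_encode(meant)) == mac(K, framed_encode(twisted))
print(naive_collides, framed_collides)  # True False
```

With plain concatenation, both field lists reduce to the same bytes, so one tag vouches for two different meanings. Once the lengths are authenticated as well, the tag covers what Alice meant, not just what she said.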
The precedence of authentication over encryption, and the order of both
Earlier, we made it clear that there are situations where manipulating the data is more detrimental than merely divulging it, which suggests that authentication may hold a certain precedence over encryption. Interestingly, it does. In these general situations, authentication is more salient, which is why we design our system's attack model with more partiality to it. So, how do we demonstrate this partiality? Authentication before encryption: Alice computes the MAC over the plaintext, then encrypts the message and the tag together. Besides, this directly upholds the basis of the Horton Principle, which is one of the design strategies we'd like to satisfy. Because of this partiality, we'd rather have Eve attack an outermost layer of encryption than an outermost layer of authentication.
Of course, this order isn't the only secure configuration; there are schemes in which a secure system results from encrypting first, then authenticating second, as well. This approach just happens to satisfy our design strategy under the Horton Principle, along with the concept that abuse of authentication, or the lack thereof, matters more to our model of a generic secure channel; it does so with the simplicity we desire in any design. Realistically, this isn't a one-size-fits-all hat; it takes assessment of a system, at the nitty-gritty level, to arrive at a reasonable sense of where authentication fits into your threat model. Don't be afraid to be conservative, a little paranoid, and to take into consideration even what may seem to be only a remotely practical threat. But, as far as threat models go, I'll open that can of worms later [in another article]; it's the precursor to the design phase.
Looking at things from the opposite angle
Having said that, I'm going to turn completely around and briefly suggest a defense for the case of encrypting first, then authenticating second. That's right. Ready? Alright, the following terms relate to cryptographic theory that is far beyond the scope of this article, but they will prove invaluable to any academics who may peruse this and find themselves craving a more extensive portrayal of the subject matter. As such, given the constraints of this article, and the general audience, I won't actually provide the mathematical foundation; that's up to the reader to explore further, and I'll be more than glad to assist them in doing so.
Compositions and notions of security
So, without further ado, let's lay these terms on the table. There are three compositions for encryption and authentication that are usually referenced: Authenticate-then-Encrypt (AtE), Encrypt-then-Authenticate (EtA), and Encrypt-and-Authenticate (E&A). I'm only going to focus on the first two, AtE and EtA, where the former is what we've already discussed in the bulk of this article, and the latter is a composition that, in many of the cases I've encountered while analyzing a system, is the more responsible choice. Yes, I've spent the majority of this article advocating the principles behind AtE, and now, at the end, I'm suggesting that EtA is often the better, simpler choice.
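The three compositions differ only in what the MAC is computed over and whether the tag ends up inside the ciphertext. The sketch below shows that difference under some assumptions: encrypt is a caller-supplied function standing in for encryption under a separate encryption key, HMAC-SHA256 is an assumed MAC, and toy_encrypt is a placeholder that is not a cipher at all; it exists only so the example runs.

```python
import hashlib
import hmac
from typing import Callable, Tuple

Encrypt = Callable[[bytes], bytes]  # hypothetical: encryption under its own key


def tag(mac_key: bytes, data: bytes) -> bytes:
    return hmac.new(mac_key, data, hashlib.sha256).digest()


def authenticate_then_encrypt(encrypt: Encrypt, mac_key: bytes, m: bytes) -> bytes:
    # AtE: tag the plaintext, then encrypt message and tag together.
    return encrypt(m + tag(mac_key, m))


def encrypt_then_authenticate(encrypt: Encrypt, mac_key: bytes, m: bytes) -> Tuple[bytes, bytes]:
    # EtA: encrypt first, then tag the ciphertext.
    c = encrypt(m)
    return c, tag(mac_key, c)


def encrypt_and_authenticate(encrypt: Encrypt, mac_key: bytes, m: bytes) -> Tuple[bytes, bytes]:
    # E&A: encrypt the plaintext and, separately, tag the plaintext.
    return encrypt(m), tag(mac_key, m)


def toy_encrypt(data: bytes) -> bytes:
    # PLACEHOLDER ONLY: a fixed XOR mask, used so the sketch runs. Not secure.
    return bytes(b ^ 0x5A for b in data)


mac_key = b"\x02" * 32
m = b"hello, Bob"
print(len(authenticate_then_encrypt(toy_encrypt, mac_key, m)))  # message length + 32
```

Note that only in EtA can Bob check the tag before he decrypts anything.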
With AtE, you must be extra-meticulous, as there are fragilities that aren't forgiving, and even the most subtle of changes can shift the scheme from secure to insecure. Remember, however, that being meticulous when instantiating an authentication scheme is a given, with either AtE or EtA. Both compositions can be made secure or insecure; this has been demonstrated in practice. In the end, you may find that EtA is often the simpler choice, and makes a decent default composition for newly deployed systems. In fact, a responsible suggestion, in my opinion, based on research results, is to authenticate the ciphertext of an IND-CPA secure encryption scheme with a SUF-CMA secure MAC; this yields IND-CCA2 security and achieves INT-CTXT, which are desirable properties. (Also, consider relationships between notions of security, such as implications between indistinguishability and non-malleability, for example.)
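Here is a rough end-to-end sketch of the EtA flow: encrypt, tag the ciphertext, and refuse to decrypt anything whose tag fails. Everything in it is an assumption for illustration. The "cipher" is a toy keystream built from HMAC-SHA256 in a counter arrangement, standing in for a real IND-CPA secure scheme; the names seal and open_sealed are hypothetical; and HMAC-SHA256 is the assumed MAC. It is not something to deploy.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # TOY stand-in for a real IND-CPA secure cipher. Illustration only.
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def seal(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)  # fresh per message
    ciphertext = _xor(plaintext, _keystream(enc_key, nonce, len(plaintext)))
    # The tag covers the nonce and the ciphertext, never the plaintext.
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def open_sealed(enc_key: bytes, mac_key: bytes, blob: bytes):
    if len(blob) < 16 + 32:
        return None
    nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return None  # reject before any decryption happens
    return _xor(ciphertext, _keystream(enc_key, nonce, len(ciphertext)))


enc_key = secrets.token_bytes(32)   # separate keys for encryption and authentication
mac_key = secrets.token_bytes(32)
blob = seal(enc_key, mac_key, b"attack at dawn")
tampered = blob[:20] + bytes([blob[20] ^ 1]) + blob[21:]

print(open_sealed(enc_key, mac_key, blob))      # b'attack at dawn'
print(open_sealed(enc_key, mac_key, tampered))  # None
```

Using two independent keys and checking the tag before touching the ciphertext are the two habits this sketch is meant to show; swapping in a vetted cipher and MAC is left to the reader's own system.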
While this is just one opinionated philosophy, there is one echo emanating from every approach to designing a secure channel - you need both confidentiality and integrity. If you find yourself habitually applying authentication wherever encryption exists, then hats off to you.