IPSec (IP Security): Part 1 – ESP


If you would like to read the other parts of this article series please read:

IPSec or IP Security

IPSec is a topic which, when broached, often elicits blank stares or puzzled comments. This area of computer security and protocol usage definitely bears further scrutiny, as it impacts both home users and corporate users. You may already be using it without being aware of the fact. In an effort to demystify IPSec I shall discuss various aspects of it. Some may disagree that all of the following topics fall under IPSec, but they will be covered here: ESP, AH, IKE, and GRE.

We will also look at how these various schemes work. Further to that, we will look at a specific packet example of ESP (Encapsulating Security Payload) as it would appear on the wire, and elaborate on why analysis of this type of traffic from a security perspective is near impossible. Also covered will be a GRE packet and the often confusing topic of IKE (Internet Key Exchange), along with its role in computer security. By the end of this article series you will be far better informed on what IPSec is and how it impacts you. This is important, seeing as you may be tasked with purchasing a VPN solution for your company in the near future. The knowledge contained in this article series will allow you to make a far more informed decision.

ESP is not only for the movies!

We are indeed talking about Encapsulating Security Payload and not Extra Sensory Perception. ESP is one of the better known facets of IPSec and is readily recognizable when viewed at the packet level, as you will soon see. This protocol came about as a way of securing information in transit between two end points, and some people refer to it as a tunnelling protocol. This is due to the fact that the information is encapsulated in ESP before it is sent out over the wire. It can be used in both IPv4 and IPv6, and a little known fact is that ESP can be used in conjunction with AH (more on AH later). Because IPv4 was designed without any such protection built in, and IPSec was added on afterwards, many people say that IPv4 is inherently insecure. Well, that is very much an arguable statement now, isn't it?

ESP itself can be used in two separate modes, which are known as "tunnel" and "transport". In tunnel mode the entire original IP datagram, headers included, is placed in the encrypted part of ESP, and the resulting ESP packet is in turn placed into a new datagram whose IP header is unencrypted. Sadly there is no simpler way of explaining this, but the key point is that the original datagram travels inside the encrypted part of ESP, wrapped in an outer datagram with a cleartext IP header. Then there is what is called transport mode. In transport mode the ESP header is inserted right after the original IP header in the packet itself, and only the data that follows it is encrypted. This mode is by far the simpler to understand! I encourage you to look up a diagram that shows the two modes side by side, for it will certainly help clarify the difference between them. The more senses (visual, reading and so on) that you involve when trying to learn something, the better.
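The two modes can be sketched in a few lines of Python. This is only a labelled outline of what sits where in the packet, not real encapsulation code: the function name `esp_layout` and the text labels are my own, and the optional integrity check value that can follow the ESP trailer is left out to keep things simple.

```python
def esp_layout(mode):
    """Return (cleartext_parts, encrypted_parts) in on-the-wire order.

    Illustrative labels only; no real packet bytes are built here.
    """
    if mode == "transport":
        # The ESP header is inserted right after the original IP header.
        cleartext = ["original IP header", "ESP header"]
        encrypted = ["upper-layer data (e.g. TCP or UDP)", "ESP trailer"]
    elif mode == "tunnel":
        # The whole original datagram rides inside the encrypted part,
        # wrapped in a new datagram with a cleartext IP header.
        cleartext = ["new outer IP header", "ESP header"]
        encrypted = ["original IP header", "upper-layer data (e.g. TCP or UDP)",
                     "ESP trailer"]
    else:
        raise ValueError("mode must be 'transport' or 'tunnel'")
    return cleartext, encrypted


for mode in ("transport", "tunnel"):
    cleartext, encrypted = esp_layout(mode)
    print(mode, "| cleartext:", cleartext, "| encrypted:", encrypted)
```

The thing to notice is where the original IP header ends up: in the clear in transport mode, and hidden in the encrypted part in tunnel mode.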

Behold the ESP packet!

We will now take a look at an actual ESP packet and go through some of its metrics. I will also comment on why this type of traffic is so hard to analyze, and what that means for network security. Please note that my comments on the packet appear directly beneath it.

00:00:03.831546 192.168.1.100 > 192.168.1.200: ESP(spi=0x14579c09,seq=0x4926) (ttl 243, id 9712, len 1072)
0x0000   4500 0430 25f0 0000 f332 94e8 c0a8 0164        E..0%....2.....d
0x0010   c0a8 01c8 1457 9c09 0000 4926 67f3 2e95        .....W....I&g...
0x0020   6804 f49a a7e6 e6c5 4fd8 7b7a c2b0 1575        h.......O.{z...u
0x0030   dbdd a425 2d73 9565 0b13 0273 53dc c6b3        ...%-s.e...sS...
0x0040   9301 eb2b 3d29 f85e 2b81 799c ec07 1e80        ...+=).^+.y.....
0x0050   08fb cf16 9cea 3263 3d46 55f6 f070 a6f0        ......2c=FU..p..
0x0060   4029 0453 4707 19cc 0212 5d33 36fa 134a        @).SG.....]36..J
0x0070   d640 690c 01f6 ac9c 3818 1da5 becb 2baa        .@i.....8.....+.







I will very quickly cover the metrics that I have mentioned before in other articles. From left to right we have our timestamp, the source IP address, and the destination IP address. Note that no ports are shown: ESP rides directly on top of IP and has no port numbers of its own, and any TCP or UDP ports belonging to the original traffic are hidden inside the encrypted payload. After that we are told, via the "ESP" seen above, that this is an ESP packet. Quite nice! Next we have "spi" (Security Parameters Index) and the number after it. This is an arbitrary number which identifies the SA, or Security Association, for this packet. Next up is "seq" and its hex value, which is the sequence number. This value can be used to prevent replay attacks, and whether that anti-replay check is used is decided when the SA is negotiated. After this we have our normal IP header values: "ttl" for time to live, "id" for the IP ID number (used when reassembling fragments), and "len" for overall packet length.
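To see where those values come from, here is a minimal Python sketch that decodes the first 28 bytes of the dump above by hand: the 20-byte IPv4 header followed by the 8-byte ESP header. The function name `parse_esp_packet` and the dictionary keys are my own choices, and the sketch assumes a plain IPv4 packet carrying ESP.

```python
import struct

# First 28 bytes of the dump above: a 20-byte IPv4 header followed by
# the 8-byte ESP header (SPI and sequence number).
HEX = (
    "4500 0430 25f0 0000 f332 94e8 c0a8 0164 "
    "c0a8 01c8 1457 9c09 0000 4926"
)


def parse_esp_packet(raw):
    """Decode the cleartext IPv4 and ESP header fields of a raw packet."""
    (ver_ihl, _tos, total_len, ip_id, _frag,
     ttl, proto, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    if proto != 50:
        raise ValueError("not an ESP packet (IP protocol %d)" % proto)
    ip_header_len = (ver_ihl & 0x0F) * 4  # IHL is counted in 32-bit words
    spi, seq = struct.unpack("!II", raw[ip_header_len:ip_header_len + 8])
    return {
        "src": ".".join(str(b) for b in src),
        "dst": ".".join(str(b) for b in dst),
        "proto": proto,
        "spi": spi,
        "seq": seq,
        "ttl": ttl,
        "id": ip_id,
        "len": total_len,
    }


fields = parse_esp_packet(bytes.fromhex(HEX))
print("spi=0x%08x seq=0x%x ttl=%d id=%d len=%d" % (
    fields["spi"], fields["seq"], fields["ttl"], fields["id"], fields["len"]))
```

Everything decoded here sits in front of the ESP payload, which is why a sniffer can display it without holding any keys.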

Before I forget, I should mention that I have truncated the above packet. Only the first 128 bytes are shown, so if you count up the bytes and compare them to the overall packet length of 1072 you will notice a difference. The ESP payload, which carries the encrypted data, starts right after the 20-byte IP header and the 8-byte ESP header (the SPI and sequence number). That puts it at offset 0x001c in the dump, beginning with the bytes 67f3 2e95. You will also notice that there is no discernible information in the ASCII content of the packet. That is because the original datagram has been encapsulated and encrypted within ESP. Due to this it is very difficult to do any type of meaningful analysis on this type of traffic. About all you can realistically do is statistical analysis. By that I mean noticing, for example, a sudden upsurge in ESP traffic on a network where you normally see very little. This would be an obvious indicator that there may be a problem. ESP itself is not immune to attack, and that is why certain options are chosen when the SA for the connection is negotiated.
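As a rough idea of what such statistical analysis could look like, the sketch below flags any time interval whose ESP packet count is far above the average of the intervals before it. The function name `esp_upsurge`, the factor of 5, the minimum history of 3 intervals, and the sample counts are all made up for illustration; a real deployment would need thresholds tuned to its own traffic.

```python
from statistics import mean


def esp_upsurge(counts, factor=5.0, min_history=3):
    """Return indices of intervals whose ESP packet count exceeds
    `factor` times the mean of all earlier intervals.

    `counts` holds the number of ESP packets seen per interval.
    """
    flagged = []
    for i in range(min_history, len(counts)):
        baseline = mean(counts[:i])
        # Use a floor of 1 so a silent baseline does not flag every packet.
        if counts[i] > factor * max(baseline, 1):
            flagged.append(i)
    return flagged


# Invented per-minute ESP packet counts: quiet, then a sudden burst.
sample_counts = [4, 6, 5, 5, 120, 7]
print(esp_upsurge(sample_counts))
```

Note that nothing here looks inside the packets. It only counts them, which is about as far as you can go without the keys.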

The SA, or security association, is established using IKE, or Internet Key Exchange. It should be noted that an SA is good for only one direction, i.e. from the sender to the receiver. If a VPN connection is used, then two SAs are required, one for each direction. An SA is identified by three values: the SPI (explained above), the destination address, and the security protocol, i.e. 50 for ESP or 51 for AH, as noted in the protocol field of the IP header.
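One way to picture this is as a lookup table keyed on those three values. The Python sketch below is a simplified illustration and not how any particular IPSec implementation stores its SAs. The names `SAKey`, `sa_table`, and `lookup_sa` are my own, and the second SPI is invented; only the first entry reuses the SPI and addresses from the packet shown earlier.

```python
from typing import NamedTuple

ESP = 50  # IP protocol number for ESP
AH = 51   # IP protocol number for AH


class SAKey(NamedTuple):
    spi: int
    dst: str
    proto: int


# One SA per direction: a two-way VPN connection needs two entries.
sa_table = {
    SAKey(0x14579C09, "192.168.1.200", ESP): "SA for 192.168.1.100 -> 192.168.1.200",
    # Hypothetical SPI for the return direction.
    SAKey(0x0BADCAFE, "192.168.1.100", ESP): "SA for 192.168.1.200 -> 192.168.1.100",
}


def lookup_sa(spi, dst, proto):
    """Return the SA matching the (SPI, destination, protocol) triple, or None."""
    return sa_table.get(SAKey(spi, dst, proto))


print(lookup_sa(0x14579C09, "192.168.1.200", ESP))
print(lookup_sa(0x14579C09, "192.168.1.100", ESP))  # wrong direction, no match
```

The second lookup fails because the same SPI paired with the opposite destination address is a different key, which is exactly why each direction needs its own SA.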

We will wrap up the article at this point. So far we have covered what ESP is and how it works at a high level. We also noted why it is almost impossible to do fruitful analysis of ESP packets. Furthermore, we covered what an SA is and how it is set up through IKE. In the second part of this article series on IPSec we will take a deeper look at IKE to see how it goes about setting up ESP and AH connections. There are various ways that IKE can go about this, and we will learn about them. Stay tuned for part II!
