Obfuscated Shellcode, the Wolf in Sheep’s Clothing (Part 1)

If you missed the other articles in this series please read:

A wolf in sheep’s clothing

There are many threats out there today that concern the network security analyst. Some of them can be mitigated to a certain extent through the use of various hardware and software solutions. A good example is how you defend against that ever-present pest, the distributed denial of service attack. Several vendors have come up with good hardware solutions for this network attack, much as other vendors have sold enterprise-class firewalls to help protect the corporate network. What these two examples have in common is that the solution is provided by a third party. These solutions are easily implemented with a little vendor-side training for the in-house network security staff.

This brings me to a theme that I have written about before: the training of the network security analyst. Without training, your network security staff, whose mandate is the protection of your corporate cyber assets, will be woefully unprepared to carry out their duties. One very good example of how a lack of training can affect you is the subject of this three-part article series: just what is shellcode obfuscation, and how does it impact the network security analyst at your workplace who may not be aware of this threat?

So what’s your point?

The point of writing about shellcode obfuscation, and how it could impact a poorly trained analyst, is that this is a very high-level threat. It is one that typically results in system-level access, or root, depending on your operating system. Most of us who are involved in computer security realize that much of the high-level damage is caused by buffer overflows. What happens is that a program such as Apache or IIS may have poor input validation in part of its code. Because of this lack of input validation, the program does not check how much data is written into a specific buffer in memory. The problem in essence is that this buffer has only a certain amount of memory assigned to it. The attacker, having noticed (for example by disassembling the code) that there is no input validation, simply writes more data than the buffer can hold. The excess spills over into adjacent memory, which can include control data such as a saved return address. From there the exploit does its business. Very simply put, that is what happens.
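To make the spill-over concrete, here is a minimal sketch in Python that models memory as one flat byte array. It is a toy, not how Apache or IIS actually lay out memory: the 16-byte buffer, the 4 bytes of "control data" behind it, and the function names are all assumptions made up for illustration.

```python
# Toy model of process memory: a 16-byte buffer followed directly by
# 4 bytes standing in for control data (for example a saved return address).
BUF_SIZE = 16
CONTROL_SIZE = 4


def new_memory() -> bytearray:
    """Return fresh toy memory with a recognisable placeholder control value."""
    memory = bytearray(BUF_SIZE + CONTROL_SIZE)
    memory[BUF_SIZE:] = b"\xde\xad\xbe\xef"
    return memory


def unchecked_copy(memory: bytearray, data: bytes) -> None:
    """Copy byte by byte with no length check, like the flawed program."""
    for i, value in enumerate(data):
        memory[i] = value


def checked_copy(memory: bytearray, data: bytes) -> None:
    """Copy only after validating that the input fits in the buffer."""
    if len(data) > BUF_SIZE:
        raise ValueError("input longer than buffer")
    for i, value in enumerate(data):
        memory[i] = value


mem = new_memory()
unchecked_copy(mem, b"A" * 20)       # 20 bytes into a 16-byte buffer
print(mem[BUF_SIZE:].hex())          # control data is now 41414141 ("AAAA")
```

The same 20-byte input passed to checked_copy raises an error instead of touching the bytes behind the buffer, which is all that input validation amounts to in this toy.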

With the above in mind we ask ourselves, “Well, how does the attacker overflow a function’s buffer?” This is handled by the attacker’s exploit code, which has several components. Typically the exploit code is written in C or C++, as most heavyweight applications (such as web servers and operating systems) are written in those languages. Following the C or C++ component is where you will find the shellcode, which is written in assembly (ASM). There are several reasons for this. Most importantly, code written in ASM is smaller than the equivalent written in C. It is also faster to execute. What the ASM portion of the exploit contains is the end result the exploit developer wants: a cmd.exe or a root shell. Having obtained this shell, the exploit developer has free rein over the computer.

What is used to overflow the buffer?

You may remember that the letter “A” was used as the buffer overrun character in Code Red. The vast majority of buffer overrun characters are much like the aforementioned A, but any other character could be used as well. For example, another version of Code Red used the N character for its overrun. Those of you who monitor an intrusion detection system, though, may be more familiar with the 0x90 character.

(Whenever you see 0x, you should know that the characters that follow are in the hexadecimal numbering system.) This 0x90 translates to the x86 NOP instruction, aka “no operation”, and it means literally that when the instruction is executed by the computer’s CPU. When the CPU processes this instruction it does nothing and moves on until it reaches an instruction that has it do something. These characters will show up as part of the hex payload in a packet. In the packet example below, the characters after the 0x0000 offset, i.e. 4500 0028, are in the aforementioned hexadecimal numbering system. It is in this portion of the packet that you would notice a repeating character such as the A or N for Code Red.

01/24/2005 00:01:52.750789 xxx.xxx.xxx.xxx.45648 > xxx.xxx.xxx.xxx.25: . [tcp sum ok] ack 2196634982 win 17520 (DF) (ttl 120, id 46606, len 40)
0x0000   4500 0028 b60e 4000 7806 a451 xxxx xxxx        E..(..@.x..Q....
0x0010   xxxx xxxx b250 0019 24d7 9ce9 82ed fd66        .....P..$......f
0x0020   5010 4470 ce75 0000 0000 0000 0000                P.Dp.u........

A repeating hexadecimal character like this is easy to see and, more importantly for intrusion detection system vendors, something to write a signature for. Take the above example and, for clarity’s sake, make it look like this:

01/24/2005 00:01:52.750789 xxx.xxx.xxx.xxx.45648 > xxx.xxx.xxx.xxx.25: . [tcp sum ok] ack 2196634982 win 17520 (DF) (ttl 120, id 46606, len 40)
0x0000   4500 0028 b60e 4000 7806 a451 xxxx xxxx        E..(..@.x..Q....
0x0010   xxxx xxxx b250 0019 24d7 9ce9 82ed fd66        .....P..$......f
0x0020  nnnn nnnn nnnn nnnn nnnn nnnn nnnn nnnn               P.Dp.u........

Please note that these are not PSH/ACK packets with a valid payload, but simply an example.
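If you want to convince yourself that the hex column really is the packet, the unredacted bytes of the first dump can be decoded by hand. The sketch below uses Python's struct module on the first 12 bytes of the IP header and on the 20-byte TCP header. The redacted address bytes are left out, and the variable names are my own choice for illustration.

```python
import struct

# First 12 bytes of the IPv4 header (the redacted addresses follow these).
ip_head = bytes.fromhex("4500 0028 b60e 4000 7806 a451")
# The 20-byte TCP header that comes after the redacted addresses.
tcp_head = bytes.fromhex("b250 0019 24d7 9ce9 82ed fd66 5010 4470 ce75 0000")

# "!" means network byte order: B = 1 byte, H = 2 bytes, I = 4 bytes.
(ver_ihl, tos, total_len, ident,
 flags_frag, ttl, proto, ip_sum) = struct.unpack("!BBHHHBBH", ip_head)
(src_port, dst_port, seq, ack,
 off_flags, window, tcp_sum, urgent) = struct.unpack("!HHIIHHHH", tcp_head)

print("IP version:", ver_ihl >> 4)
print("total length:", total_len)
print("id:", ident)
print("DF set:", bool(flags_frag & 0x4000))
print("ttl:", ttl)
print("protocol:", proto)            # 6 is TCP
print("ports:", src_port, ">", dst_port)
print("ack number:", ack)
print("ACK flag set:", bool(off_flags & 0x0010))
print("window:", window)
```

The decoded values line up with the summary line of the dump: len 40, id 46606, ttl 120, DF, source port 45648, destination port 25, ack 2196634982 and win 17520.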

The above series of ‘n’s is easy to see, and in this example represents the character used for the overflow. It is this character that would overflow the buffer in the function that has improper input validation. So the IDS vendors simply wrote a signature looking for these common hexadecimal characters repeating more than a certain number of times. Presto, your IDS goes off saying it has detected a buffer overflow attempt. (Normally, though, there would be other criteria for a packet to meet, such as a destination port number.)
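The idea behind such a signature can be sketched in a few lines of Python. This is not how any particular IDS implements it. The byte set, the threshold of 32 and the watched ports are hypothetical values picked for the example.

```python
SUSPECT_BYTES = {0x41, 0x4E, 0x90}   # 'A', 'N' and the x86 NOP
MIN_RUN = 32                         # hypothetical run-length threshold
WATCHED_PORTS = {25, 80}             # hypothetical destination ports


def longest_run_of(payload: bytes, value: int) -> int:
    """Length of the longest unbroken run of `value` in the payload."""
    best = current = 0
    for b in payload:
        current = current + 1 if b == value else 0
        best = max(best, current)
    return best


def naive_signature(payload: bytes, dst_port: int) -> bool:
    """Fire when a watched port receives a long run of a suspect byte."""
    if dst_port not in WATCHED_PORTS:
        return False
    return any(longest_run_of(payload, v) >= MIN_RUN for v in SUSPECT_BYTES)


print(naive_signature(b"GET /default.ida?" + b"N" * 200, 80))   # True
print(naive_signature(b"HELO mail.example.com\r\n", 25))        # False
```

Tying the byte pattern to a destination port is what keeps a rule like this from firing on every file transfer that happens to contain a long row of the same character.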

Let’s put it all together

So now we know that the vast majority of exploit code has what is called a NOP sled. The NOP sled gives the exploit coder a fudge factor: EIP only has to point somewhere inside the sled, and the CPU will execute the NOPs until it hits the exploit code. Seeing that the majority of exploit code has a NOP sled made up mainly of the hex character 0x90, it is an easy task for the intrusion detection system vendors to build a signature. This repeating hexadecimal string plus a destination port number is often used to form a signature.
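The fudge factor is easy to picture with a toy model. The sketch below is not an emulator and contains no real shellcode: the "payload" is four placeholder 0xCC bytes, and the slide function is a made-up stand-in for the CPU stepping over NOPs.

```python
NOP = 0x90
SLED_LEN = 64
PAYLOAD = b"\xcc\xcc\xcc\xcc"   # placeholder bytes (x86 INT3), not real shellcode

# Sled first, payload right behind it.
buffer = bytes([NOP]) * SLED_LEN + PAYLOAD


def slide(buf: bytes, start: int) -> int:
    """Step over NOPs from `start`; return the offset of the first non-NOP."""
    pos = start
    while pos < len(buf) and buf[pos] == NOP:
        pos += 1
    return pos


# Try every possible landing spot inside the sled.
landings = {slide(buffer, guess) for guess in range(SLED_LEN)}
print(landings)   # {64}
```

Every one of the 64 starting points ends up at offset 64, the first byte of the payload. That is why the exploit coder does not need to know the exact address, only one that falls somewhere in the sled.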


What happens if there is no longer a NOP sled for the vendor’s signature to fire on? That is where shellcode obfuscation comes into play. Exploit developers, being the very clever bunch that they are, quickly figured out that IDS vendors were making signatures based on this NOP sled, and they got to thinking of a way to defeat this very effective countermeasure. The problem was that they needed another instruction that did nothing of consequence to the computer as the CPU executed it. In essence they wanted another NOP instruction. As it turns out, through testing and research these exploit developers came up with a whole raft of other idempotent instructions. (Idempotent here means an instruction or opcode that will not arbitrarily affect the victim computer, i.e. crash it.) At last count there were about sixty or so other idempotent instructions that could be used in place of NOP, aka 0x90. Please note that I am referring only to the IDS vendor writing a signature based on the NOP sled, and not on, say, /bin/sh in the ASCII content. There are other signatures that can be developed based on a packet’s ASCII content.

This wraps up the first part of this series. In part two I will show you how Snort (open source IDS extraordinaire) sees an exploit with a NOP sled. Till then!
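As a parting illustration of why replacing 0x90 matters to the defender, here is a rough sketch comparing two detectors. The opcode table is a small hand-picked sample of single-byte 32-bit x86 instructions, nowhere near the sixty or so mentioned above, and both detector functions are hypothetical names made up for this example.

```python
import random

# A few single-byte x86 (32-bit) instructions that can stand in for NOP
# in a sled. Illustrative sample only, far from complete.
NOP_LIKE = {
    0x90: "nop",
    0x40: "inc eax",
    0x41: "inc ecx",
    0x42: "inc edx",
    0x43: "inc ebx",
    0xF8: "clc",
    0xF9: "stc",
    0xFC: "cld",
}


def longest_run_of(payload: bytes, value: int) -> int:
    """Longest unbroken run of one specific byte (the classic signature)."""
    best = current = 0
    for b in payload:
        current = current + 1 if b == value else 0
        best = max(best, current)
    return best


def longest_nop_like_run(payload: bytes) -> int:
    """Longest unbroken run of any byte found in the NOP_LIKE table."""
    best = current = 0
    for b in payload:
        current = current + 1 if b in NOP_LIKE else 0
        best = max(best, current)
    return best


rng = random.Random(1)
substitutes = [b for b in NOP_LIKE if b != 0x90]
mixed_sled = bytes(rng.choice(substitutes) for _ in range(64))

print(longest_run_of(mixed_sled, 0x90))    # 0: the 0x90 signature sees nothing
print(longest_nop_like_run(mixed_sled))    # 64: the table-based check sees it all
```

Note the catch with the table-based check: 0x41, 0x42 and 0x43 are also the ASCII letters A, B and C, so ordinary text such as "ABCABC" counts as a run too. Widening the signature trades missed sleds for false alarms.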
