In a world where antivirus companies simply can't keep up with the number of malware samples submitted to them in a given day, it is becoming crucial that organizations have their own malware analysis capabilities.
Defining Lab Scope
The scope of the malware analysis lab can be defined by examining the processes that will occur within it. There are really two main tasks that occur within a malware analysis lab: behavioral analysis and code analysis.
Behavioral analysis involves executing a malware specimen in a controlled environment. Within this environment you should have all of the tools necessary to simulate the services the malware will try to interact with. This might include things such as a simple honeypot, an IRC server, or a web server. In addition to this, you should have tools in place to monitor the actions the malware takes when interacting with these services. This means file system, registry, and network monitoring software.
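As a rough illustration of the file system monitoring described above, the sketch below records a hash of every file under a directory before and after a specimen runs, then reports what was created, deleted, or modified. The function names are hypothetical, and dedicated monitoring tools capture far more detail (registry changes, process activity, timing); this only shows the before-and-after comparison idea.

```python
import hashlib
import os


def snapshot_tree(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                with open(full_path, "rb") as handle:
                    digest = hashlib.sha256(handle.read()).hexdigest()
            except OSError:
                # Locked or vanished files are skipped rather than aborting the scan.
                continue
            state[os.path.relpath(full_path, root)] = digest
    return state


def diff_snapshots(before, after):
    """Return sorted lists of created, deleted, and modified relative paths."""
    created = sorted(set(after) - set(before))
    deleted = sorted(set(before) - set(after))
    modified = sorted(
        path for path in set(before) & set(after) if before[path] != after[path]
    )
    return created, deleted, modified
```

In practice you would call snapshot_tree on a directory of interest before executing the specimen, call it again afterwards, and pass both results to diff_snapshots.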
Code analysis involves disassembling and reverse engineering the code of the malware. This can be done in a static state where the code is analyzed without being executed, or in a dynamic state where the code is examined as it is being processed by the system.
These two tasks are very different, but both are essential for performing a thorough analysis. If you have more of a systems administration background you will most likely spend a great deal more time performing behavioral analysis, whereas a programming background might tilt you towards spending more time doing code analysis. Your malware analysis lab will typically reflect your preferred analysis type.
Operating System Considerations
Malware behaves drastically differently depending on the operating system it is executed on. Some malware may only function on Windows Server-based operating systems, whereas other malware may only work on specific Linux kernel versions. Malware that is installed on a Windows Vista host may crash the system completely, while the same malware installed on a Windows 7 system might join a botnet command and control channel. It's a necessity to have a variety of operating systems available when analyzing malware.
At a minimum you should have access to all of the major Windows operating systems and one of the more popular modern Linux distributions to serve as infected hosts. These are the machines that you will actually install behavioral and code analysis tools onto so that the malware can be executed and examined.
In addition to the infected hosts you will need at least one machine configured with various application servers so that the infected machines can interact with it. For instance, a malware specimen may attempt to communicate with another host via IRC, so you will need a host running an IRC server that you can redirect the infected host to so that the communication can be examined. For this, I recommend Remnux. Remnux is a customized Ubuntu-based distribution released by Lenny Zeltser for his SANS Reverse Engineering Malware course. You can read more about Remnux or download it from Lenny's website.
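To make the idea of a services host concrete, here is a minimal sketch of a TCP listener that simply records every line a client sends. It is not an IRC server and does not answer the client, so a specimen that waits for a server reply will stall; a full services distribution handles that for you. The class and function names are hypothetical, and the default port of 6667 is only the conventional IRC port, not something the specimen is guaranteed to use.

```python
import socketserver
import threading


class LoggingHandler(socketserver.StreamRequestHandler):
    """Record each line a client sends; never reply with real service data."""

    def handle(self):
        # Iterating the read side yields one line at a time until the client disconnects.
        for raw_line in self.rfile:
            text = raw_line.decode("utf-8", errors="replace").rstrip("\r\n")
            self.server.captured.append((self.client_address[0], text))


def start_listener(host="127.0.0.1", port=6667):
    """Start a background TCP listener and return the server object."""
    server = socketserver.ThreadingTCPServer((host, port), LoggingHandler)
    server.captured = []  # (client_ip, line) tuples, in arrival order
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

On a lab services host you would bind to the address of the isolated lab interface instead of 127.0.0.1, redirect the infected host to that address, and then inspect server.captured to see what the specimen tried to say.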
Extra care should be taken regarding the location of malware analysis hosts on your network. Worms and other types of malware can be self-replicating, so it's highly likely that simply running an executable on a networked machine can lead to other hosts on that network being compromised if they aren't patched for the exploited vulnerability or if a 0-day exploit is being used.
Isolating your malware analysis hosts from other computers in the network is often not enough. Typically, you should isolate them from the Internet as well. The first reason for this is that some malware may be built so that it launches a denial of service attack against another host on the Internet when executed. The last thing you want is an angry call from another organization claiming that a host on your network is attacking theirs with a barrage of packets.
An additional argument for isolating lab machines from the Internet is to prevent the malware author from knowing that you exist. It's entirely likely that the malware you are executing is configured to "phone home" to a command and control server that lets the author know you've executed it. At this point, the attacker could begin executing commands on your lab system in an attempt to disable it or thwart your analysis.
With all of those concerns taken into consideration, malware analysis hosts should be completely isolated from the network. This is best achieved by air gapping the lab systems such that they aren't plugged into any network at all.
Physical vs. Virtual Labs
In some cases you may not have the funding or availability to purchase multiple workstations to serve as lab hosts. If this is the case, virtualization software can really save the day. The use of virtual machines for lab hosts also has a few other advantages which are often overlooked.
Virtualization software allows you to save the state of a virtual machine as it runs so that you can revert back to it when necessary. Snapshots are your bread and butter when it comes to malware analysis because they allow you to revert a host back to a clean state or a prior state of infection. Using snapshots you can have a base virtual machine that contains an operating system loaded with behavioral and code analysis tools, infect that machine with malware, and save a snapshot so that you can load the initial infected state at will. When you are done examining the malware specimen you can choose to save or discard that snapshot and revert back to a clean image. It's common to have at least a couple of snapshots for each malware specimen examined, with more complex specimens resulting in the creation of dozens of snapshots.
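If you use VirtualBox, the snapshot workflow above can be scripted through its VBoxManage command-line tool. The sketch below only builds the command lines for taking and restoring a snapshot; the helper names and the VM and snapshot names are hypothetical, and it assumes VBoxManage is on your PATH. Other platforms have their own tooling with different syntax.

```python
import subprocess


def take_snapshot_cmd(vm_name, snapshot_name):
    """Command that saves the VM's current state under the given snapshot name."""
    return ["VBoxManage", "snapshot", vm_name, "take", snapshot_name]


def restore_snapshot_cmd(vm_name, snapshot_name):
    """Command that reverts the VM to a named snapshot (the VM must not be running)."""
    return ["VBoxManage", "snapshot", vm_name, "restore", snapshot_name]


def run(cmd):
    """Execute a prepared command, raising CalledProcessError if it fails."""
    subprocess.run(cmd, check=True)
```

A typical session might run take_snapshot_cmd("winxp-lab", "clean") once the tools are installed, take another snapshot right after infection, and run restore_snapshot_cmd("winxp-lab", "clean") when the analysis is finished.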
Rapid OS Deployment
Using virtualization software allows you to build and store a library of virtual machines so that accessing any operating system you desire is only a couple of clicks away. Using this strategy in combination with snapshot technology should mean that you are never caught without the operating environment you need to perform a thorough analysis.
The more common virtualization platforms such as VMware Workstation or VirtualBox provide advanced networking options so that you can create segmented or isolated networks with their own address ranges and DHCP servers. This makes isolating infected hosts a matter of a few clicks of the mouse.
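Before creating several isolated networks it can help to plan their address ranges so that none overlap. The sketch below is one way to do that with Python's ipaddress module. The parent range 10.66.0.0/16, the /24 segment size, and the convention of reserving the first address for a services host and starting DHCP at the tenth are all assumptions chosen for illustration; the function expects segments of /24 or larger.

```python
import ipaddress


def plan_lab_networks(names, parent="10.66.0.0/16", prefix=24):
    """Assign each named lab segment its own non-overlapping subnet."""
    subnets = ipaddress.ip_network(parent).subnets(new_prefix=prefix)
    plan = {}
    for name, subnet in zip(names, subnets):
        hosts = list(subnet.hosts())
        plan[name] = {
            "network": str(subnet),
            # First usable address, reserved here for the services machine.
            "services_host": str(hosts[0]),
            # DHCP pool runs from the tenth usable address to the last one.
            "dhcp_range": (str(hosts[9]), str(hosts[-1])),
        }
    return plan
```

The resulting values can then be entered into the network settings of whichever virtualization platform you use.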
Although this benefit is commonly overlooked, there is real value in having standardized hardware when performing malware analysis. Because every virtual machine presents the same virtual hardware to its guest, your results are more likely to be repeatable and consistent.
As malware authors become smarter they are beginning to build in routines to detect whether their binaries are running within a virtual environment. Once a virtual environment is detected, the malware may change how it functions or not function at all. That said, there are a few things you can do to trick malware into thinking that it's running on a physical host rather than a virtual machine.
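One simple check of this kind looks at the network adapter's MAC address, since virtualization vendors assign addresses from their own registered prefixes by default. The sketch below shows that check from the analyst's side so you can see whether a lab VM would be given away by it. The function name is hypothetical and the prefix table is a small, non-exhaustive set of well-known VMware and VirtualBox prefixes; real specimens often combine this with many other checks.

```python
# Default MAC address prefixes (OUIs) used by common virtualization platforms.
VIRTUAL_NIC_PREFIXES = {
    "08:00:27": "VirtualBox",
    "00:05:69": "VMware",
    "00:0C:29": "VMware",
    "00:1C:14": "VMware",
    "00:50:56": "VMware",
}


def virtual_nic_vendor(mac):
    """Return the platform name if the MAC uses a known virtual prefix, else None."""
    normalized = mac.strip().upper().replace("-", ":")
    return VIRTUAL_NIC_PREFIXES.get(normalized[:8])
```

If the function returns a vendor for one of your lab machines, changing the adapter's MAC address in the virtual machine's settings removes that particular tell.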
Taking everything we've discussed thus far into consideration, it becomes fairly easy to draw out the architecture of a simple malware analysis lab.
Figure 1: Sample Architecture
In the figure above I've outlined a representation of a simple malware analysis lab. In this architecture, only a single physical host is used, greatly reducing hardware costs. The operating system of this host doesn't particularly matter, although I prefer using a Linux host because it is less susceptible to a great deal of malware. This physical host is running either VMware Workstation or Sun VirtualBox to host the virtual machines that will make up the lab.
Logically, each of the virtual operating systems can be loaded into one of many isolated virtual networks along with a Remnux virtual machine. These VMs are isolated into their own private network so that they can communicate with each other and nothing else.
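With VirtualBox, one way to get this layout is its internal network mode, in which VMs attached to the same named network can reach each other but not the host or the outside world. The sketch below builds the VBoxManage command that attaches a VM's adapter to such a network. The helper name and the VM and network names are hypothetical, and the VM needs to be powered off when the command is run.

```python
def attach_to_lab_network(vm_name, network_name, nic=1):
    """Command that attaches the VM's given adapter to a named internal network."""
    return [
        "VBoxManage", "modifyvm", vm_name,
        f"--nic{nic}", "intnet",
        f"--intnet{nic}", network_name,
    ]
```

Running the command for both the infected VM and the Remnux VM with the same network name, for example "lab-winxp", places the pair on their own segment.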
This architecture is very basic, but it should fit the needs of a wide variety of folks.
Although it seems intimidating, setting up a malware analysis lab is actually quite simple and can require only a minimal amount of hardware. If you have an interest in learning more about malware then the best thing you can do is set up a lab of your own and start doing analysis.