The problem of sprawl
Sprawl is an ugly word. Just the sound of it turns my stomach over. Merriam-Webster's dictionary defines it as "to lie or sit with arms and legs spread out" or more generally "to spread or develop irregularly or without restraint" as in suburban sprawl, the proliferation of low-density development made possible by the consumer's use of automobiles for transportation.
Sprawl is a problem for IT as well. Server sprawl refers to the multiplication of physical servers in a server room as new servers are deployed to provide new applications and services the business needs. One way of getting control of server sprawl is server virtualization, where you perform a physical-to-virtual (P2V) conversion of each physical server's workload and then migrate these workloads as virtual machines onto a virtualization host. You can find some good articles on this subject in the virtualization section.
But in this article I want to talk about storage sprawl, another type of sprawl that often plagues IT departments in organizations of all sizes. Here's what I mean when I use this term:
- Storage sprawl is the scenario where each server in your organization has its own direct-attached storage (DAS).
As the three figures below illustrate, DAS can be implemented in three ways:
- Internal DAS - This is when the storage devices (typically HDDs but increasingly SSDs as well in certain scenarios) and storage controllers (typically RAID controllers when referring to servers) are installed inside the server system hardware (i.e. inside the box). Small businesses that have one or two servers often use this approach for their server storage, and there's usually no real problem with doing this.
- External DAS - This is when the HDDs are in an external enclosure and not inside the server. A cable connects the server's RAID controller with the external enclosure, and a storage communications protocol such as SCSI or eSATA is used to transfer information between the server and the enclosure. This type of scenario is the one where storage sprawl commonly begins to become a problem, with the number of external DAS enclosures proliferating as quickly as the number of servers in the organization.
- External DAS with built-in RAID controller - This is when both the HDDs and the RAID controller are located in the external enclosure. A special card called a Host Bus Adapter (HBA) installed in the server connects via a cable with the enclosure to enable data to be transferred between them. A storage protocol such as SCSI, eSATA, or SAS is used to enable the server and DAS enclosure to communicate with each other. This type of DAS device is usually more expensive and therefore tends to be less common. More common in large enterprises is the big sister of this type of solution, that is, the storage area network (SAN).
Figure 1: Internal DAS
Figure 2: External DAS
Figure 3: External DAS with built-in RAID controller
The main problem with DAS devices, even the expensive ones that have built-in RAID controllers, is that they are designed for one-to-one use with servers. In other words, each server in your environment will need its own separate external DAS device.
The consequence is that you'll eventually get storage sprawl if you stick with DAS as a solution for the growing storage needs of your business.
So how do you tackle the problem of storage sprawl? Here's the key:
- The solution to storage sprawl is storage consolidation.
In other words, you want to take the data stored on many DAS devices and move it to a single storage device. Just as there are several ways of implementing DAS, there are also several ways of allowing multiple servers to access and store data on a single storage device. For example:
- Network-attached storage (NAS) - This is when you have an enclosure containing multiple hard drives that is connected to your network. Multiple servers can then communicate with the NAS device using a file-sharing protocol such as SMB or NFS. NAS devices can be cheap, so it's easy for businesses to buy more of them as their storage needs grow, but the result is usually just a different type of storage sprawl. In fact, NAS sprawl can be worse than DAS sprawl because you might end up with two or three NAS devices for each server on your network (as opposed to only one DAS device for each server). NAS devices also compete with applications and services for available network bandwidth.
- Storage area network (SAN) - Sometimes also called a storage array, a SAN consists of drives, controllers and switches that connect to HBAs on servers using a block transfer protocol such as Fibre Channel or iSCSI. SANs are usually the best solution if you want to prevent or address storage sprawl because they allow all of your server storage to be consolidated on a single storage device. This makes provisioning, managing and backing up storage much easier than dealing with a multitude of different DAS or NAS devices.
Figure 4: Network-attached storage (NAS).
Figure 5: Storage area network (SAN).
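To make the difference concrete, here is a minimal Python sketch that counts storage devices under each of the approaches described above. It assumes one DAS device per server, up to three NAS devices per server (the worst case mentioned above), and a single SAN. The function name and the server count are hypothetical and chosen only for illustration.

```python
def storage_device_counts(servers: int, nas_per_server: int = 3) -> dict[str, int]:
    """Rough device counts for each storage approach (illustrative only)."""
    if servers < 1:
        raise ValueError("servers must be at least 1")
    return {
        "DAS": servers,                   # one-to-one: each server needs its own device
        "NAS": servers * nas_per_server,  # assumed worst case of NAS sprawl
        "SAN": 1,                         # all server storage consolidated on one device
    }


# Hypothetical environment with 20 servers.
for approach, count in storage_device_counts(servers=20).items():
    print(f"{approach}: {count} device(s)")
```

The point of the sketch is simply that the DAS and NAS counts grow with the number of servers, while the SAN count does not.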
The cost issue
Some businesses balk at the idea of purchasing a SAN because of the high up-front cost involved. However, there are some very good reasons why a SAN might be a cost-effective solution to the growing storage needs of your business:
- While your initial investment in a SAN may be high because of the cost of the chassis and HBAs, as you add more and more drives to the SAN you'll quickly reach a crossover point where the cost per GB is roughly the same as it would be for high-quality DAS or NAS storage devices. In other words, you'll soon reach the point where a SAN is easily justifiable on a cost-of-device basis. So if you expect your business to grow significantly, or if it's already grown to the point where managing storage is a headache and you don't want it to become a nightmare, start investigating whether a SAN might be the right solution for your company going forward.
- You will probably save on electricity costs in the long run as well if you migrate your DAS/NAS storage to SAN storage. DAS/NAS devices tend to consume more rack and floor space, have greater air conditioning (A/C) requirements, and have higher overall AC power requirements than a SAN does. You'll also have less spaghetti (cabling) to deal with, which can save you money too.
- Windows servers can boot directly from logical unit numbers (LUNs) on SANs. This means you can retire your old tower server systems that have multiple hard drives in them and use diskless blade servers instead. Again, this can save you AC and A/C costs over time once the initial investment for new hardware has been covered. And don't forget that newer server hardware is usually more efficient and powerful than old hardware.
- SANs make performing backups much easier because you can take a snapshot of the data. And if you archive your business data to a tape library, the SAN can handle the work of doing this, freeing up your servers from having their workload impacted during the backup window.
- People are the biggest cost in most IT budgets. The more efficiently your IT staff can do their job, the more money you'll save for your business. Or would you rather have your IT staff running around all the time twiddling cables to try and determine why server #17 can't access data on DAS #324 or NAS #857?
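The crossover point described in the first item above can be estimated with simple arithmetic. The sketch below is only an illustration: the function names and every price in it are made-up assumptions, not vendor figures, and it treats the DAS/NAS cost per GB as a single flat number.

```python
from typing import Optional


def san_cost_per_gb(drives: int, fixed_cost: float,
                    drive_cost: float, drive_gb: float) -> float:
    """Cost per GB of a SAN: the fixed chassis/HBA cost is spread over all drives."""
    return (fixed_cost + drives * drive_cost) / (drives * drive_gb)


def crossover_drives(das_cost_per_gb: float, fixed_cost: float,
                     drive_cost: float, drive_gb: float,
                     max_drives: int = 500) -> Optional[int]:
    """Smallest drive count at which the SAN costs no more per GB than DAS/NAS.

    Returns None if no crossover is reached within max_drives.
    """
    for drives in range(1, max_drives + 1):
        if san_cost_per_gb(drives, fixed_cost, drive_cost, drive_gb) <= das_cost_per_gb:
            return drives
    return None


# Hypothetical figures, for illustration only.
FIXED_COST = 20_000      # assumed up-front cost of chassis and HBAs
DRIVE_COST = 300         # assumed cost of one SAN drive
DRIVE_GB = 1_000         # assumed capacity of one drive, in GB
DAS_COST_PER_GB = 0.75   # assumed all-in cost per GB for DAS/NAS

n = crossover_drives(DAS_COST_PER_GB, FIXED_COST, DRIVE_COST, DRIVE_GB)
print(f"Crossover at {n} drives")
```

With these invented numbers the SAN matches the DAS/NAS cost per GB at 45 drives. If the SAN's per-drive cost per GB is not lower than the DAS/NAS figure, the function returns None because the fixed cost can never be amortized away.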
The bottom line is this: How much do you value your business data? If you value it, you'll put strategies into place that will prevent storage sprawl from happening. And if storage sprawl is already a reality in your company, you need to take steps to address this today. But before we look at what steps you might take to start getting a handle on your organization's rapidly expanding storage needs, my next article will address another common problem associated with DAS storage solutions, namely, the problem of overprovisioning.