Data Tiering and Overprovisioning
What is overprovisioning?
As I indicated in the previous article of this series, another common problem associated with Direct-Attached Storage (DAS) technologies besides storage sprawl is the issue of overprovisioning. What is overprovisioning? It's basically when you buy more than you need of something.
Storage overprovisioning is common in the IT world for a number of reasons. For example, IT departments in companies that grow slowly over time often develop an organic approach to building out IT infrastructure. Let's say your company has expanded its sales force in conjunction with a new marketing campaign and you need another server to handle the load on your content management system. What do you do? You buy another server that has enough DAS to handle the extra load, and just to be safe you make sure the DAS has more than enough drives in it to meet the needs of your growing sales force.
Or let's say the media department in your company is tasked with a new project that involves creating lots of video files. What do you do? You buy a Network Attached Storage (NAS) device for the department so they can store these files. And again, just to be safe you make sure the NAS provides more than enough storage to meet the department's needs.
You can probably see where this is headed--it's the problem of storage sprawl all over again. See my previous article in this series for a discussion of why sprawl is bad when it comes to storing business data. But there's another element in these examples, namely the "just to be safe" thing. What does this imply?
First, it implies incompetence or at least laziness on the part of the IT department. Specifically, they haven't made an effort to come up with a reliable projection of how the storage needs for the sales force or media department might grow over the lifetime of the campaign or project. Sure, projections can be unreliable, but when money is involved, overspending in one area can mean you won't have enough for other needs. So it's important for IT to develop the discipline of coming up with solid, justifiable projections when they need to procure and provision storage and other resources.
Second, and more importantly, there is the underlying motivation that drives IT to make "just to be safe" decisions such as overprovisioning. That motivation is simply fear.
In the early days of computing, data processing and storage were expensive and used mostly by the military and academia. As computing made inroads into the business world, the early adopters were from the banking and financial industry. Now, if there's anything that banks fear more than anything else, it's losing money. As a result, the idea took root that the main job of IT was to make sure that every single bit of data was always reliable and accurate. After all, there's a big difference between having $1,000,000,000.00 on your balance sheet and $0,000,000,000.00! Heads will certainly roll in the bank's IT department if this kind of error should happen, and fear of losing your job is a good motivation for playing it safe with regard to provisioning storage in such situations.
The result was to ingrain into the very genetics of the corporate IT department the idea that data is valuable and its integrity must never be compromised under any circumstances. To make sure this happens, various data integrity algorithms were invented and technologies like RAID mirroring and failover clustering were created. Sure, it costs more to store that spreadsheet on a mirrored drive, but so what? All hell could break loose if the spreadsheet were to be corrupted or lost. So "just to be safe" let's store it on a RAID 1 volume on a two-node cluster that's replicated in real time to a second cluster located at a different site...
Unfortunately once you get addicted to this "cost is no issue" thinking based on fear of the consequences of data loss, it's very hard to change. And that is the primary reason why, despite all the talk about risk management and tradeoffs and storage-as-a-commodity and so on, IT departments still have a strong tendency to overprovision storage and thereby waste their company's money.
It's the same reason I used to take two calculators into the exam room during my undergraduate Physics days--I wanted to have extras just in case one of them failed. Speaking of calculators, I was one of the early adopters of the HP-35 calculator and for those of you old enough and nostalgic enough to care, there's a terrific site you should check out: The Museum of HP Calculators at http://hpmuseum.org.
Overprovisioning and storage sprawl
Overprovisioning of enterprise storage has a close connection with storage sprawl for servers. The reason is that DAS arrays are implemented on a per-server basis. This means each DAS array has to have enough storage to handle the data needs of the server it's attached to. And how much DAS storage is enough for my server? The answer is usually that "just to be safe" I provision far more storage than my server actually needs.
You see, the problem here is that forecasting is statistical in nature, which means the larger the dataset you're working with, the more accurately you can predict future needs (assuming all other things are equal). In other words, it's difficult to accurately predict how much data storage growth a single server running a single business application might experience over time if the data storage is utilized only by that server. And it's even harder to accurately forecast data storage growth for many servers that each have their own dedicated DAS array. A multitude of small predictions like this will likely leave some DAS arrays highly underutilized (around 25% or less), some underutilized (around 50%), some well-utilized (around 75%), and some almost saturated (>90%). As a result, you'll have a lot of wasted storage space on the underutilized arrays, and you'll also be in dangerous territory with business applications running on servers whose arrays are almost saturated:
Figure 1: DAS storage can lead to wasted space on servers where arrays are underutilized and can threaten the reliability of applications on servers where arrays are almost saturated.
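To make the utilization bands above concrete, here is a minimal Python sketch that classifies a handful of DAS arrays and reports which ones are at risk. The server names, capacities, and the exact cutoffs between bands are assumptions chosen for illustration; the article only gives approximate figures (25%, 50%, 75%, >90%).

```python
from dataclasses import dataclass


@dataclass
class DasArray:
    server: str          # hypothetical server name
    capacity_tb: float   # raw capacity of the array
    used_tb: float       # space currently consumed

    @property
    def utilization(self) -> float:
        return self.used_tb / self.capacity_tb


def classify(utilization: float) -> str:
    """Map a utilization ratio to a band. Cutoffs are assumed, not from the article."""
    if utilization > 0.90:
        return "almost saturated"
    if utilization >= 0.70:
        return "well-utilized"
    if utilization >= 0.40:
        return "underutilized"
    return "highly underutilized"


def summarize(arrays):
    """Aggregate capacity and usage, and list arrays above the 90% danger line."""
    total_capacity = sum(a.capacity_tb for a in arrays)
    total_used = sum(a.used_tb for a in arrays)
    return {
        "total_capacity_tb": total_capacity,
        "unused_tb": total_capacity - total_used,
        "overall_utilization": total_used / total_capacity,
        "at_risk": [a.server for a in arrays if a.utilization > 0.90],
    }


# Illustrative inventory: four servers, each with its own 4 TB DAS array.
arrays = [
    DasArray("web01", 4.0, 1.0),
    DasArray("cms01", 4.0, 2.0),
    DasArray("db01", 4.0, 3.0),
    DasArray("media01", 4.0, 3.8),
]

for a in arrays:
    print(f"{a.server}: {a.utilization:.0%} ({classify(a.utilization)})")
print(summarize(arrays))
```

In this made-up inventory, more than a third of the purchased capacity sits idle overall, yet one server is still close to running out of space, which is the combination of waste and risk described above.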
By contrast, if all your server data storage is consolidated in a single array such as a SAN, your storage is easier to manage and you can ensure that it's well-utilized at all times. Wasted space through storage underutilization is avoided (thus saving money) and the danger of application instability resulting from storage saturation is avoided (thus avoiding alienating your customers):
Figure 2: SAN storage allows you to manage storage effectively to ensure you are well-utilizing your storage space at all times.
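One way to see why a consolidated pool needs less "just to be safe" headroom is the following rough sketch. It assumes that each server's storage growth is independent of the others and has the same standard deviation, and that the safety buffer is sized as a multiple k of that standard deviation. These are simplifying assumptions made for illustration, not figures from this article.

```python
import math


def headroom_per_server(n_servers: int, sigma_tb: float, k: float = 2.0) -> float:
    """Total headroom when every DAS array carries its own k-sigma buffer."""
    return n_servers * k * sigma_tb


def headroom_pooled(n_servers: int, sigma_tb: float, k: float = 2.0) -> float:
    """Headroom for one shared pool.

    For independent demands, the standard deviation of the total grows
    with sqrt(n) rather than n, so the shared buffer is smaller.
    """
    return k * sigma_tb * math.sqrt(n_servers)


# Illustrative numbers: 16 servers, growth uncertainty of 0.5 TB each.
n, sigma = 16, 0.5
print("Per-server buffers:", headroom_per_server(n, sigma), "TB")
print("Pooled buffer:     ", headroom_pooled(n, sigma), "TB")
```

Under these assumptions the pooled buffer is a quarter of the combined per-server buffers. If growth across servers is strongly correlated (for example, every application grows with the same marketing campaign), the benefit shrinks accordingly.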
And by ensuring higher storage utilization levels, consolidating storage can also save your company significant money. Although the initial up-front cost of SAN array hardware and switching fabric is higher, SAN storage can be managed to ensure higher utilization levels (around 75 to 80% is typical) than DAS (which typically averages around 50% because of the "just to be safe" syndrome). This means there is a break-even point in the total size of the storage (number of drives in the arrays) beyond which SAN storage becomes cheaper than DAS storage on a per-byte basis:
Figure 3: At a certain point SAN becomes a cheaper storage solution than DAS.
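The break-even point can be estimated with a simple linear cost model, sketched below in Python. The model and every price in it are assumptions for illustration: DAS cost is taken as a per-terabyte price divided by utilization, and SAN cost as a fixed up-front amount (array controllers and switching fabric) plus its own per-terabyte price divided by utilization. Only the utilization levels (50% for DAS, 75% for SAN) come from the text.

```python
from typing import Optional


def das_cost(data_tb: float, price_per_raw_tb: float, utilization: float) -> float:
    """Cost of enough DAS capacity to hold data_tb at the given utilization."""
    return data_tb / utilization * price_per_raw_tb


def san_cost(data_tb: float, fixed_cost: float,
             price_per_raw_tb: float, utilization: float) -> float:
    """Cost of a SAN: fixed up-front cost plus capacity at the given utilization."""
    return fixed_cost + data_tb / utilization * price_per_raw_tb


def break_even_tb(fixed_cost: float,
                  das_price: float, das_util: float,
                  san_price: float, san_util: float) -> Optional[float]:
    """Amount of stored data above which the SAN is cheaper, or None if never."""
    das_per_stored_tb = das_price / das_util
    san_per_stored_tb = san_price / san_util
    saving_per_tb = das_per_stored_tb - san_per_stored_tb
    if saving_per_tb <= 0:
        return None  # SAN never recovers its fixed cost under these prices
    return fixed_cost / saving_per_tb


# Hypothetical prices; utilization levels taken from the text.
point = break_even_tb(fixed_cost=20_000,
                      das_price=100, das_util=0.50,
                      san_price=120, san_util=0.75)
print("Break-even at", point, "TB of stored data")
```

With these made-up prices, DAS costs 200 per stored terabyte and SAN costs 160, so the SAN's fixed cost is recovered at 500 TB. Real pricing is rarely this linear, so treat the result as a rough guide only.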
Overprovisioning and data tiering
Migrating your server storage from DAS to SAN is not enough, however, to ensure effective utilization of storage. You also need to implement a data tiering strategy.
Data tiering is simply the practice of storing your business data on storage devices appropriate for each type of data. Recently acquired data (hot data) is valuable because if you lose it you might lose a sale or alienate a customer. As a result, hot data should always be stored on Tier 1 storage that is fast, highly reliable, always available and easy to search. It's also usually expensive since you typically use 15k RPM SAS or SCSI hard disk drives (HDDs) or even solid state drives (SSDs) for such purposes, so you want to minimize the amount of data stored on Tier 1 storage.
By contrast, older data (stale or cool data) is less valuable and can therefore be moved to Tier 2 storage that is slower, has acceptable reliability, is not mirrored or replicated, and can be slower to search because the data isn't needed often. Tier 2 storage typically consists of low-cost, large-capacity commodity hard disk drives in either a SAN or NAS array.
Usually you'll also want to archive data that is no longer needed for day-to-day business operations. Such cold data is typically moved from Tier 2 storage to tape storage (Tier 3) and is kept only for archival purposes and regulatory compliance.
Figure 4: Characteristics of storage used for data tiering.
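As a rough illustration of the hot, cool, and cold distinction, the Python sketch below assigns a tier based on how long ago a piece of data was last used. The 30-day and 365-day cutoffs are assumptions picked for the example; this article does not prescribe specific thresholds, and a real policy would depend on the business and its compliance requirements.

```python
from datetime import date, timedelta

# Assumed age thresholds; tune these to your own business rules.
HOT_WINDOW = timedelta(days=30)     # still in active use
COOL_WINDOW = timedelta(days=365)   # occasionally needed


def assign_tier(last_used: date, today: date) -> int:
    """Return 1 (fast disk or SSD), 2 (commodity disk), or 3 (tape archive)."""
    age = today - last_used
    if age <= HOT_WINDOW:
        return 1
    if age <= COOL_WINDOW:
        return 2
    return 3


today = date(2024, 6, 30)
samples = {
    "current_orders.db": date(2024, 6, 25),
    "q3_2023_report.xlsx": date(2023, 10, 15),
    "2019_invoices.zip": date(2020, 1, 10),
}
for name, last_used in samples.items():
    print(f"{name}: Tier {assign_tier(last_used, today)}")
```

The file names and dates are hypothetical. The point is only that the tier follows from a simple, explicit rule, which keeps the expensive Tier 1 storage small.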
Overprovisioning can be avoided through a combination of storage consolidation and data tiering. In the next article we'll examine some strategies for implementing data tiering.