Back to basics (Part 3): Virtualization 101: A Storage Primer (Cont.)


Introduction

Data is the lifeblood of an organization. Customer information, sales information and payroll are just part of what makes an organization tick. As such, the storage on which this data resides is an integral part of any organization’s data center. When it comes to managing your virtual environment, you need to make sure that you have enough storage capacity – which was discussed in the previous part of this series – as well as enough storage performance so that the entire solution can keep up with workload needs.

Connectivity methods

Let’s start with a discussion about the various methods by which you can connect your virtual hosts to storage in the environment.

Direct-attached storage

When you think of direct-attached storage, think of hard drives that are internal to the server or that connect directly to the server via some kind of direct interface, such as Serial Attached SCSI (SAS), Serial ATA (SATA) or eSATA. SAS disks connect to an interface that operates at 3 Gbps or 6 Gbps, while SATA communications channels run at 1.5 Gbps, 3 Gbps or 6 Gbps. Although the disk-to-system transmission speeds of SAS and SATA look similar on the surface, SAS is more efficient, and SAS disks often operate at much higher rotational speeds (7.2K RPM, 10K RPM and 15K RPM) than SATA, which tops out at 10K RPM even for the best disks.

I should mention that it’s important to consider the storage to host communications channel separately from the disk type. Why? Because you can, for example, use SATA disks in an iSCSI storage array.

Although direct-attached storage can provide very good performance when the right disks are used, it currently has limitations that prevent, or at least complicate, the use of some advanced hypervisor features. For example, with vSphere, availability and load-balancing features such as vMotion and the Distributed Resource Scheduler (DRS) require shared storage. By its very nature, direct-attached storage is not shared storage in this context. That is, direct-attached storage is dedicated to the connected host and is not shared with multiple host systems.

Some companies have released solutions that enable these high-end features even when only direct-attached storage is in use. These solutions are generally targeted at small and medium-sized businesses that may not have, want or need a complex shared storage environment, but that still want to be able to take advantage of high availability features.

iSCSI

Back in the days before SAS and SATA, servers used SCSI disks. SCSI disks have their own command set that enables the storage devices to carry out their duties. Today, the SCSI command set lives on in both direct-attached (SAS) and shared storage systems (iSCSI). iSCSI is a shared storage protocol that relies on TCP/IP for storage communication. This effectively moves storage traffic onto Ethernet, with the storage traffic encapsulated inside IP packets. This double encapsulation – SCSI commands into TCP/IP and TCP/IP into Ethernet frames – does add some overhead to iSCSI that isn’t present in other connectivity methods, but it also bestows upon iSCSI all of the benefits of TCP/IP, including a well-established routing mechanism.

iSCSI storage arrays connect to the network at speeds of 1 Gbps or 10 Gbps. At the host/hypervisor side, specific physical Ethernet adapters are dedicated to iSCSI storage responsibilities. The best part about iSCSI: Organizations can rely on their existing and well-understood network infrastructures to create a powerful shared storage environment.

Fibre Channel

When it comes to enterprise shared storage, Fibre Channel has long held a perch atop the performance mantle, offering the fastest transport as well as the fastest disks available. Fibre Channel connectivity comes in 1, 2, 4, 8, 10 and even 16 Gbps varieties.

Today, a newer and simpler Fibre Channel option is coming to market as well. Fibre Channel over Ethernet (FCoE) is a way to encapsulate Fibre Channel frames directly into Ethernet packets. FCoE is intended to help organizations lower their costs and simplify their cabling infrastructure while still providing high-speed storage networking options.

Fibre Channel’s main challenges over the years have been its cost and complexity. The technology has been more expensive than competing solutions and it has required a specialized skill set.

The truth

I’m not going to say that your transport choice doesn’t matter, because it does, but it probably matters less than is often claimed. For small and medium environments, and even for some large environments, 1 Gbps iSCSI is more than suitable, and with the availability of 10 Gbps iSCSI there are even more options.

I’m not necessarily going to advocate a specific option. That said, even under heavy load, I’ve not seen the storage transport mechanism be the primary cause of performance issues in a heavily virtualized environment.

While transport choice is an important consideration, it’s more difficult to quantify than other performance factors.

RAID levels revisited

Before I move on to other performance factors, let’s review the chart below, which outlines some of the various RAID levels that you may have at your disposal. In Part 1 of this series, we focused on the capacity question, which centered on the “Overhead” column in the table. In this part, we’re going to focus on performance, quantified in the “Write impact” column.

RAID level | Protection | Tolerance     | Min disks | Overhead                | Write impact
RAID 0     | None       | 0 disks       | 2         | None                    | None
RAID 1     | Very good  | 1 disk        | 2         | 50%                     | 2x
RAID 5     | Good       | 1 disk        | 3         | 1/n disks               | 4x
RAID 6     | Excellent  | 2 disks       | 4         | 2/n disks               | 6x
RAID 10    | Excellent  | 1/2 of disks* | 4         | 50%                     | 2x
RAID 50    | Good       | 1 per set     | 6         | 1/n disks * RAID 5 sets | 4x
RAID 60    | Excellent  | 2 per set     | 8         | 1/n disks * RAID 6 sets | 6x

Table 1

Answering the performance question

In Part 1, I ended the article by telling you that, at first glance, it might seem that RAID 6 is your best choice as it offers the most redundancy and protection without that much in the way of overhead. However, capacity is just one storage metric. The second one, which is just as important, is storage performance. From this standpoint, RAID 6 and its variants, such as RAID 60, are the worst choices. You see, when you’re using RAID 6/60, each and every time you write data to the storage array, six separate I/O operations are required. This is why the Write impact column of the table shows a 6x write penalty.

You’ll note that, with the exception of RAID 0, all RAID levels impose some kind of write penalty. RAID 0 doesn’t have a write penalty, but it also provides no protection whatsoever, so it’s not a good choice for most people.
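To make the write penalty concrete, here is a minimal Python sketch of how a front-end write workload turns into back-end disk I/O. The penalty values come from the Write impact column of Table 1; the function and variable names are just illustrative.

```python
# Write penalty per RAID level, taken from the "Write impact" column of Table 1.
RAID_WRITE_PENALTY = {
    "RAID 0": 1,   # no protection, so no extra writes
    "RAID 1": 2,
    "RAID 5": 4,
    "RAID 6": 6,
    "RAID 10": 2,
    "RAID 50": 4,
    "RAID 60": 6,
}

def backend_write_iops(frontend_write_iops: int, raid_level: str) -> int:
    """Back-end disk I/O generated by a given front-end write workload."""
    return frontend_write_iops * RAID_WRITE_PENALTY[raid_level]

# Example: 1,000 application writes per second.
for level in ("RAID 10", "RAID 5", "RAID 6"):
    print(level, backend_write_iops(1000, level))
# RAID 10: 2000, RAID 5: 4000, RAID 6: 6000 back-end I/Os per second
```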

Basic metric

When you’re talking about performance in this way, the general metric is IOPS (I/O operations per second). The more IOPS you have, the more data you can transfer back and forth. This is a metric separate from the transport.

IOPS capability isn’t as dependent on the underlying disk technology as it is on a disk’s rotational speed: the faster a disk spins, the more I/O operations it can service per second. Here’s a very rough estimate of IOPS based on rotational speed (these figures might be on the low side, but they’re good enough for discussion):

7200 RPM: 75 IOPS

10K RPM: 125 IOPS

15K RPM: 175 IOPS

Now, let’s take a look at what happens when you start applying an application’s needs to storage. Let’s use Exchange as an example since it’s easy to grasp… the concepts are transferable to virtualization. Suppose your large Exchange environment requires 5,000 read IOPS and 2,000 write IOPS based on the values that you put into the Exchange Mailbox Calculator tool. With some simple math, you can determine how many disks are needed to satisfy the read IOPS requirement:

7200 RPM: 67 disks

10K RPM: 40 disks

15K RPM: 29 disks

As you can see, the rotational speed has a dramatic impact on the number of disks required for a particular implementation.

Now, let’s look at the write side. Before you can do the same math, you first have to factor in the RAID I/O impact. This is achieved through multiplication. Let’s suppose that you’ve chosen a RAID 10 level and will use 15K RPM disks.

First, multiply the 2,000 write IOPS requirement by the RAID 10 penalty of 2x to get 4,000 IOPS, then divide by 175 IOPS per disk. The result: 23 15K RPM disks are necessary to deliver 2,000 write IOPS with RAID 10.
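If you want to run the same arithmetic yourself, the short Python sketch below reproduces both the read and write calculations from this section. The per-disk IOPS figures and RAID penalties come straight from the article; the function names and the rounding with math.ceil are just illustrative choices.

```python
import math

# Rough per-disk IOPS by rotational speed (from the estimates above).
DISK_IOPS = {"7200 RPM": 75, "10K RPM": 125, "15K RPM": 175}

# Write penalty per RAID level (from the "Write impact" column of Table 1).
RAID_WRITE_PENALTY = {"RAID 10": 2, "RAID 5": 4, "RAID 6": 6}

def disks_for_reads(read_iops: int, disk_type: str) -> int:
    """Disks needed to satisfy a read workload (no RAID penalty applied to reads here)."""
    return math.ceil(read_iops / DISK_IOPS[disk_type])

def disks_for_writes(write_iops: int, disk_type: str, raid_level: str) -> int:
    """Disks needed for a write workload after applying the RAID write penalty."""
    backend_iops = write_iops * RAID_WRITE_PENALTY[raid_level]
    return math.ceil(backend_iops / DISK_IOPS[disk_type])

# The Exchange example from this section: 5,000 read IOPS and 2,000 write IOPS.
print(disks_for_reads(5000, "15K RPM"))              # 29 disks
print(disks_for_writes(2000, "15K RPM", "RAID 10"))  # 23 disks
```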

Again, this is very, very simplistic, but is intended to help you think through the various variables that must be considered when it comes to storage in your virtual environment.

One time when the IOPS question raises its head in a big way is VDI. Consider this: At 8 AM, everyone comes to the office. If every employee has a virtual desktop, you could have hundreds of people booting their virtual machines at the same time. This is an I/O-intensive process that can result in what is called a “boot storm.” This is basically a situation in which the storage could be overwhelmed. These boot storms must be planned for. In these situations, consider the use of solid state disks with very high IOPS counts (well into the thousands per disk) and an architecture that targets these SSDs at the boot process in order to counter the boot storm effect.
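To see why SSDs help here, consider a rough, back-of-the-envelope boot-storm calculation. Every number in the sketch below (the desktop count, the per-boot IOPS figure and the SSD rating) is a hypothetical assumption for illustration only, not a measured value.

```python
import math

# Hypothetical boot-storm sizing. These figures are illustrative assumptions,
# except the 15K RPM value, which comes from the estimates earlier in the article.
DESKTOPS = 300          # virtual desktops booting in the same window
BOOT_IOPS_PER_VM = 50   # assumed I/O demand per desktop while it boots
SSD_IOPS = 5000         # assumed conservative rating for a single SSD
HDD_15K_IOPS = 175      # 15K RPM estimate from the list above

peak_iops = DESKTOPS * BOOT_IOPS_PER_VM  # 15,000 IOPS at the height of the storm

print(math.ceil(peak_iops / HDD_15K_IOPS))  # ~86 15K RPM spindles
print(math.ceil(peak_iops / SSD_IOPS))      # ~3 SSDs
```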

Summary

This three-part series is intended to help you think about the storage needs you may have in your virtual environment and to show how the architectural choices that you make can have a major impact on the capacity and performance of your virtual machines.

