iSCSI – does bandwidth matter?
Over the past year, we've made massive changes to our data center infrastructure, moving about 75% of our formerly physical servers to a new set of four ESX hosts and consolidating onto those same hosts the workloads from two standalone ESX hosts that each had their own local storage. In the current environment, all four ESX hosts share an EMC AX4 SAN with three trays of disks: two trays of SAS disks and one tray of SATA disks. After all, sometimes you just need raw capacity.
I've read a lot of articles in which the primary concern about iSCSI versus Fibre Channel revolves around the speed of the link. For certain kinds of workloads, or for much larger workloads than we'll generate at Westminster College, that concern is certainly valid. However, for a huge number of organizations, the iSCSI bandwidth argument just doesn't hold water.
We're not all that different from a lot of other organizations. We run Exchange, SQL Server, SharePoint, file servers, application servers and more. We're running just under fifty virtual machines on our four ESX hosts, with a few other physical machines also sharing the same AX4 for now. Although we're actively working to eliminate these remaining physical boxes, their workloads will remain as we move them to virtual machines.
In the screen below, you will see a couple of hours of performance information as reported by EMC's Analyzer tool. It shows, in MB/s, the throughput seen by both of the storage processors in the AX4. Although the chart starts just after 5 PM, the end of our workday, we're a college with things going on 24/7, so there is always some level of activity. Note that, at 6 PM, we saw a huge spike to almost 96 MB/s (768 Mb/s), but that normal utilization stayed below 8 MB/s, or 64 Mb/s (one megabyte per second [MB/s] = 8 megabits per second [Mb/s]).
Figure 1: A bandwidth snapshot
Bear in mind that these are 1 Gb/s links (125 MB/s of raw capacity each). With the exception of that 6 PM spike, our iSCSI bandwidth utilization sits far below what the links can carry. About that spike: today, we moved a number of backup services to a new backup server and scheduled the data move to begin at 6 PM, hence the massive spike.
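To put those numbers side by side, here is a minimal Python sketch of the arithmetic. It assumes decimal (SI) units, where a 1 Gb/s link carries 1,000 Mb/s, or 125 MB/s, and it ignores protocol overhead. The function names are my own, and the comparison is against a single link, which may not match how Analyzer aggregates throughput across the two storage processors.

```python
# Figures quoted in this post; the link rate uses decimal (SI) units.
LINK_GBPS = 1.0        # one 1 Gb/s iSCSI link
TYPICAL_MBYTES = 8.0   # upper bound on normal throughput, MB/s
SPIKE_MBYTES = 96.0    # approximate 6 PM backup spike, MB/s


def mbytes_to_mbits(mbytes_per_s):
    """Convert megabytes/second to megabits/second (1 byte = 8 bits)."""
    return mbytes_per_s * 8


def link_capacity_mbytes(gbits_per_s):
    """Raw capacity of a link in MB/s, ignoring protocol overhead."""
    return gbits_per_s * 1000 / 8


def utilization(mbytes_per_s, gbits_per_s=LINK_GBPS):
    """Fraction of one link's raw capacity used by the given throughput."""
    return mbytes_per_s / link_capacity_mbytes(gbits_per_s)


if __name__ == "__main__":
    for label, rate in (("typical", TYPICAL_MBYTES), ("spike", SPIKE_MBYTES)):
        print(f"{label}: {rate} MB/s = {mbytes_to_mbits(rate)} Mb/s, "
              f"{utilization(rate):.1%} of one link")
```

Under these assumptions, normal traffic uses well under a tenth of one link, and even the backup spike fits inside a single link's raw capacity.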
In another posting, I'll break down the read vs. write bandwidth utilization and show you some "during the day" stats.
My mantra: It's all about the IOPS! If I can get the IOPS I need on an iSCSI link, I'll do it every time. It's less expensive, easier to support and quickly growing in use.
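For anyone who wants to relate IOPS to link bandwidth, throughput is simply IOPS multiplied by the I/O size. The Python sketch below is a rough illustration only: the 8 KiB I/O size and the 1,000 IOPS figure are hypothetical values I picked for the example, not measurements from our AX4, and the function names are my own. It also ignores protocol overhead.

```python
LINK_MBYTES_PER_S = 125.0  # raw capacity of one 1 Gb/s link, decimal units
IO_SIZE_BYTES = 8192       # hypothetical 8 KiB I/O size, not a measured value


def throughput_mbytes(iops, io_size_bytes=IO_SIZE_BYTES):
    """Throughput in MB/s produced by `iops` operations of `io_size_bytes` each."""
    return iops * io_size_bytes / 1_000_000


def max_iops(link_mbytes_per_s=LINK_MBYTES_PER_S, io_size_bytes=IO_SIZE_BYTES):
    """IOPS at which I/Os of the given size would fill the link's raw capacity."""
    return link_mbytes_per_s * 1_000_000 / io_size_bytes


if __name__ == "__main__":
    # 1,000 IOPS is an illustrative load, not a figure from our environment.
    print(f"1,000 IOPS at 8 KiB = {throughput_mbytes(1000):.3f} MB/s")
    print(f"IOPS needed to fill one link at 8 KiB: {max_iops():.0f}")
```

With small I/O sizes like this, the disks are likely to run out of IOPS long before a 1 Gb/s link runs out of bandwidth, which is the point of the mantra.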