If you would like to read the other parts in this article series please go to:
- Hyper-V optimization tips (Part 1): Disk caching
- Hyper-V optimization tips (Part 2): Storage bottlenecks
- Hyper-V optimization tips (Part 3): Storage queue depth
- Hyper-V optimization tips (Part 4): Clustered SQL Server workloads
- Hyper-V optimization tips (Part 5): Power management
- Hyper-V optimization tips (Part 6): Network performance - DCB
In the previous article of this series we began examining the problems involved in monitoring network performance for Hyper-V hosts and host clusters in enterprise environments. We started by reviewing some basic networking concepts that are relevant to enterprise-level Hyper-V hosts: Quality of Service (QoS), Data Center Bridging (DCB), Offloaded Data Transfer (ODX), Windows NIC Teaming, and converged network adapters (CNAs). We then outlined some of the difficulties of monitoring the networking performance of Hyper-V hosts and host clusters. In particular, we described how the built-in Performance Monitor tool (Perfmon.exe) in Windows Server 2012 is unable to monitor the individual traffic classes when you are using third-party DCB-capable Ethernet network adapters and DCB-capable switches and have the Windows Server 2012 Data Center Bridging feature installed on your hosts. In this article we'll dig a bit deeper into this topic by seeing how the use of converged network adapters (CNA cards or CNAs) can further complicate the monitoring of Hyper-V networking performance.
Understanding converged networking
Converged networking means that different kinds of network traffic share the same physical networking infrastructure. One scenario where converged networking can be implemented is with Hyper-V host clusters, which typically carry several different types of network traffic. For example, a Windows Server 2012 R2 Hyper-V host cluster could generate the following kinds of network traffic:
• Management traffic between the Hyper-V hosts and your systems management infrastructure
• Workload traffic between external clients and virtual machines (VMs) running on the host cluster
• Storage traffic between the hosts, VMs, and storage
• Cluster traffic for communications between cluster nodes and CSV shared storage
In addition, the following types of traffic may also sometimes be present:
• Live migration traffic from VM live migration
• Replica traffic from VM replication using Hyper-V Replica
While most of the above forms of traffic are TCP/IP-based traffic, storage traffic may be either file-based (using SMB 3.0 over TCP/IP) or block-based (Fibre Channel or iSCSI). If your storage area network (SAN) uses Fibre Channel (FC), connectivity with the host can be provided using a Host Bus Adapter (HBA). But by using Fibre Channel over Ethernet (FCoE) you can encapsulate FC frames for transmission directly over Ethernet.
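To make the idea of encapsulation more concrete, here is a minimal Python sketch of how an FC frame might be wrapped for transmission over Ethernet. It assumes the commonly documented FCoE layout: the FCoE EtherType 0x8906, a 14-byte FCoE header ending in a start-of-frame byte, and a 4-byte trailer beginning with an end-of-frame byte. The function name, the MAC addresses, the payload, and the delimiter codes passed in are all hypothetical placeholders. The optional VLAN tag and the Ethernet frame check sequence (normally added by the adapter) are left out.

```python
import struct

FCOE_ETHERTYPE = 0x8906  # EtherType assigned to FCoE


def encapsulate_fc_frame(dst_mac: bytes, src_mac: bytes, fc_frame: bytes,
                         sof: int, eof: int, version: int = 0) -> bytes:
    """Wrap an already-built FC frame in an Ethernet/FCoE envelope (sketch only)."""
    if len(dst_mac) != 6 or len(src_mac) != 6:
        raise ValueError("MAC addresses must be 6 bytes long")
    if not 0 <= version <= 0xF:
        raise ValueError("version must fit in 4 bits")
    if not (0 <= sof <= 0xFF and 0 <= eof <= 0xFF):
        raise ValueError("sof and eof must each fit in one byte")

    # Ethernet header: destination MAC, source MAC, EtherType
    eth_header = dst_mac + src_mac + struct.pack("!H", FCOE_ETHERTYPE)
    # FCoE header: 4-bit version, 100 reserved bits, 8-bit start-of-frame code
    fcoe_header = bytes([version << 4]) + bytes(12) + bytes([sof])
    # FCoE trailer: 8-bit end-of-frame code followed by 24 reserved bits
    fcoe_trailer = bytes([eof]) + bytes(3)
    return eth_header + fcoe_header + fc_frame + fcoe_trailer


# Hypothetical values, for illustration only
dst = bytes.fromhex("0efc00010203")
src = bytes.fromhex("0efc00040506")
payload = bytes(36)  # stand-in for a real FC frame
frame = encapsulate_fc_frame(dst, src, payload, sof=0x2E, eof=0x41)
```

The point of the sketch is simply that the FC frame travels unchanged inside an ordinary Ethernet frame, which is why the same physical link can carry both kinds of traffic.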
To achieve such convergence (combination) of TCP/IP traffic and encapsulated FC traffic and transmit both types of traffic over the same Ethernet network, a converged network adapter (CNA) is required. CNAs are NICs that combine the functionality of a TCP/IP Ethernet NIC and an FC or iSCSI HBA into a single card by utilizing storage offloading to allow your Hyper-V hosts to connect to both your Ethernet-based LANs and your FC- or iSCSI-based SAN. CNAs are typically 10 Gbps cards and are commonly used in blade servers found in datacenter environments because of the limited physical room for peripherals available in such blade servers. Popular vendors of CNAs include QLogic and Broadcom.
CNAs, QoS and Hyper-V network performance
A fundamental problem with CNAs is that they communicate their aggregate network bandwidth to the operating system. Thus, if you have a CNA card in a Hyper-V host blade system and try to monitor network throughput, the CNA card will typically look like an ordinary 10 GbE data NIC to the Windows Server operating system. So even though some of the card's bandwidth is being siphoned off for transmitting FC or iSCSI storage traffic, as far as Windows Server is concerned it's simply transmitting standard Ethernet traffic.
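A small Python sketch may help show why this matters for monitoring. It assumes, purely for illustration, that the offloaded storage traffic does not appear in the byte counters the operating system sees, and that the storage side is consuming 3 Gbps of a 10 Gbps card. All of the figures and function names here are hypothetical.

```python
def throughput_bps(bytes_start: int, bytes_end: int, interval_s: float) -> float:
    """Convert two byte-counter samples into bits per second."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return (bytes_end - bytes_start) * 8 / interval_s


def utilization_percent(observed_bps: float, capacity_bps: float) -> float:
    """Express observed throughput as a percentage of a given capacity."""
    if capacity_bps <= 0:
        raise ValueError("capacity_bps must be positive")
    return 100.0 * observed_bps / capacity_bps


REPORTED_LINK_BPS = 10e9    # what the CNA reports to the operating system
STORAGE_OFFLOAD_BPS = 3e9   # hypothetical storage traffic the OS cannot see

# Two hypothetical counter samples taken 10 seconds apart (works out to 4 Gbps)
data_bps = throughput_bps(0, 5_000_000_000, 10.0)

# Utilization as a monitoring tool would compute it from the reported link speed
apparent = utilization_percent(data_bps, REPORTED_LINK_BPS)

# Utilization of the bandwidth actually left over for data traffic
effective = utilization_percent(data_bps, REPORTED_LINK_BPS - STORAGE_OFFLOAD_BPS)

print(f"apparent: {apparent:.1f}%  effective: {effective:.1f}%")
```

Under these assumptions the host appears to be using 40% of its link while the data path is really at about 57% of what is available to it, and the gap grows as storage traffic increases.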
The problem is that CNA cards are generally designed to work in conjunction with Data Center Bridging (DCB) to provide lossless communications for Fibre Channel over Ethernet (FCoE). And we've already seen that you should not enable DCB on any NIC that is bound to the virtual switch on a Hyper-V host. Basically, if you have a NIC that is bound to your Hyper-V virtual switch, then that NIC should not have any other source of traffic flowing through it. Otherwise any bandwidth management you are trying to perform (including implementing network QoS) will generally break down, because Hyper-V will try to manage the data path through the switch as if Hyper-V owns all of the traffic. The result can be degraded network performance for your host cluster.
In addition, if you are trying to implement network QoS at either the Hyper-V virtual switch level or the virtual machine level, you should not be using any third-party networking hardware that partitions network traffic or tries to apply its own QoS to the traffic. And if you are using any networking hardware that multiplexes data path traffic with storage offload traffic then you won't be able to effectively use network QoS with Hyper-V.
There is a workaround to this problem, however: use static QoS. In other words, you could use the vendor-supplied software included with your CNA card to specify how much bandwidth should be available for data traffic and how much for storage traffic. As an example, you might configure static QoS on the 10 GbE CNA card on your Hyper-V host so that 3 Gbps is dedicated to FCoE while the remaining 7 Gbps is exposed to the Windows Server operating system as being available for other network traffic (i.e. management traffic, virtual machine traffic, and so on). Many common CNA cards available today allow you to partition the adapter in the BIOS so that it will display a specified available bandwidth instead of its maximum available bandwidth.
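The arithmetic behind such a static split can be sketched in a few lines of Python. This is only a planning aid using the 3 Gbps and 7 Gbps figures from the example above. It does not configure any adapter, and the class and function names are hypothetical rather than part of any vendor tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticPartition:
    """One fixed slice of a CNA's bandwidth (hypothetical planning model)."""
    name: str
    gbps: float


def unallocated_gbps(link_gbps: float, partitions: list[StaticPartition]) -> float:
    """Return the bandwidth left unassigned, rejecting oversubscribed plans."""
    total = sum(p.gbps for p in partitions)
    if total > link_gbps:
        raise ValueError(f"plan assigns {total} Gbps on a {link_gbps} Gbps link")
    return link_gbps - total


def partition_utilization(partition: StaticPartition, observed_gbps: float) -> float:
    """Utilization measured against the partition, not the whole card."""
    return 100.0 * observed_gbps / partition.gbps


fcoe = StaticPartition("FCoE storage", 3.0)
data = StaticPartition("Data (management, VM traffic, and so on)", 7.0)

spare = unallocated_gbps(10.0, [fcoe, data])

# With a hypothetical 3.5 Gbps of data traffic, the data partition is half used
data_util = partition_utilization(data, 3.5)
print(f"unallocated: {spare} Gbps, data partition utilization: {data_util:.0f}%")
```

Because the operating system is shown only the data partition's share, utilization figures computed against that partition reflect the bandwidth Hyper-V can really use.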
Conclusion and recommendations
The situation with regard to this topic is quite complex and still evolving. Part of the problem is that Microsoft doesn't seem to be working that closely on this matter with the vendors of the most commonly adopted solutions found in the real world of datacenter networking, namely Cisco, Dell, IBM and HP. Instead, Microsoft seems to be pushing its next-generation software-defined storage (SDS) solution called Storage Spaces Direct in Windows Server 2016, which uses Remote Direct Memory Access (RDMA) and relies on RDMA-capable NICs from Chelsio and Mellanox, as described in this blog post on Microsoft TechNet.
Unfortunately that's not what the large majority of the real world is using in their datacenter environments. In other words, when you talk with major SAN storage vendors about network performance and availability, you often get quite a different perspective than you do when talking with Microsoft about what their own vision and plans are for the "converged infrastructure" of tomorrow's datacenters.
And because technology seems to evolve much faster than documentation of best practices for its use, recommendations issued several years ago with regard to using CNAs in Hyper-V host clusters may have led some customers to implement CNA-based solutions in blade environments even though this is not really the preferred solution. So if you are currently using CNAs in your blade-based Hyper-V host clusters, you might want to re-examine the network performance of your cluster under heavy load to see whether a static assignment of bandwidth might offer improved performance over your current setup.
Got more questions about Hyper-V?
If you have any questions about Microsoft's Hyper-V virtualization platform, the best place to ask them is the Hyper-V forum on Microsoft TechNet. If you don't get the help you need there, you can try sending your question to us at [email protected] so we can publish it in the Ask Our Readers section of our weekly newsletter WServerNews. We'll then see whether any of the almost 100,000 IT pro subscribers of our newsletter have any suggestions for you.