Testing vSphere NIOC Host Limits on Multi-NIC vMotion Traffic

VMware’s Network IO Control (NIOC) is a really neat technology in that it assigns shares and limits to different types of network traffic. Rather than relying on physically isolating traffic across adapters, as was common with rackmount servers sporting 6 or more NICs, NIOC lets you confidently place a wide variety of traffic types on a single set of uplinks. The intelligence that decides which traffic type gets higher priority is baked into the shares values.

The concept is very similar to the use of resource pools for CPU and memory, and it shares much of the same terminology (the traffic types are even labeled as “network resource pools” in the GUI).
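
If you want to see that mapping for yourself, below is a minimal, read-only sketch using pyVmomi (the vSphere Python SDK) that lists each network resource pool on a distributed switch along with its shares and host limit. This assumes the NIOC v2-style pools used in this post (newer NIOC v3 switches expose system traffic differently), and the vCenter address, credentials, and switch name are just placeholders for my lab.

```python
# Sketch: list NIOC network resource pools (shares + host limit) on a dvSwitch.
# Hostname, credentials, and "dvSwitch01" are lab placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only; skips certificate checks
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
content = si.RetrieveContent()

# Find the distributed switch by name
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.DistributedVirtualSwitch], True)
dvs = next(d for d in view.view if d.name == "dvSwitch01")

# Each traffic type (vMotion, NFS, management, ...) appears as a network
# resource pool; the limit is expressed in Mbps, with -1 meaning unlimited
for pool in dvs.networkResourcePool:
    alloc = pool.allocationInfo
    print(f"{pool.name:30s} shares={alloc.shares.shares:4d} ({alloc.shares.level}) "
          f"limit={alloc.limit} Mbps")

Disconnect(si)
```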

How Does NIOC Limit Multi-NIC vMotion Transfers?

Now to put some meat on the bone. This raises the question: in a scenario where NIOC is configured to limit Multi-NIC vMotion traffic, how is vMotion affected? Does the NIOC limit apply to all vMotion traffic as a whole, or to each uplink individually? Let’s find out with a few tests below.

Test #1 – vMotion Limited to 100 Mbps

For this first test, I set the vMotion traffic limit in NIOC to 100 Mbps, which is about 10% of a single 1GbE uplink. I wanted to see how vMotion handles such a severe bandwidth limitation.

I then cracked open ESXTOP and migrated a virtual machine from ESX1 to ESX2. Interestingly enough, vMotion picked a single uplink to send the traffic over, as shown below.

Both vmk6 and vmk1 are enabled for vMotion

Additionally, when the virtual machine was migrated back the other way (ESX2 to ESX1), the same behavior was observed on the receiving end.

It appears that when the limit is this low, only one NIC is used.
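
For anyone who would rather script the observation than eyeball ESXTOP, here is a rough pyVmomi sketch that samples the real-time net.transmitted counter for each physical uplink (vmnic) on the source host. Note that it reports per-vmnic rates rather than per-vmkernel-port rates, and the vCenter and host names are lab placeholders, not anything from the screenshots above.

```python
# Sketch: sample per-vmnic transmit rates on a host via the PerformanceManager,
# as a scripted alternative to watching ESXTOP during a vMotion.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
content = si.RetrieveContent()
perf = content.perfManager

# Find the source host of the vMotion (placeholder name for ESX1)
hosts = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.HostSystem], True).view
host = next(h for h in hosts if h.name == "esx1.lab.local")

# Look up the counter ID for net.transmitted.average (reported in KBps)
counters = {f"{c.groupInfo.key}.{c.nameInfo.key}.{c.rollupType}": c.key
            for c in perf.perfCounter}
metric = vim.PerformanceManager.MetricId(
    counterId=counters["net.transmitted.average"], instance="*")

# Real-time stats use the 20-second interval; instance="*" returns one series
# per physical vmnic plus an aggregate series with an empty instance name
spec = vim.PerformanceManager.QuerySpec(entity=host, metricId=[metric],
                                        intervalId=20, maxSample=1)
for series in perf.QueryPerf(querySpec=[spec])[0].value:
    if series.id.instance:  # skip the aggregate entry
        mbps = series.value[0] * 8 / 1000  # KBps -> Mbps
        print(f"{series.id.instance}: {mbps:.0f} Mbps")

Disconnect(si)
```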

Test #2 – vMotion Limited to 800 Mbps

For this next test, I increased the limit to 800 Mbps, which is a much more realistic limit for a single 1GbE uplink.

The test showed that both vmkernel ports sent roughly 800 Mbps of traffic during a vMotion task, for a total of about 1600 Mbps of throughput across the pair of uplinks.

Notice that both vmk1 and vmk6 are sending 800 Mbps of traffic

This seems to show pretty clearly that the NIOC limit does indeed apply to each uplink individually, and not to the vMotion traffic type as a whole.

Thoughts

I had assumed that because the setting is labeled “Host Limit (Mbps)”, the value chosen would cap the entirety of vMotion traffic (across all uplinks). The tests above have proven that assumption wrong. However, I will admit that a per-link limit is much easier to design around: if additional links are brought into the team, no adjustments need to be reflexively made to the Host Limit values (a good thing!).
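
To put numbers on that, here is a toy calculation of the aggregate vMotion ceiling implied by the per-uplink behavior observed above. It assumes the two-uplink result generalizes as links are added, which I have not tested beyond two.

```python
# Toy model: the NIOC host limit applies per uplink, so the aggregate vMotion
# ceiling scales with the number of vMotion-enabled uplinks.
# (Assumption: the two-uplink result observed above generalizes.)
def vmotion_ceiling_mbps(host_limit_mbps: int, vmotion_uplinks: int) -> int:
    return host_limit_mbps * vmotion_uplinks

print(vmotion_ceiling_mbps(800, 2))  # 1600 Mbps, matching Test #2
print(vmotion_ceiling_mbps(800, 3))  # 2400 Mbps if a third uplink were added
```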

One concern I have is that NIOC is a source-based control mechanism. In the case of a VMware host, this means that while the host can control the shares and limits of the traffic it generates, it cannot control the traffic it receives from other hosts. For 1-to-1 flows, where a single host is sending traffic to another single host, this typically isn’t much of an issue. However, in a many-to-1 scenario, things can get a little tricky. Take vMotion as an example: if multiple hosts are issuing vMotions to a single destination host, each source may be limited in the amount of bandwidth it can send via NIOC, but the receiving host will still be peppered with a high volume of vMotion traffic, which may end up squeezing out traffic of other types.

While I won’t go deep into the design implications of this corner case here, it is something I wanted to bring up. Perhaps a future post on this?