Building on my previous post, which addressed some misconceptions about how NFS traffic is routed on vSphere, this article is a technical deep dive into same-subnet storage traffic. The information presented here focuses mainly on how NFS traffic behaves, but it also applies to iSCSI traffic when vmkernel port bindings are not created. If you are using iSCSI with vmkernel port bindings, these subnet rules do not apply to storage traffic.
To demonstrate the way that vSphere routes NFS traffic on a single subnet, I’ve created a lab using 2 NFS servers (NetApp Simulators) presenting 2 exports each, for a total of 4 exports. All traffic is on VLAN 5 (which is 10.0.5.0/24 in my lab) to an ESXi host running 5.0 update 1 (build 623860). The host has 2 uplinks along with 4 vmkernels, giving the NFS traffic a lot of options. In order to consistently create traffic, I have deployed 4 of the VMware IO analyzer appliances – one on each export. This allows me to quickly simulate VM traffic going to all of the exports at the same time.
The logical diagram is below. Note that vmk10 has the lowest IP (10.0.5.3) but the highest vmkernel number:
A look at the vDS showing the 4 vmkernels mapped to 2 uplinks. All vmkernels are in the VLAN 5 subnet. No other vmkernel is on VLAN 5; it is a completely isolated subnet.
Next, a picture of the NFS datastores. Note that each datastore is mapped on VLAN 5 to the pair of NetApp Simulators. Mapping was done using a simple PowerShell script.
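The PowerShell script itself is not reproduced in this post. As a rough stand-in, here is a minimal Python sketch that prints the equivalent `esxcli storage nfs add` commands for 2 servers with 2 exports each. The server IPs, export paths, and datastore names are hypothetical placeholders, not the values used in the lab.

```python
from itertools import product

# Hypothetical lab values: the post does not list the filer IPs or export paths.
NFS_SERVERS = ["10.0.5.21", "10.0.5.22"]
EXPORTS = ["/vol/export1", "/vol/export2"]


def nfs_mount_commands(servers, exports):
    """Build one 'esxcli storage nfs add' command per server/export pair."""
    commands = []
    for index, (server, export) in enumerate(product(servers, exports), start=1):
        name = f"nfs-ds{index}"  # hypothetical datastore naming scheme
        commands.append(
            f"esxcli storage nfs add --host={server} "
            f"--share={export} --volume-name={name}"
        )
    return commands


if __name__ == "__main__":
    # Print the four mount commands (2 servers x 2 exports).
    for command in nfs_mount_commands(NFS_SERVERS, EXPORTS):
        print(command)
```

Every target sits on the same 10.0.5.0/24 subnet as the vmkernels, which is the condition the tests below depend on.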
Lab Test #1 – NFS Traffic Simulation
For the first test, I simply crank up the IO Analyzers and watch ESXTOP. All analyzers were powered on at the same time and storage was presented just prior to loading the analyzers, giving the datastores an opportunity to choose any of the 4 vmkernels on VLAN 5.
The following screenshot shows the configuration of IO Analyzer for my 4 appliances. The IP addresses shown for each appliance are management IP addresses and have no relationship to the subnet used for NFS storage.
All NFS traffic chose vmk7, which is using vmnic6. The receive numbers in ESXTOP can be confusing because a virtual ESXi server requires a promiscuous port to operate, so many duplicate frames are received on the other uplinks. Regardless, vmk7 is clearly transmitting the read requests (12,711.04 in this screenshot).
Lab Test #1 – Conclusions
This test verifies a number of things.
- vmk7 is the lowest vmkernel number and first vmkernel available in the team (vmk8, vmk9, and vmk10 are the other members).
- vmk10 has the lowest IP (10.0.5.3) yet was not chosen, which debunks the idea that the IP address is relevant to uplink selection. The host doesn’t seem to care about IP address at all.
- NFS datastores do not appear to pick a vmkernel “randomly” when they are created; instead, they seem to pick the first vmkernel on the list. That is not necessarily the lowest number, although vmkernels are typically created in incrementing order (7, then 8, then 9, and so on).
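The selection behavior observed in this test can be modeled in a few lines of Python. This is a sketch of what the lab showed, not VMware's actual code. Only vmk10's address (10.0.5.3) comes from the lab; the other three addresses and both function names are hypothetical.

```python
from ipaddress import ip_address, ip_network

# Listed in the order the host presents them. Only vmk10's IP is from the lab;
# the other addresses are hypothetical placeholders.
VMKERNELS = [
    ("vmk7", "10.0.5.7"),
    ("vmk8", "10.0.5.8"),
    ("vmk9", "10.0.5.9"),
    ("vmk10", "10.0.5.3"),
]


def pick_vmkernel(vmkernels, target_ip, subnet="10.0.5.0/24"):
    """Return the first listed vmkernel that shares the target's subnet."""
    network = ip_network(subnet)
    if ip_address(target_ip) not in network:
        return None  # off-subnet targets are outside this model
    for name, address in vmkernels:
        if ip_address(address) in network:
            return name
    return None


def lowest_ip_vmkernel(vmkernels):
    """The choice you would expect if the lowest IP address won."""
    return min(vmkernels, key=lambda entry: ip_address(entry[1]))[0]


if __name__ == "__main__":
    print(pick_vmkernel(VMKERNELS, "10.0.5.21"))  # observed behavior: list order
    print(lowest_ip_vmkernel(VMKERNELS))          # the debunked alternative
```

The two functions disagree on purpose: list order selects vmk7, while a lowest-IP rule would have selected vmk10.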
Lab Test #2 – vmkernel Removal
For this test, I am going to run the same IO Analyzer simulation, but then remove the active vmkernel (vmk7), let traffic migrate to another vmkernel, and add vmk7 back into the team. This test will verify whether vmk7 is actively sought out, or whether a migration of the datastores to another vmkernel is permanent until another failure.
First, traffic is simulated and vmkernel 7 (vmk7) is removed.
The host moves traffic over to vmk8. Note below that vmk7 no longer exists.
I then added vmk7 back into the team.
Traffic remains on vmk8. I’ve highlighted vmk7 and vmk8 below.
I’m not going to fill this post with pictures, but I also removed vmk8 and saw traffic move to vmk9. Removing vmk9 moved traffic to vmk10, and then removing vmk10 put the traffic back on the original vmk7.
Lab Test #2 – Conclusions
From this test, a few other conclusions can be made:
- The next vmkernel port on the list is chosen in the event of vmkernel removal. This test reinforces the idea that the list order is how selection is determined, not the vmkernel number. Notice that vmk7 appears at the bottom of the list after being removed and added, but was not chosen again until all other vmkernels were removed.
- Adding a vmkernel with a lower number to the list does not seem to influence the next vmkernel choice for NFS traffic.
Additionally, I unplugged the uplink and saw that the vmkernel simply moved over to the other available uplink.
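To make the sequence from this test easier to follow, here is a toy Python model of the behavior observed: the active vmkernel is sticky, a removal moves traffic to the next entry in list order, and a re-added vmkernel lands at the bottom of the list. The class name and the wrap-around rule are assumptions for illustration; this is not how the host is actually implemented.

```python
class VmkernelList:
    """Toy model: an ordered vmkernel list with a sticky active choice."""

    def __init__(self, names):
        self.names = list(names)
        self.active = self.names[0] if self.names else None

    def remove(self, name):
        position = self.names.index(name)
        self.names.remove(name)
        if name == self.active:
            # Traffic moves to the next entry in list order. Wrapping to the
            # top of the list is an assumption; the lab never exercised it.
            if self.names:
                self.active = self.names[position % len(self.names)]
            else:
                self.active = None

    def add(self, name):
        # A re-added vmkernel goes to the bottom and does not reclaim traffic.
        self.names.append(name)
        if self.active is None:
            self.active = name


if __name__ == "__main__":
    team = VmkernelList(["vmk7", "vmk8", "vmk9", "vmk10"])
    team.remove("vmk7")
    team.add("vmk7")
    for victim in ("vmk8", "vmk9", "vmk10"):
        print(f"active before removing {victim}: {team.active}")
        team.remove(victim)
    print(f"active at the end: {team.active}")
```

Replaying the lab steps through this model ends with traffic back on vmk7 only after every other vmkernel has been removed, which matches what ESXTOP showed.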
I hope this clears up some confusion on same subnet vmkernel selection for NFS storage (and unbound iSCSI storage). The main takeaway here is that, in a default configuration, NFS storage traffic on a single subnet will not be load balanced across vmkernels.
If you’re using EtherChannel on a single subnet, the best you can hope for is a single vmkernel on the host and multiple IPs on the storage target, so that the switch’s Load Distribution policy can do an IP hash. However, you will need a different target for each uplink, each with a unique least significant bit (or bits, depending on uplink count). This means that if you have 2 uplinks, you’ll need 2 storage targets. Likewise, 4 uplinks will require 4 storage targets. Using multiple vmkernel ports on the same subnet will have no influence on your EtherChannel Load Distribution because the host will only pick a single vmkernel to route traffic!
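The least significant bit requirement is easier to see with a worked example. The Python sketch below uses a simplified IP hash (XOR of source and target address, modulo the uplink count). Real switches and hosts may hash differently, so treat this as a model of the idea rather than any vendor's algorithm. The vmkernel and target addresses are hypothetical.

```python
from ipaddress import ip_address


def ip_hash_uplink(source_ip, target_ip, uplink_count):
    """Simplified IP hash: XOR both addresses, then modulo the uplink count."""
    xor = int(ip_address(source_ip)) ^ int(ip_address(target_ip))
    return xor % uplink_count


if __name__ == "__main__":
    vmk_ip = "10.0.5.7"  # hypothetical address of the single vmkernel

    # Two targets that differ in the least significant bit use both uplinks.
    for target in ("10.0.5.20", "10.0.5.21"):
        print(target, "-> uplink", ip_hash_uplink(vmk_ip, target, 2))

    # Two targets with the same least significant bit share one uplink.
    for target in ("10.0.5.20", "10.0.5.22"):
        print(target, "-> uplink", ip_hash_uplink(vmk_ip, target, 2))
```

With 4 uplinks the same model needs 4 targets whose two lowest bits are all different, for example addresses ending in .20 through .23.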