NFS is my favorite way to attach storage to a vSphere host, but it is also one of the more annoying protocols to design for when shooting for high availability and load balancing. It also seems to be a very misunderstood protocol, with a number of common “urban legend” misconceptions floating around. Unlike Fibre Channel and iSCSI, which both have ways to “understand” the various paths to a device, NFS is a bit more, well … dumb. Or rather, I should say NFS version 3 is “dumb,” as a lot of the wish list features, such as session trunking (multipathing, which arrives with the 4.1 minor version), are included in the NFS version 4 specification.
Mini Rant: So, why aren’t we using NFS v4? Good question – VMware doesn’t support it. It’s not an option for implementation. So, even if your storage array has the option for NFS v4 (such as NetApp), your vSphere hosts won’t take advantage of the goodness. NFS v4 isn’t really the point of this post, but it’s good to know that the folks over at the IETF have been working hard to take the protocol further.
In this post, I’m going to expand a bit on the thoughts I had back when I wrote my original “A Look At NFS on VMware” post, with some additional musings based on misconceptions that I have seen repeated regarding VMkernel selection, uplink selection, and load balancing.
If you’re looking to do high availability and load balancing of virtual machine networking, I suggest heading over to this excellent post entitled “Etherchannel and IP Hash or Load Based Teaming?” written by Michael Webster.
How vSphere Routes NFS Traffic
Presenting NFS storage to vSphere is a subject area that is riddled with false information. Based on my responses to various posts on the VMware Technology Network (VMTN), there are a few common misconceptions that I see repeated.
Misconception: “NFS Traffic can be bound to a VMkernel just like iSCSI” or “NFS Traffic requires a VMkernel with Management Traffic enabled”
False! With iSCSI, you have the ability to bind both a VMkernel port and an uplink to the vmhba acting as the iSCSI Software Adapter. This effectively tells the vSphere host to use a specific set of VMkernel adapters for iSCSI traffic. There is no method to do this with NFS.
Your choices are vMotion, FT, management, or iSCSI binding. Nothing for NFS.
The result of this is that NFS traffic will traverse out of an uplink based on the following criteria:
- The host attempts to find a VMkernel port that is on the same subnet as the NFS server. If multiple exist, it uses the first one found.
- If no VMkernel ports are on the same subnet, NFS traffic traverses the management VMkernel by way of the default gateway.
Typically, the second case (routing NFS traffic through the management VMkernel and the default gateway) is not desired and should be avoided.
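The selection order above can be sketched in a few lines of Python. This is only a model of the behavior described in this post, not anything pulled from ESXi itself; the `VMkernelPort` class, the `pick_vmkernel` function, and the example port names and addresses are all made up for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_interface
from typing import List


@dataclass
class VMkernelPort:
    name: str                 # hypothetical name, e.g. "vmk1"
    interface: str            # IP address with prefix, e.g. "10.0.20.11/24"
    management: bool = False  # True for the management VMkernel port


def pick_vmkernel(ports: List[VMkernelPort], nfs_server: str) -> VMkernelPort:
    """Model of the selection order described above (an assumption, not ESXi code)."""
    target = ip_address(nfs_server)

    # Rule 1: the first VMkernel port on the same subnet as the NFS server wins.
    for port in ports:
        if target in ip_interface(port.interface).network:
            return port

    # Rule 2: otherwise traffic is routed out of the management VMkernel port
    # toward the default gateway.
    for port in ports:
        if port.management:
            return port

    raise LookupError("no management VMkernel port defined")


if __name__ == "__main__":
    ports = [
        VMkernelPort("vmk0", "192.168.1.10/24", management=True),
        VMkernelPort("vmk1", "10.0.20.11/24"),
        VMkernelPort("vmk2", "10.0.20.12/24"),
    ]
    print(pick_vmkernel(ports, "10.0.20.50").name)  # same subnet: first match
    print(pick_vmkernel(ports, "172.16.5.5").name)  # no match: management port
```

Note that a second VMkernel port on the same subnet (vmk2 in this made-up example) never gets picked, which is exactly why simply adding more VMkernel ports to a subnet does nothing for NFS.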
Misconception: “NFS load balances across all uplinks in a team”
Absolutely not. This requires a special configuration. Once an appropriate VMkernel port has been found (see the previous section), the next step for the vSphere host is to determine which uplink to use. Regardless of the vSwitch type, a vSphere host using a default portgroup configuration (route based on virtual port ID) with multiple active uplinks in a team will only use a single uplink to an NFS server’s mount point. Teaming uplinks together in this way only offers high availability, not load balancing.
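To see why the default policy pins NFS traffic to one uplink, here is a deliberately simplified model. I am assuming the uplink is chosen from the virtual port ID alone (a modulo over the active uplinks); the real placement logic inside the vSwitch may differ, and `uplink_for_port` plus the vmnic names are hypothetical. The point is only that the destination never enters the decision.

```python
from typing import List


def uplink_for_port(virtual_port_id: int, active_uplinks: List[str]) -> str:
    """Simplified model of 'route based on virtual port ID' (an assumption).

    The choice depends only on the virtual port the VMkernel adapter is
    plugged into, so every NFS server reached through that adapter shares
    the same uplink.
    """
    if not active_uplinks:
        raise ValueError("no active uplinks in the team")
    return active_uplinks[virtual_port_id % len(active_uplinks)]


if __name__ == "__main__":
    team = ["vmnic2", "vmnic3"]
    # One VMkernel port means one virtual port ID and therefore one uplink,
    # no matter how many NFS exports are mounted through it.
    print(uplink_for_port(7, team))
    # If that uplink fails, the port moves to a survivor: availability only.
    print(uplink_for_port(7, ["vmnic2"]))
```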
In a 10 gigabit environment, load balancing is usually not going to be that big of a deal. Realistically, for most environments even 1 gigabit is plenty.
But if you are looking to load balance, you have a couple of options.
Load Balancing NFS Traffic
Misconception: “Just turn on EtherChannel and NFS will load balance”
Yes and no. It’s a bit more complex than that. In my mind, there are only two viable ways to attempt load balancing NFS traffic. The decision boils down to your network switching infrastructure, the skill level of the person doing the networking, and your knowledge of the traffic patterns of virtual machines.
- Using an EtherChannel, IP Hash teaming policy, and VMkernel IP(s) with unique least significant bit(s). If using a single VMkernel port, you need storage that can be assigned multiple IPs or virtual interfaces (VIFs).
- Using a vSphere Distributed Switch (requires Enterprise Plus) with Load Based Teaming enabled along with multiple VLANs (subnets) and storage that can be assigned multiple IPs or virtual interfaces (VIFs) to cover each subnet.
The least significant bit(s) portion of the first option is extremely important, as ignoring it completely eliminates any possibility of load balancing. Going deep on configuring these methods is out of scope for this post, although I may follow up with the steps required to configure each in the future.
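A quick way to see why the least significant bit(s) matter is to model the IP Hash calculation. The sketch below assumes the commonly described form of the hash: XOR the source and destination IPv4 addresses, then take the result modulo the number of uplinks. The function name `ip_hash_uplink` and the addresses are invented for this example, so treat it as a planning aid rather than a statement of what the vSwitch does internally.

```python
from ipaddress import IPv4Address


def ip_hash_uplink(src: str, dst: str, uplink_count: int) -> int:
    """Return the uplink index under an assumed XOR-then-modulo IP hash."""
    if uplink_count < 1:
        raise ValueError("uplink_count must be at least 1")
    return (int(IPv4Address(src)) ^ int(IPv4Address(dst))) % uplink_count


if __name__ == "__main__":
    vmk = "10.0.20.11"  # hypothetical VMkernel IP
    # Hypothetical storage VIFs; .50 and .51 differ in the last bit, .52 does not
    # differ from .50 in the last bit.
    for vif in ("10.0.20.50", "10.0.20.51", "10.0.20.52"):
        print(vif, "-> uplink", ip_hash_uplink(vmk, vif, 2))
```

With two uplinks, only the last bit of the XOR result survives the modulo, so under this model the VIFs ending in .50 and .52 land on the same uplink while .51 lands on the other. Pick the storage IPs carelessly and everything hashes to a single link.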
Hopefully this clears up a handful of common misconceptions. I look forward to when we don’t have to jump through these hoops to mount NFS exports that are properly load balanced. 🙂