NFS is my favorite way to attach storage to a vSphere host, but it’s also one of the more annoying protocols to design for when shooting for high availability and load balancing. It also seems to be a very misunderstood protocol, with a number of “urban legend” style misconceptions floating around. Unlike Fibre Channel and iSCSI, which both have ways to “understand” the various paths to a device, NFS is a bit more, well … dumb. Or rather, I should say NFS version 3 is “dumb,” as a lot of the wish-list features, such as session trunking (multipathing), are included in the NFS version 4 specification.
Mini Rant: So, why aren’t we using NFS v4? Good question – VMware doesn’t support it. It’s not an option for implementation. So, even if your storage array has the option for NFS v4 (such as NetApp), your vSphere hosts won’t take advantage of the goodness. NFS v4 isn’t really the point of this post, but it’s good to know that the folks over at the IETF have been working hard to take the protocol further.
In this post, I’m going to expand a bit on the thoughts from my original “A Look At NFS on VMware” post with some additional musings on misconceptions I have seen repeated regarding VMkernel selection, uplink selection, and load balancing.
If you’re looking to do high availability and load balancing of virtual machine networking, I suggest heading over to this excellent post entitled “Etherchannel and IP Hash or Load Based Teaming?” written by Michael Webster.
How vSphere Routes NFS Traffic
Presenting NFS storage to vSphere is a subject area that is riddled with false information. Based on my responses to various posts on the VMware Technology Network (VMTN), there are a few common misconceptions that I see repeated.

VMkernel Selection
Misconception: “NFS Traffic can be bound to a VMkernel just like iSCSI” or “NFS Traffic requires a VMkernel with Management Traffic enabled”
False! With iSCSI, you have the ability to bind both a VMkernel port and an uplink to the vmhba acting as the iSCSI Software Adapter. This effectively encourages the vSphere host to use a specific set of VMkernel adapters for iSCSI traffic. There is no method to do this with NFS.
Your choices are vMotion, FT, management, or iSCSI port binding. Nothing for NFS.
The result of this is that NFS traffic will traverse out of an uplink based on the following criteria:
- The host attempts to find a VMkernel port that is on the same subnet as the NFS server. If multiple exist, it uses the first one found.
- If no VMkernel ports are on the same subnet, NFS traffic traverses the management VMkernel by way of the default gateway.
Typically, the second scenario (falling back to the management VMkernel) is not desired and should be avoided.
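To make that lookup order concrete, here’s a minimal Python sketch of the selection logic described above. The VMkernel names and IP addresses are hypothetical, and the real decision is made inside the ESXi networking stack; the function simply mirrors the two rules.

```python
import ipaddress

def select_vmkernel(vmkernels, nfs_server_ip):
    """Pick the VMkernel port a host would use to reach an NFS server.

    vmkernels: ordered list of dicts like {"name": "vmk1", "ip": "10.0.10.5/24"}.
    Mirrors the two rules above: the first vmk on the NFS server's subnet
    wins; otherwise traffic falls back to the management vmk and rides
    the default gateway.
    """
    target = ipaddress.ip_address(nfs_server_ip)
    for vmk in vmkernels:
        if target in ipaddress.ip_interface(vmk["ip"]).network:
            return vmk["name"]  # first same-subnet match wins
    return "vmk0 (management, via default gateway)"  # the undesirable fallback

# Hypothetical host with a management vmk and one storage vmk
vmks = [{"name": "vmk0", "ip": "192.168.1.10/24"},
        {"name": "vmk1", "ip": "10.0.10.5/24"}]
print(select_vmkernel(vmks, "10.0.10.50"))  # -> vmk1 (same subnet)
print(select_vmkernel(vmks, "10.0.20.50"))  # -> management fallback
```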
Uplink Selection
Misconception: “NFS load balances across all uplinks in a team”
Absolutely not; this requires special configuration. Once an appropriate VMkernel port has been found (see the section above), the next step for the vSphere host is to determine which uplink to use. Regardless of the vSwitch type, a vSphere host using the default portgroup configuration (Route based on originating virtual port ID) with multiple active uplinks in a team will only use a single uplink to an NFS server’s mount point. Teaming uplinks together only offers high availability, not load balancing.
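To illustrate why teaming alone doesn’t help, here’s a tiny Python sketch of the “Route based on originating virtual port ID” behavior. The modulo is only a stand-in for ESXi’s internal port-to-uplink assignment; the point is that the mapping is static per port.

```python
def uplink_for_port(virtual_port_id, uplinks):
    """Stand-in for 'Route based on originating virtual port ID': each
    virtual port (including a VMkernel port) is pinned to exactly one
    active uplink. The modulo here only illustrates that the mapping is
    static per port; ESXi's real assignment logic is internal."""
    return uplinks[virtual_port_id % len(uplinks)]

uplinks = ["vmnic0", "vmnic1"]
# One NFS vmkernel port = one virtual port ID = one uplink, no matter
# how many uplinks or datastores sit behind it.
print(uplink_for_port(7, uplinks))  # the same vmnic every time for port 7
```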
In a 10 gigabit environment, load balancing is usually not going to be that big of a deal. Realistically, for most environments even 1 gigabit is plenty.
But if you are looking to load balance, you have a couple of options.
Load Balancing NFS Traffic

Misconception: “Just turn on EtherChannel and NFS will load balance”
Yes and no; it’s a bit more complex than that. In my mind, there are only two viable ways to attempt to load balance NFS traffic. The decision boils down to your network switching infrastructure, the skill level of the person doing the networking, and your knowledge of the traffic patterns of your virtual machines.
- Using an EtherChannel, IP Hash teaming policy, and vmkernel IP(s) with unique least significant bit(s). If using a single vmkernel, you need storage that can be assigned multiple IPs or virtual interfaces (VIFs).
- Using a vSphere Distributed Switch (requires Enterprise Plus) with Load Based Teaming enabled along with multiple VLANs (subnets) and storage that can be assigned multiple IPs or virtual interfaces (VIFs) to cover each subnet.
The least significant bit(s) portion of option #1 is extremely important, as ignoring it completely eliminates any possibility of load balancing. Going deep on configuring these methods is out of scope for this post, although I may follow up with the steps required to configure each in the future.
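If you want to sanity check whether your chosen IPs will actually spread across the uplinks, here is a rough Python sketch of the XOR math. This is a simplification of what ESXi does with the src-dst-ip policy, not its actual code, and the addresses are made up.

```python
def ip_hash_uplink(src_ip, dst_ip, uplink_count):
    """Simplified src-dst-ip (IP hash) uplink choice: XOR the last octets
    and take the modulo over the number of uplinks. ESXi hashes the full
    addresses, but with a small team the least significant bits decide
    the outcome, which is why unique LSBs matter."""
    src_lsb = int(src_ip.split(".")[-1])
    dst_lsb = int(dst_ip.split(".")[-1])
    return (src_lsb ^ dst_lsb) % uplink_count

# One vmkernel (10.0.10.10) talking to two storage VIFs over a 2-uplink team
print(ip_hash_uplink("10.0.10.10", "10.0.10.20", 2))  # 10 ^ 20 = 30 -> uplink 0
print(ip_hash_uplink("10.0.10.10", "10.0.10.21", 2))  # 10 ^ 21 = 31 -> uplink 1
```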
Thoughts
Hopefully this clears up a handful of common misconceptions. I look forward to when we don’t have to jump through these hoops to mount NFS exports that are properly load balanced. 🙂
NFS on vSphere – Deep Dive Series
The entire series of NFS on vSphere deep dives:
- A Few Misconceptions (this post)
- Technical Deep Dive on Same Subnet Storage Traffic – http://wahlnetwork.com/2012/04/23/nfs-on-vsphere-technical-deep-dive-on-same-subnet-storage-traffic/
- Technical Deep Dive on Multiple Subnet Storage Traffic – http://wahlnetwork.com/2012/04/27/nfs-on-vsphere-technical-deep-dive-on-multiple-subnet-storage-traffic/
- Technical Deep Dive on Load Based Teaming
Great information, Chris, I’m a huge fan of NFS.
Just one thing: when you say “will only use a single uplink to an NFS server”, I was under the impression that this should maybe be “will only use a single uplink to an NFS mount point (datastore)”.
Each datastore uses a separate vmkernel connection, so if you have multiple datastores they may load balance across multiple uplinks in a vmkernel port group, depending on how it selects which uplink to use. Unfortunately you don’t have control over which uplink each datastore’s traffic goes over, but at least you may have some semblance of using multiple links with multiple datastores.
Interesting, and I agree that each datastore has the capability of using a different vmkernel. I used the phrase “NFS server” as that’s the entry field for the target when creating a new datastore (trying to not confuse folks who are walking through the GUI), but I think your description of mount point / datastore is a clearer way to express this … Thanks! 🙂
Unfortunately, the behavior you describe is not what I’m seeing in the lab – I’ll have to try to reproduce that. In a scenario where I had 2 vmkernels, both on the same subnet, my lab consistently showed traffic across the vmkernel that appeared first in the routing list. The second vmkernel received absolutely no traffic. Perhaps my sample was not large enough (4 NFS datastores).
Good article Chris. Where are your NFS datastores mounted, all off the same IP or spread across NFS servers? I’ve always assumed Julian’s definition is actually working as described – now I’m going to have to check my setups to see how well the traffic is really balancing.
I only have a single NFS server available by one IP in each subnet. For the “same subnet test” I created 2 vmkernels in VLAN 10, mounted 4 exports to the single target IP also in VLAN 10, and saw all traffic going through 1 vmkernel. I like your idea – I’ll try making 2 IPs on the NFS server and mounting to the different IPs and see if things change. Since the default teaming style is “Route by virtual port ID” I didn’t see a reason why a varied target IP would influence routing, but it’s worth a shot! Looks like I have some follow up work this weekend 🙂
Of much greater concern to me is not the load balancing of the network but load balancing of the array itself. For example, in the NetApp world, at least with “7-Mode,” you either have to run active/passive or manually load balance your volumes between the controllers. It seems 8.1 Cluster-Mode just came out, and I left a metric ton of questions on a NetApp blog a few minutes ago asking more about it, versus a true active-active cluster where the data is available from all controllers and all ports without any sort of LUN trespassing or whatever.
As you say – frequently the network is not the bottleneck, especially in this day and age with 10GbE being so affordable. Most VM workloads (I’d say it’s safe to say the vast majority) are random I/O, and you’re going to blow out your controllers or disks long before you blow out a fast network.
NetApp is definitely a different beast entirely, and almost all of my NFS experience has been working with NetApp FAS systems (3140, 3270, 6210). I look forward to the whitepaper coming out on NetApp’s Cluster-Mode with vSphere to see what recommendations and design practices they suggest.
As far as headroom goes, with svMotion I don’t think having an active/passive system is a make-or-break issue, as you can shift workloads to balance them a little better across the filers. Not saying that isn’t annoying at times, but it’s not a show stopper. SDRS could be your co-pilot on that 🙂
I’ve never been too fussed about load balancing the datastores across the NetApp nodes, as you need to do this from a controller performance and capacity point of view anyway, regardless of the NFS constraints imposed by vSphere. Like you, I’m curious to see what Cluster-Mode brings to the table, although at first glance it’s going to require some serious network infrastructure to be effective.
Ed.
ps. I think I’m stalking you across the internet – I’ve just been reading your comments on @that1guynick’s blogpost!
I’ve been looking at this issue recently as we’re still on gigabit networks (not 10GbE), and the single connection may be a bottleneck for some of our bigger Oracle databases. This is a good summary – bookmarked!
@Nate – funny, my previous stop was @that1guynick’s blogpost about ONTAP 8.1 and your comment. I’m stalking you around the interweb!
Quick question on the use of EtherChannels: if I have four 1GbE uplinks behind a VMkernel port in an EtherChannel to a Cisco switch with IP hash load balancing on, are you saying that the NFS traffic won’t get an effective 4Gbps of bandwidth to the storage? I’m a bit confused as to the need for multiple VLANs/IPs from your teaming point above.
Adam – my description in the post was a crappy copy/paste job from the Load Based Teaming segment; I’ve updated it to clarify. Thanks for catching it. I apparently need more coffee. 🙂
With EtherChannel, there simply need to be multiple IPs on either side (or both sides) for load balancing to occur. A single vmkernel to a single storage IP will not load balance. You can choose to make multiple vmkernels or have multiple storage interface IPs (or both), but you need to make sure their LSBs are unique. I would also suggest doing the XOR calculation to verify the uplinks are being balanced.
Chris, are you saying that using an EtherChannel in VMware *does not* aggregate the links/bandwidth? Is this only the case for VMkernel ports?
EtherChannel uses a load distribution policy to determine which uplink will pass traffic; it doesn’t just spray the traffic across all the links. VMware specifically requires use of the src-dst-ip policy, which examines the LSB(s) of the IP addresses to pick an uplink.
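If you want to run that XOR calculation across a whole IP plan rather than one pair at a time, here’s a quick back-of-the-napkin Python sketch. It uses the same simplified last-octet math as the example earlier in the post, and the addresses are made up.

```python
from collections import Counter

def check_balance(vmk_ips, storage_ips, uplink_count):
    """Tally which uplink each vmkernel/storage IP pair would hash to,
    using the same simplified last-octet XOR as the example in the post.
    Useful for spotting IP plans where every pair lands on one link."""
    tally = Counter()
    for src in vmk_ips:
        for dst in storage_ips:
            lsb = int(src.split(".")[-1]) ^ int(dst.split(".")[-1])
            tally["uplink%d" % (lsb % uplink_count)] += 1
    return dict(tally)

# Two storage VIFs whose last octets differ in the lowest bit
print(check_balance(["10.0.10.10"], ["10.0.10.20", "10.0.10.21"], 2))
# {'uplink0': 1, 'uplink1': 1} -> both links carry traffic
```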
Really good post. There are two types of protocols:
1) Those that require considerable architectural thought up front, but fewer day-to-day management tasks once the architecture is in place, making them easier/simpler to manage.
2) The reverse of #1.
NFS falls into the #1 category; FC/FCoE and even iSCSI fall into #2.
I seem to remember reading somewhere that vmkernel load balancing doesn’t take “Route by virtual port ID” into account. That policy only load balances VM traffic; vmkernel does its own thing. Can’t seem to find the original article but I’ll keep digging…
I just finished a load of testing on vmkernel selection for NFS datastores. Once I compile all the data and get screenshots of the lab, I’ll get a post up. The short version is that the same vmkernel was chosen in all scenarios and in each test.
Thanks for the post.
This is the first spot I could find to tell me about VMkernel Selection, and what NFS would use (I’m very new to VMware and from a networking background).
After reading your post here I can now confidently create my UCS server profiles.
” If no VMkernel ports are on the same subnet, NFS traffic traverses the management VMkernel by way of the default gateway.
Typically, item #2 is not desired and should be avoided.”
Could you shed some light on this? Why does it need to be on the same subnet? I am curious because my VMkernel is not on the same subnet as the NFS datastore.
I advise reading the NFS on vSphere – Technical Deep Dive on Same Subnet Storage Traffic post, which contains details on this.
http://wahlnetwork.com/2012/04/23/nfs-on-vsphere-technical-deep-dive-on-same-subnet-storage-traffic/
I stumbled on your post and found it to be a real godsend. I need some advice here, since NFS is new to me. We have 6 NICs, and I want to dedicate 2 to NFS storage, 2 to VM traffic, and 2 to management/vMotion. I have read that we need to create VMkernel port groups for NFS and have the IPs in the same subnet as the storage device or NFS datastore? My switches can’t do EtherChannel.
In order to use both uplinks without EtherChannel you’ll need two subnets. Make sure to create two vmkernel ports, one on each subnet, and storage capable of having virtual interfaces on the two subnets.
Refer to my “Pinning Traffic” section here – http://wahlnetwork.com/2012/04/27/nfs-on-vsphere-technical-deep-dive-on-multiple-subnet-storage-traffic/
Great post Chris. Do you recommend, or is it a best practice, to use a dvSwitch for isolated NFS storage traffic with the LBT policy instead of IP hash with EtherChannel on a dvSwitch? Also, does LBT on a dvSwitch support EtherChannel?
I have always used a standard vSwitch for NFS storage traffic with the IP hash policy (EtherChannel) and would like to move this to a dvSwitch for better network throughput, as it offers load balancing technologies and bandwidth control.
I tend to favor LBT over EtherChannel as it is aware of the actual load on the uplink. Both technologies require one-to-many IP relationships (such as one vmkernel to multiple VIFs), but LBT can also work with multiple subnets over the same uplinks (such as a portgroup for VLAN A and another for VLAN B).
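For what it’s worth, here’s a rough Python sketch of the trigger LBT uses to decide when to move a port. The roughly 75% utilization over a 30-second window is the commonly documented behavior; the function and sample numbers are just illustrative, and the real rebalancing logic lives inside the distributed switch.

```python
def lbt_would_rebalance(utilization_samples, threshold=0.75):
    """Rough sketch of the Load Based Teaming trigger: when an uplink's
    mean utilization over the sampling window (documented as roughly 75%
    over 30 seconds) stays high, the distributed switch moves one or more
    virtual ports to a less busy uplink. Sample data is hypothetical."""
    return sum(utilization_samples) / len(utilization_samples) > threshold

# 30 one-second utilization samples of a saturated uplink
busy_uplink = [0.90] * 30
print(lbt_would_rebalance(busy_uplink))  # True -> a port gets remapped
```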
Hi, I wrote an algorithm which should simulate ONTAP behavior with LACP load balancing and IP hashing.
Hope this will be helpful for you
https://github.com/qdrddr/ontap-lacp
ONTAP 8 7-Mode and cDOT, as well as ONTAP 9, use the same load balancing algorithm, called SuperFastHash.