
NFS on vSphere Part 4 – Technical Deep Dive on Load Based Teaming

by Chris Wahl on Apr 30th, 2012 | 5,264 views

In my past three posts, I went into some misconceptions on how NFS behaves on vSphere, along with a pair of deep dives on load balancing in both single subnet and multiple subnet environments. If you’re just catching up on this series and are unfamiliar with how NFS works on vSphere, I recommend giving these articles a glance. The summation is that NFS requires multiple subnets in order to use multiple uplinks, with the exception of a situation where an EtherChannel is properly utilized on a single subnet. However, even with a static EtherChannel, multiple storage targets and planning for unique least significant bits are still required to actually utilize more than one uplink.

However, there is another option available to those with Enterprise Plus licensing: the “Route by physical NIC load” load balancing policy, otherwise known as load based teaming (LBT). This policy is only available on distributed switches (vDS) which cannot be created if you are not licensed for Enterprise Plus.

Load based teaming is a very powerful technology that monitors vmnics (uplinks) for saturation. When a vmnic reaches 75% utilization for 30 seconds, LBT tries to move workloads to other, non-saturated vmnics. It is my opinion that this was mostly created with the mindset of balancing VM traffic, but it also works well for vmknics carrying NFS traffic. In this post, I’ll go over how this process works, how to configure it, and a lab test.
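To make the trigger concrete, here is a minimal Python sketch of the saturation check. It is only an illustration, not VMware's implementation: the function name, the one-sample-per-second model, and the choice to average the window are all assumptions on my part.

```python
from collections import deque

THRESHOLD = 0.75      # 75% utilization, per the description above
WINDOW_SECONDS = 30   # evaluation window, per the description above


def is_saturated(samples, threshold=THRESHOLD, window=WINDOW_SECONDS):
    """Return True if the mean of the last `window` samples exceeds `threshold`.

    `samples` is an iterable of utilization ratios (0.0 to 1.0).
    Assumption: one sample per second. With fewer than `window` samples
    there is not enough history, so nothing is triggered.
    """
    recent = deque(samples, maxlen=window)  # keeps only the newest `window` samples
    if len(recent) < window:
        return False
    return sum(recent) / len(recent) > threshold


# A short burst does not count; a sustained 30 seconds does.
print(is_saturated([0.9] * 10))   # False
print(is_saturated([0.9] * 30))   # True
```

The point of the window is that a brief spike on an uplink should not cause workloads to bounce between vmnics.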

Load Based Teaming

The concept of monitoring load and migrating traffic around is nothing new to the world of vSphere. VMware admins are constantly leveraging the ability to vMotion workloads around for maintenance and balance, along with tools such as the Distributed Resource Scheduler (DRS) to assist in automated workload distribution.

Some interesting things about load based teaming and how it works:

  • The power of load based teaming exists outside of the portgroup construct. Meaning, you don’t need all of your VMs or vmkernels to exist in a single portgroup to take advantage of load based teaming.
  • As long as “Route based on physical NIC load” is selected, any portgroup will proactively monitor the vmnic utilization in its team and shift workloads around, even if another portgroup is responsible for generating the load.
  • Ultimately, vmnic utilization triggers moving workloads.
  • Turning on LBT is non-invasive and does not impact the active workloads.
  • Only active vmnics are considered for movement. Any standby or unused vmnics are not targeted as destinations.
  • Saturated 100 Mbps links do not trigger LBT movement, and I tested this in the lab to confirm – though, is anyone seriously using 100 Mbps links on their vSphere host? :)
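A few of the points above can be sketched in Python. This is a toy model under stated assumptions, not how the hypervisor does it: the data layout, the function names, the portgroup names, and the standby uplink vmnic1 are all hypothetical. It only shows that the ports on a busy uplink are considered regardless of portgroup, and that only active uplinks are valid destinations.

```python
def eligible_destinations(uplinks, saturated):
    """Return active uplinks other than the saturated one, least loaded first."""
    return sorted(
        (name for name, info in uplinks.items()
         if name != saturated and info["state"] == "active"),
        key=lambda name: uplinks[name]["utilization"],
    )


def ports_on_uplink(ports, uplink):
    """Return every port on `uplink`; the portgroup is deliberately ignored."""
    return [p["name"] for p in ports if p["uplink"] == uplink]


# Hypothetical team: two active uplinks and one standby uplink.
uplinks = {
    "vmnic0": {"state": "active", "utilization": 0.10},
    "vmnic1": {"state": "standby", "utilization": 0.00},
    "vmnic3": {"state": "active", "utilization": 0.95},
}
# Hypothetical ports spread across two portgroups.
ports = [
    {"name": "vmk0", "portgroup": "Management", "uplink": "vmnic3"},
    {"name": "vmk1", "portgroup": "NFS", "uplink": "vmnic3"},
    {"name": "vmk2", "portgroup": "NFS", "uplink": "vmnic0"},
]

print(eligible_destinations(uplinks, "vmnic3"))  # ['vmnic0']
print(ports_on_uplink(ports, "vmnic3"))          # ['vmk0', 'vmk1']
```

Which of the candidate ports actually gets moved is up to the hypervisor, so the sketch stops at listing candidates and destinations.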

With that said, let’s cover the configuration of this lab environment to showcase the power of LBT.

Lab Configuration

This time around I’ve reconfigured the lab entirely. The NetApp simulators are incredibly sluggish to configure and test against, so I have switched over to Nexenta’s Community Edition on a virtual machine.

Below is the logical configuration. I’ve created a lab using a single NAS server (Nexenta CE) presenting 4 exports. All traffic is on VLAN 1, 2, 3, and 4 (which is 10.0.X.0/24 in my lab, where the VLAN number equals the third octet) to an ESXi host running 5.0 update 1 (build 623860). The host has 2 uplinks along with 4 vmkernels. In order to consistently create traffic, I have deployed 4 of the VMware IO analyzer appliances – one on each export. This allows me to quickly simulate VM traffic going to all of the exports at the same time.

Lab Screenshots

Rather than using a virtual host, I have rebuilt the lab network to work on my “production” hosts and switches. This makes it much easier to generate enough traffic to trigger an LBT movement and eliminates the massive amount of duplicate frames received (as seen with virtual hosts on a promiscuous portgroup).

Additionally, the storage has been presented by the Nexenta over NFS. Under the hood is a pair of SSD drives, giving me plenty of IO for this test and the ability to simply mount the same datastore repeatedly using different VLANs.

Lab Test – Triggering Load Based Teaming

Let’s first take a look at the environment and identify the relationships between vmkernels and vmnics (uplinks). vmk1 and vmk4 have been put on vmnic3, while vmk2 and vmk3 are using vmnic0. This was decided by the hypervisor; I had no input in the matter. :)

Also, note that vmk0 (my management vmkernel) is using vmnic3 and is in an entirely different portgroup. I enabled LBT for that portgroup as well, to prove that LBT doesn’t care about portgroups as a delimiting factor.

Let’s see if we can generate a lot of traffic on vmnic3 and get the other guys to use vmnic0. I’ll fire up the IO Analyzer that is sitting on VLAN1 (vmk1) and see if we can get LBT to shuffle things around. Below is a screenshot showing the results, along with a zoomed image of the ESXTOP data.

The IO Analyzer saturated all of vmnic3, so LBT moved all other vmkernels over to vmnic0, even the management vmk0 on an entirely different portgroup. As you might imagine, this is a very powerful method for load balancing.

For the sake of fun, I’ll generate another big spike of load on the 3 vmkernels sitting on vmnic0 and watch LBT balance them. Below you can see vmk2, vmk3, and vmk4 kick off a large read spike that saturates all of vmnic0.

After trending the traffic for 30 seconds, LBT kicks in and migrates vmk2 and vmk3 to vmnic3. It’s somewhat difficult to balance 3 workloads that are going full speed on 2 uplinks, but LBT does a good job at trying.

Thoughts

It seems that load based teaming is a great way to address dynamic shifts in workload, and is relatively easy to set up. If you’re using Enterprise Plus licensing and are comfortable with distributed switches, this is probably the best way to go. Keep in mind, however, that you will need to oversubscribe your vmnics (uplinks) with a higher ratio of vmkernels. Otherwise, LBT will have nothing to balance. For example, if you had 2 vmkernels for 2 vmnics, each vmkernel has a dedicated uplink – there’s nothing it can move around.
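The oversubscription point boils down to a simple comparison, sketched below in Python. The function name is hypothetical, and the rule is just the rule of thumb from the paragraph above, not something LBT itself checks.

```python
def lbt_has_room_to_balance(vmkernel_count, active_uplink_count):
    """Rule of thumb from the text: LBT needs more vmkernels than uplinks.

    With one vmkernel per uplink, each vmkernel effectively has a
    dedicated uplink and there is nothing left to move around.
    """
    return active_uplink_count > 1 and vmkernel_count > active_uplink_count


print(lbt_has_room_to_balance(2, 2))  # False: one vmkernel per uplink
print(lbt_has_room_to_balance(4, 2))  # True: the 4 storage vmkernels in this lab
```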

I hope you’ve gained some valuable insight into the world of NFS on vSphere through my deep dive series, and no longer feel that the protocol is only suitable for ISO storage. ;)

Also, if you want the official VMware white paper on “vSphere on NFS” by Cormac Hogan, it was released in February of 2013.

NFS on vSphere – Deep Dive Series

The entire series of NFS on vSphere deep dives:

  1. NFS on vSphere – A Few Misconceptions
  2. NFS on vSphere – Technical Deep Dive on Same Subnet Storage Traffic
  3. NFS on vSphere – Technical Deep Dive on Multiple Subnet Storage Traffic
  4. NFS on vSphere – Technical Deep Dive on Load Based Teaming
18 Comments
  1. drechsau permalink - Apr 30th, 2012

    Wow, that was great, thank you!

  2. Adam B permalink - Apr 30th, 2012

    What changes when you use the 1000v DVS? I know LACP is supported on the 1000v, but does it allow us to aggregate bandwidth without having to create multiple IPs/Subnets/VKernels?

    • Chris permalink - May 1st, 2012

      Nothing really changes with the Nexus 1000V when compared to a static EtherChannel from a design perspective. LACP is simply a dynamically formed EtherChannel and will accomplish the same load distribution as static. The 1000V does allow for different distribution policies (instead of just src-dst-ip). You will still need unique least significant bits on multiple storage targets.

  3. nOon permalink - May 1st, 2012

    It’s nice to see another nfs fan for vmware infrastructure.
    I ran nearly the same test 2 years ago and came to the same conclusion: the only way to load balance was to use multiple networks and multiple mount points.
    It’s a shame, because one of the advantages of NFS is having fewer datastores in our infrastructure.
    I’m just waiting for a pNFS implementation on VMware.

  4. Marcos permalink - May 25th, 2012

    Regarding “Turning on LBT is non-invasive and does not impact the active workloads”, I have to disagree.
    I have NetApp storage as well, with a configuration much the same as the one explained here, and I use a VMware View instance to serve a few hundred virtual desktops. We had constant problems with users being disconnected randomly, and everything began to work correctly the day I removed the LBT feature.

    • Chris permalink - May 25th, 2012

      So you changed your teaming policy for handling NFS storage and it caused View sessions to disconnect? I’m having trouble understanding the relationship between a View session and the NFS storage traffic.

  5. Ben permalink - Nov 22nd, 2012

    Question: does that mean you have to mount your NFS datastores x times in VMware, as seen in your screenshot?

  6. DR3Z permalink - Jan 31st, 2013

    Chris,

    Great post! Can you explain the “unique least significant bits”? I’m not a network tech by any means and am trying to understand. Can you provide an example?

    Thank you!!

    • Chris Wahl permalink - Jan 31st, 2013

      The least significant bit is the last number(s) in the binary address. In a port channel with 2 uplinks, it’s the very last binary number (0 or 1). With 4 uplinks, it’s the last two binary numbers (00, 01, 11, 10), and with 8 uplinks it’s the last three binary numbers (you get the idea). If you had two addresses that shared a common least significant bit, the port channel does not offer any benefits.

      • kjstech permalink - Feb 21st, 2013

        Wait, a little confused on the least significant bit…
        Your lab test shows 4 uplinks & subnets, and the NFS exports last bits are identical… 10.0.X.88 last octet 88 = 1011000, so last two bits is 00.
        Wouldn’t all exports on your shared storage need to have different last two bits so like this:
        10.0.1.88 (last two bits = 00)
        10.0.2.89 (last two bits = 01)
        10.0.3.90 (last two bits = 10)
        10.0.4.91 (last two bits = 11)

        Or are you saying they have to be identical like you have it (all ending in octet x.x.x.88, so the last two bits are always 00 in this case)?

      • Chris Wahl permalink - Feb 21st, 2013

        The lab test I show here is not using an EtherChannel – I’m using unique subnets instead.

        If I was using just one subnet and an EtherChannel, then I would need 88, 89, 90, and 91.
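        For anyone who wants to check their own addresses, here is a small Python sketch of the least significant bit idea. The function name and the 10.0.1.x addresses are made up for illustration, and it looks only at the storage target address, as the explanation above does; an actual src-dst-ip hash also takes the source address into account.

```python
import ipaddress


def low_bits(ip, uplink_count):
    """Return the low-order bits of `ip` that matter for `uplink_count` uplinks.

    Assumption: uplink_count is a power of two (2, 4, or 8), so 1, 2, or 3
    bits are relevant.
    """
    bits = uplink_count.bit_length() - 1
    value = int(ipaddress.ip_address(ip)) & (uplink_count - 1)
    return format(value, f"0{bits}b")


# Hypothetical targets on a single subnet with a 4-uplink EtherChannel.
for ip in ["10.0.1.88", "10.0.1.89", "10.0.1.90", "10.0.1.91"]:
    print(ip, low_bits(ip, 4))
# 10.0.1.88 00
# 10.0.1.89 01
# 10.0.1.90 10
# 10.0.1.91 11
```

        If two targets print the same bits, they share a least significant bit pattern and the port channel offers no benefit for that pair.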

      • kjstech permalink - Feb 21st, 2013

        Thanks Chris.
        We use VLAN10 (10.10.10.x) for NFS traffic. I guess I am looking at creating VLAN11 (10.10.11.x), VLAN12 (10.10.12.x) and VLAN13 (10.10.13.x) to duplicate your lab test.

        I use an EMC Celerra NX4 and currently the NFS exports are at 10.10.10.10, but it looks like I can create multiple interfaces on the same trunk out to our switch (802.1q for multiple VLANs). I’ll do a little digging in the EMC forums to make sure, but I think it can accomplish what your Nexenta did.

  7. Adam permalink - Mar 25th, 2013

    Great post. I’m going to try and implement this setup with the new hardware I just got in.

    I’m setting up a 4 port NetApp filer and just wanted to confirm that each port on the filer would have an IP in a different subnet and also reside in its own VLAN with this setup? No vifs with aliases would be set up on the NetApp because this uses LBT and not IP hash, correct?

    Thanks.

  8. Steve L permalink - May 1st, 2013

    No real discussion to add, just kudos. Due to circumstances I have had limited NFS exposure in my career and this thread helped clear every last uncertainty I had in a very concise manner. Well done sir.

