Avoid LACP with iSCSI Port Binding or Multi-NIC vMotion

A while back, I wrote two posts that discuss Link Aggregation Groups (LAGs) from a positive and negative angle. The idea is that LAGs offer both benefits and drawbacks, and the decision to use one should be driven by the requirements of the environment. A few folks have asked questions about mixing LAGs with various vSphere binding techniques, specifically iSCSI Port Binding and Multi-NIC vMotion.

I think it’s important to point out that LAGs and binding techniques are both looking to control and distribute traffic but in very different ways. It would make no logical sense to combine them, and many bad things can (and most likely will) occur if you do. Also, it’s not supported.

LAG Control with Hashing Algorithms

The idea behind load distribution, which is what a LAG provides, is to examine the traffic flows to determine which uplink to choose. This has traditionally been the source and destination IP address within the realm of vSphere, but has opened up to a number of layer 2 – 4 options with the vSphere Distributed Switch (VDS) version 5.5. Regardless, the traffic information is still assessed and a resulting uplink is chosen, as shown below:

The LAG hash determines which uplink is chosen

The upstream switch can also send traffic to the host across whichever physical interface its own hashing algorithm has chosen.
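
To make that concrete, here's a quick Python sketch (purely illustrative, not the actual ESXi hashing code) of how an IP-hash style policy might map a flow's source and destination addresses onto one of two LAG members:

```python
# Illustrative sketch only: it mimics how an IP-hash policy maps a
# flow's source/destination addresses onto one member of a two-uplink LAG.
import ipaddress

UPLINKS = ["Uplink1", "Uplink2"]

def pick_uplink(src_ip: str, dst_ip: str) -> str:
    """XOR the two addresses and mod by the LAG member count."""
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return UPLINKS[(src ^ dst) % len(UPLINKS)]

# The choice is driven entirely by the packet headers, so different
# flows may land on different uplinks, or on the same one.
print(pick_uplink("10.0.0.11", "10.0.0.50"))   # Uplink2
print(pick_uplink("10.0.0.12", "10.0.0.50"))   # Uplink1
```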

Uplink Association using Port Binding

Conversely, port binding forms a 1:1 relationship between a logical entity, such as a VMkernel port, and the physical uplink. The traffic is not inspected and no choices are made dynamically – you configure the relationship once and it is respected until you make a change.

Below is an example. Two VMkernel ports, vmk1 and vmk2, are attached to their own Port Groups named PG1 and PG2, respectively. PG1 is configured to Actively use Uplink1 and not Uplink2. PG2 is configured the opposite way. Thus, vmk1 can only use Uplink1, and vmk2 can only use Uplink2. If a failure occurs on either Uplink1 or Uplink2, the associated Port Group and VMkernel port are also rendered offline.

A port binding between the VMkernel port and physical Uplink
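
Modeled in plain Python (a conceptual sketch using the names from the example above, not the vSphere API), that 1:1 relationship and its failure behavior look like this:

```python
# Conceptual model only. The names (PG1, vmk1, Uplink1) follow the
# example above; this is not a vSphere API object.
from dataclasses import dataclass

@dataclass
class BoundPortGroup:
    vmkernel_port: str
    active_uplink: str          # exactly one Active uplink
    unused_uplinks: tuple       # everything else is Unused, not Standby

    def path_state(self, failed_uplinks: set) -> str:
        # With port binding there is no failover: if the single active
        # uplink fails, the VMkernel port (and its path) goes down.
        return "dead" if self.active_uplink in failed_uplinks else "active"

pg1 = BoundPortGroup("vmk1", "Uplink1", ("Uplink2",))
pg2 = BoundPortGroup("vmk2", "Uplink2", ("Uplink1",))

print(pg1.path_state(failed_uplinks={"Uplink1"}))  # dead
print(pg2.path_state(failed_uplinks={"Uplink1"}))  # active
```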

The idea behind binding is to create logical fabrics. iSCSI now knows it has two distinct paths out of the hypervisor by way of vmk1 and vmk2. Depending on which multipath IO (MPIO) policy is chosen, iSCSI might favor one path over the other, or use both at the same time. But we leave those decisions up to the MPIO engine, not the network itself.
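
As a rough sketch, a round-robin style policy would simply alternate I/O across the two bound paths. The snippet below stands in for the MPIO engine and is not VMware's actual path selection plugin:

```python
# Sketch of a round-robin style path selector, assuming both iSCSI paths
# (via vmk1 and vmk2) are healthy. Illustrative only.
from itertools import cycle

paths = ["vmk1 -> Uplink1", "vmk2 -> Uplink2"]
selector = cycle(paths)

for _ in range(4):
    print(next(selector))   # alternates between the two paths
```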

Hybrid Shenanigans

Let’s imagine you somehow circumvented all of the logical checks that prevent you from combining port binding with a LAG. Based on the traffic headers, the LAG may decide to place traffic from both VMkernel ports onto Uplink2, leaving Uplink1 all by its lonesome. That’s because a LAG acts like a single logical interface, and thus the VMkernel ports would both think they are talking to one logical port.

Port Binding + LAG = A very sub-optimal network design
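
A quick sketch shows how easily the collision happens. With hypothetical IPs and an illustrative hash (again, not ESXi's), two bound VMkernel ports talking to the same iSCSI target can land on the same LAG member, leaving the other idle:

```python
# Illustration of the failure mode described above: with a LAG in the
# picture, the hash (not the binding) decides the uplink, and both
# VMkernel flows can collapse onto the same physical port.
import ipaddress

LAG_MEMBERS = ["Uplink1", "Uplink2"]

def lag_hash(src_ip: str, dst_ip: str) -> str:
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return LAG_MEMBERS[(src ^ dst) % len(LAG_MEMBERS)]

target = "10.0.0.50"                      # hypothetical iSCSI target
print(lag_hash("10.0.0.21", target))      # vmk1's flow -> Uplink2
print(lag_hash("10.0.0.23", target))      # vmk2's flow -> Uplink2 as well
```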

And as you can see, this combination defeats the purpose of binding and adds two control mechanisms to the traffic flow. Note that we’re only talking about the host interfaces (uplinks) in this scenario, not the storage-facing ports. Those can be, and often are, configured as a LAG to distribute a many:few quantity of sessions (many hosts to few storage ports).

Thoughts

A similar situation is encountered when doing this with Multi-NIC vMotion, except that the configuration uses Active / Standby for the Port Groups instead of Active / Unused. Still, we don’t want a LAG getting in the way of discrete uplink selection that is driven by the hypervisor.
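
As a hedged sketch (hypothetical Port Group names, plain data rather than real API objects), the two teaming layouts differ only in where the second uplink sits:

```python
# Side-by-side of the two teaming layouts (conceptual, not API objects).
# iSCSI port binding uses Active/Unused; Multi-NIC vMotion uses
# Active/Standby so a surviving uplink can take over on failure.
iscsi_binding = {
    "PG-iSCSI-A": {"active": ["Uplink1"], "standby": [], "unused": ["Uplink2"]},
    "PG-iSCSI-B": {"active": ["Uplink2"], "standby": [], "unused": ["Uplink1"]},
}

vmotion_multi_nic = {
    "PG-vMotion-A": {"active": ["Uplink1"], "standby": ["Uplink2"], "unused": []},
    "PG-vMotion-B": {"active": ["Uplink2"], "standby": ["Uplink1"], "unused": []},
}
```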

Additionally, think about the traffic flows coming into the hypervisor. There is an expectation that a particular uplink will be used to both send and receive traffic. If, for example, an iSCSI-bound VMkernel port is waiting for traffic on Uplink1 but that traffic arrives on Uplink2 because of the LAG, even more issues will crop up. My advice is to simply pick one approach and stick with it. 🙂