I received an email from a gentleman looking to learn more about Link Aggregation Control Protocol (LACP) in a home lab environment. This particular environment had some virtualized ESXi hosts mixed with a physical ESXi host, with a desire to use LACP on the physical host and not on the virtualized hosts. In short: can you mix LACP hosts with non-LACP hosts?
As a quick reminder, LACP requires use of the vSphere Distributed Switch (VDS). I’ve written about using LACP with ESXi 5.5 in the past, including a brief video. Because LACP Data Units (LACPDUs) must be sent and received to bring up the Link Aggregation Group (LAG), at least one endpoint must actively send these data units, while the other endpoint can either listen passively or also send actively. If both endpoints only listen passively, no LACPDUs are exchanged and the LAG does not form.
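The active/passive rule above can be sketched as a small truth table. This is only an illustration of the negotiation logic, not anything that runs on a switch or host; the names LacpMode and lag_forms are made up for this example.

```python
from enum import Enum


class LacpMode(Enum):
    """Hypothetical labels for how one end of a link treats LACP."""
    ACTIVE = "active"    # sends LACPDUs on its own
    PASSIVE = "passive"  # only answers LACPDUs it receives
    OFF = "off"          # LACP is not configured on this end


def lag_forms(end_a: LacpMode, end_b: LacpMode) -> bool:
    """Return True if the two ends can negotiate a LAG."""
    # An end without LACP never exchanges LACPDUs, so nothing forms.
    if LacpMode.OFF in (end_a, end_b):
        return False
    # At least one end has to start the conversation.
    return LacpMode.ACTIVE in (end_a, end_b)


for a in LacpMode:
    for b in LacpMode:
        print(f"{a.value:8} + {b.value:8} -> LAG forms: {lag_forms(a, b)}")
```

Only active/active and active/passive pairings return True; passive/passive and anything involving an end without LACP return False.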
Back to the question at hand. Whether you can mix layer 2 aggregation methods within a Distributed Switch depends on what each Distributed Port Group is going to be used for. In most cases, I’d say mixing is a bad idea – homogeneity is the goal for any vSphere cluster. But host-defined logical objects – the VMkernel interfaces and Uplink-to-pNIC (physical NIC) assignments – can indeed have a unique underpinning. These objects do not span any further than a single host, which allows each host to use its own Distributed Port Groups or pNICs for them.
As an example, let’s say that Host A is using a pair of 1 GbE pNICs for management traffic, while Host B is using a pair of 10 GbE pNICs for management traffic. These links are dedicated to management. It is then possible to create two unique port groups. Distributed Port Group Mgmt 1 is configured for LACP. Distributed Port Group Mgmt 2 is not.
This works because all of the logical objects are host-defined and will not migrate elsewhere. When Host A’s hypervisor needs to transmit frames for the management interface, it will examine the routing table, forward frames to the LAG, and the hashing algorithm will do the rest. Beyond that, the network is a black box. Host B’s hypervisor will do the same, except it will select a single pNIC to forward frames based on the teaming policy (such as Route Based on Virtual Port ID).
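To make the difference between the two hosts concrete, here is a deliberately simplified sketch of the two selection styles. It is not the actual ESXi implementation: the XOR-and-modulo hash stands in for whatever hashing algorithm the LAG is configured with, the port ID modulo stands in for Route Based on Virtual Port ID, and the function names and vmnic labels are assumptions for illustration.

```python
import ipaddress


def pick_uplink_by_ip_hash(src_ip: str, dst_ip: str, uplinks: list[str]) -> str:
    """Host A style: hash the flow, then index into the LAG members.

    Simplified stand-in for a LAG hash: XOR the two addresses and take
    the result modulo the number of member links.
    """
    key = int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    return uplinks[key % len(uplinks)]


def pick_uplink_by_port_id(port_id: int, uplinks: list[str]) -> str:
    """Host B style: the virtual port is pinned to one pNIC.

    Simplified stand-in for Route Based on Virtual Port ID: every frame
    from the same virtual port leaves on the same uplink.
    """
    return uplinks[port_id % len(uplinks)]


lag_members = ["vmnic0", "vmnic1"]

# Different destinations can land on different LAG members.
print(pick_uplink_by_ip_hash("10.0.0.1", "10.0.0.2", lag_members))
print(pick_uplink_by_ip_hash("10.0.0.1", "10.0.0.3", lag_members))

# A given virtual port always uses the same pNIC, whatever the destination.
print(pick_uplink_by_port_id(7, lag_members))
```

The point of the sketch is that with a hash, the pNIC is chosen per flow, while with port ID pinning the pNIC is chosen per virtual port.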
Virtual machine network adapters do not have such a luxury, because the underlying Distributed Port Group must be the same across all hosts in a vSphere Cluster. If you create a Distributed Port Group that expects LACP to be configured, then add a host that does not have LACP, the result is that the LAG will not form. This may end up working in some broken form – perhaps the frames will forward to a pNIC and you’re lucky enough that the return-path hashing algorithm picks the same pNIC, but it’s not a stable design.
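If you do go down this road, a quick sanity check is to compare the hosts in the cluster against the hosts that actually have the LAG configured. The sketch below works on plain lists of host names that you would gather yourself; it does not query vCenter, and the function name and host names are hypothetical.

```python
def hosts_without_lag(cluster_hosts: list[str], hosts_with_lag: list[str]) -> list[str]:
    """Return cluster hosts that lack the LAG a shared port group expects.

    Any host in this list is a place where virtual machines on the
    LACP-backed port group could land with no working LAG underneath.
    """
    return sorted(set(cluster_hosts) - set(hosts_with_lag))


cluster = ["esx-a", "esx-b", "esx-c"]
lag_configured = ["esx-a"]

missing = hosts_without_lag(cluster, lag_configured)
if missing:
    print("LAG missing on:", ", ".join(missing))
else:
    print("All hosts in the cluster have the LAG configured.")
```

An empty result means the port group is at least consistent across the cluster; a non-empty one lists the hosts where a vMotion would leave a virtual machine on unstable footing.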
For migration purposes, it is possible to create a secondary, unique Distributed Port Group. The migration would require coupling a vMotion with a virtual machine network port group move, assuming that the destination port group was on the same layer 2 network. Alternatively, I’ve documented how to move from a vSphere Standard Switch to a vSphere Distributed Switch, using LACP, in this post.
Although I’d advise just staying away from LACP altogether.