It’s interesting how often I hear people refer to LACP as if it were some sort of magical unicorn, a universal solvent for aggregating traffic across a number of uplinks. I believe this stems from a basic misunderstanding of how EtherChannels work, as I hear lots of confusion and FUD around it. This post is an attempt to educate folks on just what the heck LACP is, why it doesn’t work with any native vSphere switch, and the very few advantages it actually brings when compared to a static EtherChannel.
So, What is LACP?
LACP, the Link Aggregation Control Protocol defined in IEEE 802.1AX (formerly 802.3ad), is simply a way to dynamically build an EtherChannel. Essentially, the “active” end of the LACP group sends out special frames advertising the ability and desire to form an EtherChannel. It’s possible, and quite common, for both ends to be set to an “active” state (versus a passive state, which only responds). Additionally, LACP only supports full duplex links (which isn’t a concern for gigabit or faster links). Once these frames are exchanged, and if the ports on both sides agree that they support the requirements, LACP will form an EtherChannel.
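To make the active/passive behavior concrete, here is a minimal Python sketch of the decision described above. The function name lacp_channel_forms is made up for illustration, and the model is deliberately simplified: it assumes both ends are full duplex and otherwise compatible, and it ignores keys and priorities entirely.

```python
def lacp_channel_forms(mode_a: str, mode_b: str) -> bool:
    """Return True if two LACP ports would negotiate a channel.

    Simplified model (an assumption, not real switch logic): only the
    active/passive modes are considered, and both ends are taken to be
    full duplex and otherwise compatible.
    """
    valid = {"active", "passive"}
    if mode_a not in valid or mode_b not in valid:
        raise ValueError("mode must be 'active' or 'passive'")
    # A passive port only answers LACP frames; at least one end must send them.
    return "active" in (mode_a, mode_b)


for a, b in [("active", "active"), ("active", "passive"), ("passive", "passive")]:
    result = "forms" if lacp_channel_forms(a, b) else "no channel"
    print(f"{a:8} + {b:8} -> {result}")
```

The only losing combination is passive on both ends, since neither side ever starts the conversation.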
Note: LACP has a number of other pieces involved in setting up an EtherChannel, such as calculating system and port priority, configuring an administrative key, and so on. This really isn’t relevant to what you, the VMware admin, care about, but it does exist if you want to go learn the nitty-gritty.
LACP Is Not Included in Native vSphere Switches
Simply put, the native vSphere switches are not able to respond to LACP frames. They neither listen for them nor send them. If you set up LACP on the physical switch, it will never hear a reply back from the vSphere host, and thus an EtherChannel will not be created.
If you want to form an EtherChannel with a vSphere host, you must create a Static EtherChannel. This is also referred to as being “on”, as the command to set up Static on a Cisco switch is “mode on”. When set to Static, there is no discovery or advertisement – the EtherChannel is immediately created by the physical switch.
How About Load Distribution?
Here’s the kicker:
Both Static and Dynamic (LACP) EtherChannel use the same load distribution methods.
I put that in italics and bold to emphasize the point. Yes, it’s true. Static and LACP use the same load balancing techniques to handle traffic. If anyone tells you otherwise, bet against them for an easy way to make some quick cash.
This could be you!
But Static Requires IP Hash and LACP Doesn’t … Right?
Now we’re at the murky part. The answer is “wrong” but let’s go into the “why it’s wrong” part.
IP Hash is a native vSphere switch requirement. When its uplinks are in an EtherChannel, a native vSphere switch does not support any other load distribution method.
Taken from the vSphere Networking guide
Here is the warning from vSphere:
Notice that if I want to use EtherChannel, I have to select IP Hash for my load balancing method, and immediately there is a little information box warning.
Note: The term IP Hash equates to a load distribution policy of src-dst-ip on a Cisco switch.
Static EtherChannel under other circumstances can use any of the available load distribution policies. When talking to a vSphere host, however, it is forced to use IP Hash because that is what vSphere is able to use.
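To see why a src-dst-ip style policy behaves the same whether the channel was built statically or by LACP, here is a rough Python sketch. The function pick_uplink is hypothetical, and the XOR-then-modulo hash is only an illustration of the general idea; the exact hash a given physical switch or ESXi host computes may differ.

```python
import ipaddress


def pick_uplink(src_ip: str, dst_ip: str, uplink_count: int) -> int:
    """Pick an uplink index from a source/destination IPv4 pair.

    Illustrative only (an assumption, not a vendor algorithm):
    XOR the two addresses, then take the result modulo the link count.
    """
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (src ^ dst) % uplink_count


# One conversation always lands on the same link, however many links exist.
print(pick_uplink("10.0.0.1", "10.0.0.2", 2))  # 1
print(pick_uplink("10.0.0.1", "10.0.0.2", 2))  # 1 again
# A different destination may hash to a different link.
print(pick_uplink("10.0.0.1", "10.0.0.3", 2))  # 0
```

Nothing in this calculation knows or cares how the EtherChannel was negotiated. A single source and destination pair is pinned to one link either way, which is why neither Static nor LACP will spread one conversation across multiple uplinks.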
If you want to use LACP with vSphere, you are required to install a Cisco Nexus 1000V virtual switch. There’s no other way to do LACP with vSphere at the time of this writing. And, since the 1000V is a (nearly) full-featured Cisco switch, instead of a native vSphere switch, you can use any load distribution policy you want – you are no longer limited to IP Hash (src-dst-ip).
LACP Advantages over Static
LACP does have a few tricks up its sleeve, but none relating to getting traffic from source to destination any more efficiently.
If you add more than the supported number of ports to an LACP port channel, it has the ability to place these extra ports into a hot-standby mode. If a failure occurs on an active port, the hot-standby port can replace it.
However, the typical supported number of active ports for LACP is 8, so for a vSphere environment this is not a feature we would care about. Realistically, who is making 8-port EtherChannels to a vSphere host?
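The hot-standby idea can be sketched in a few lines of Python. This is a toy model under stated assumptions: the function split_ports and the tuple layout are invented for illustration, lower priority values are treated as preferred, ties are broken by port name, and the limit is shrunk to 2 in the example so the behavior is easy to see.

```python
MAX_ACTIVE = 8  # typical limit mentioned above; actual limits vary by platform


def split_ports(ports, max_active=MAX_ACTIVE):
    """Split healthy ports into (active, standby) lists of port names.

    ports: list of (name, priority, is_up) tuples.
    Assumption for this sketch: a lower priority value is preferred,
    and ties are broken by port name.
    """
    healthy = sorted((p for p in ports if p[2]), key=lambda p: (p[1], p[0]))
    active = [name for name, _, _ in healthy[:max_active]]
    standby = [name for name, _, _ in healthy[max_active:]]
    return active, standby


ports = [("eth1", 100, True), ("eth2", 100, True), ("eth3", 200, True)]
print(split_ports(ports, max_active=2))   # (['eth1', 'eth2'], ['eth3'])

# eth1 fails: the standby port is promoted into the active set.
ports[0] = ("eth1", 100, False)
print(split_ports(ports, max_active=2))   # (['eth2', 'eth3'], [])
```

With only a handful of uplinks per host, the standby list stays empty, which is the point made above: the feature exists, but a typical vSphere host never gets to use it.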
If there is a dumb device sitting in between the two end points of an EtherChannel, such as a media converter, and a single link fails, LACP will adapt by no longer sending traffic down this dead link. Static doesn’t monitor this. This is not typically the case for most vSphere environments I’ve seen, but it may be an advantage in some scenarios.
LACP won’t form if there is an issue with either end or a problem with configuration. This helps ensure things are working properly. Static will form without any verification, so you have to make sure things are good to go.
Bottom line: I’m not saying LACP (or the Nexus 1000V) is bad. It’s a very popular protocol that I see used all the time. My issue is that I see people wanting LACP thinking that it will better balance traffic, or do something else that it’s absolutely not going to do. Don’t go out and pop a bunch of Cisco 1000Vs into your environment for using LACP unless you have a solid use case.