27 Responses

  1. Craig

    Great article and I completely agree with you. I keep having our networks team try and jam LAGs down my throat but as soon as I explain LBT to them (and I’ve had this conversation with the same people many times), they scratch their heads and walk away.

    Looking forward to getting a copy of your book as well.

  2. Craig

    Great write-up, Chris.

    Strangely enough we had a use case for a LAG with storage this week. A customer had a NetApp and wanted it to present both iSCSI and NFS. NetApp storage best practices showed that LAG was the recommended route.

    Apart from this fringe use case, I haven’t seen anything else in the wild.

  3. Michael Webster (@vcdxnz001)

    Great article, totally agree. Except if you don’t have the ability to run LBT or LACP and you’re using vSphere Standard Switches. The actual benefit of LAGs for vSphere hosts is very limited, and they are more complex and problematic than they’re worth. I think this echoes pretty well what I wrote in http://longwhiteclouds.com/2012/04/10/etherchannel-and-ip-hash-or-load-based-teaming/. I look forward to reading your next instalment.

  4. Bolshakov Stas (@st_bolshakov)

    Hey Chris,

    We ran into trouble when our vSphere administrator just plugged a host into the switch without configuring link aggregation (passive LAG, back in the 5.1 days). There were a lot of MAC flapping errors and the CPU load of the physical switch went sky-high.

  5. Kale Blankenship (@vCabbage)

    Great article, Chris!

    When using LAG on standard vSwitches (i.e., static, non-LACP LAG) I completely agree with you. Static LAG is an invitation for misconfiguration and in most cases only provides aggregation. It has no more failover benefit than any other uplink method. If the link is up but there is some other issue (damaged cable, wrong negotiated speed, etc.), traffic will continue to be forwarded.

    On the other hand, you do get some benefit from LACP because there is bidirectional communication between the host and the switch. If there is a misconfiguration, the LAG will either not form or the misconfigured ports will not be joined.
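    To make that difference concrete, here is a minimal sketch (toy Python, not real LACP state machinery; the keys and switch names are made up) of why negotiation catches a bad port while a static LAG does not:

    ```python
    # Toy model only: a static LAG member forwards on any up link, while an
    # LACP-style port joins the bundle only after the partner's advertised
    # system identity and key (learned from LACPDUs) match what the bundle expects.

    def static_lag_forwards(link_up: bool) -> bool:
        # Static LAG: link state is the only gate, so a miscabled or
        # mismatched port still carries traffic.
        return link_up

    def lacp_port_joins(link_up: bool, local_key: int, partner_key: int,
                        partner_system: str, expected_system: str) -> bool:
        # LACP: the port aggregates only if both ends agree.
        return link_up and local_key == partner_key and partner_system == expected_system

    # A port patched into the wrong switch stays out of the LACP bundle...
    print(lacp_port_joins(True, local_key=10, partner_key=10,
                          partner_system="switch-B", expected_system="switch-A"))  # False
    # ...but a static LAG would keep forwarding on that same link.
    print(static_lag_forwards(True))  # True
    ```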

    When it comes to storage networks, NFS is the only place I would consider using LAGs. With block protocols MPIO is always the better choice. Unless VMware adds pNFS support, the only decent way to load balance NFS is to use LAG and connect to multiple datastores with different IP addresses. Still, I avoid this unless the environment truly justifies the added complexity.

  6. Kale Blankenship

    I believe the reasoning is to provide load balancing without needing to have multiple NFS subnets and create multiple vmknics for NFS on each host. A secondary alias IP is assigned to the storage controller and datastores attach alternately to each IP (i.e., datastore1 connected via IP1, datastore2 connected via IP2, and so on). Source/destination IP hashing handles the load balancing, as sketched below.
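    For readers who want to see why the alias IP spreads the load, here is a simplified sketch of source/destination IP hashing (conceptually similar to ESXi’s IP-hash policy but not its exact implementation; all addresses are hypothetical):

    ```python
    import ipaddress

    def ip_hash_uplink(src_ip: str, dst_ip: str, num_uplinks: int) -> int:
        # Simplified IP hash: XOR the two addresses and take the result
        # modulo the number of LAG members.
        src = int(ipaddress.ip_address(src_ip))
        dst = int(ipaddress.ip_address(dst_ip))
        return (src ^ dst) % num_uplinks

    vmk_nfs = "10.0.10.21"                         # hypothetical NFS vmkernel IP
    controller_ips = ["10.0.10.50", "10.0.10.51"]  # primary IP and alias on the array

    for dst in controller_ips:
        print(dst, "-> uplink", ip_hash_uplink(vmk_nfs, dst, num_uplinks=2))
    # With these example addresses the two datastore IPs land on different
    # uplinks, which is the point of mounting each datastore via its own IP.
    ```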

    The alternative, if you don’t have stacking/MLAG, is to use virtual port ID with the vSwitch. In this scenario two vmknics are configured for NFS traffic, each on a different subnet (one preferring link 1, the other preferring link 2). A subinterface is created for each NFS VLAN on the storage controller side and datastores are mounted similarly to the stacking/MLAG scenario. This ends up resembling the SAN A/B topology of FC, except that the vmknic can fail over if there’s a link failure on its primary port.

    Apologies if my explanation was convoluted, running on little sleep at the moment.

  7. Ron

    Can’t wait for TRILL or SPB to become more available versus STP.

  8. jason

    I can’t help but disagree on this, coming from the network guy perspective. You say you want simplicity? Well, then using LACP is the way forward. It has nothing to do with being loop-free. It has everything to do with simplicity.

    Say what? Suppose you’ve got 4 uplink ports coming off your vSwitch. With LACP, there is only *one* config on the physical network side – on the LAG. Without LACP, there are now 4 configs on the physical side.

    Nobody wants to manage 4 distinct port configs. Nobody wants 4x the configuration on the physical side. Nobody wants accidental inconsistency on the physical side because someone forgot to trunk a given VLAN on one of those 4 ports. Such inconsistency is 100% impossible with LAGs, since you’re configuring that VLAN in one place and all of the child interfaces inherit the config.

    A LAG from the vSwitch to the physical keeps the topology & config simple.
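    A rough way to picture this argument (illustrative Python only; port names and VLAN IDs are invented) is to compare per-port VLAN lists against a single port-channel definition:

    ```python
    # Four standalone uplinks each carry their own allowed-VLAN list, so one
    # forgotten VLAN can silently break a subset of workloads.
    standalone_ports = {
        "Eth1/1": {10, 20, 30},
        "Eth1/2": {10, 20, 30},
        "Eth1/3": {10, 20},      # VLAN 30 was never trunked here
        "Eth1/4": {10, 20, 30},
    }
    expected_vlans = {10, 20, 30}

    for port, vlans in standalone_ports.items():
        missing = expected_vlans - vlans
        if missing:
            print(f"{port} is missing VLANs {sorted(missing)}")

    # With a LAG there is one allowed-VLAN list and every member inherits it,
    # so this class of drift cannot happen.
    port_channel = {
        "members": ["Eth1/1", "Eth1/2", "Eth1/3", "Eth1/4"],
        "allowed_vlans": expected_vlans,
    }
    ```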

  9. Visual Guide to LAG Thinking for Server Admins — EtherealMind

    […] vSphere Does Not Need LAG Bandaids – The Network Does while Chris Wahl talks about the server side for VMware but I wanted to add something to the […]

  10. Revenge of the LAG: Networks in Paradise | Wahl Network

    […] already squashed the loop avoidance myth in the post entitled vSphere Does Not Need LAG Bandaids and point out some advantages with Load Based Teaming (LBT). However, I neglect to cover some of […]

  11. Daniel Massameno

    As our protocols get smarter, separate links seem like the more robust and flexible way to do it. For example, four 10GE links could be presented as 4 x 10GE, or with LACP they look like a single 40GE.

    With 4 x 10GE, an application like multipath iSCSI I/O can take advantage of the four-link topology. It can decide three links are fully loaded and use the fourth link for a new flow. It could dedicate one high-priority flow to one link and put everyone else on the other three. Some of this is hypothetical, but you get my point.

    With the 1 x 40GE setup the application can’t do anything smart. There are not four separate links, just one 40 Gbps link. Nothing to do here, no intelligence can be applied. The LAG hashing algorithm could be making incredibly poor decisions on which link to place each flow, but unfortunately there’s nothing the application can do about it.

    Most of our protocols are not smart enough to take advantage of this feature. I’m hoping this will happen in the near future.
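    That contrast can be sketched in a few lines of illustrative Python (made-up flow sizes and a generic CRC hash, not any vendor’s algorithm): the LAG places flows without seeing link load, while an MPIO-style initiator can pick the least-loaded path:

    ```python
    import zlib

    NUM_LINKS = 4
    # Hypothetical mix of large and small flows: (name, Gbps).
    flows = [("flow-%d" % i, size) for i, size in enumerate([9, 1, 9, 1, 9, 1, 1, 9])]

    lag_load = [0] * NUM_LINKS
    mpio_load = [0] * NUM_LINKS

    for name, gbps in flows:
        # LAG: a static hash of the flow identity picks the link, blind to load.
        lag_load[zlib.crc32(name.encode()) % NUM_LINKS] += gbps
        # MPIO-style: the initiator simply places the flow on the least-loaded path.
        mpio_load[mpio_load.index(min(mpio_load))] += gbps

    # The hash may or may not spread these flows well; the point is that it
    # cannot react to load, while the path selector always can.
    print("Hash-based LAG placement:", lag_load)
    print("Load-aware placement:   ", mpio_load)
    ```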

  12. LAG vs LBT for vSwitch Uplinks | VirtAdmin

    […] of all, Chris Wahl already wrote about this quite nicely a while ago. If I recall correctly, it also comes up in Networking for VMware […]

  13. Justin Hamrick

    I think you would want to use a LAG on a standard vSwitch. LBT is only available on the Distributed vSwitch. Is there a better method for a standard vSwitch?

  14. rbeuke

    Chris

    I am a fan of your posts and have a question I hope you can help me with!

    We are using NFS (AFA) in a new high-performing SQL environment. These SQL servers will have a very low consolidation ratio, either 1:1 or perhaps up to 3:1 or 4:1. Currently they are mostly 1:1.

    This environment can use quite a lot of bandwidth during normal activities, but that really jumps up during nightly backups. It came from a physical setup backed by an FC SAN and 10 Gb links. Because of this, we wanted better utilization of the available bandwidth and started looking into LACP and LBT. Our storage admins don’t really want to start creating multiple VLANs on different subnets with multiple datastores, as it’s more overhead for them.

    Our current setup is as follows:

    – 2x 1 Gb links in a standard switch for MGMT
    – 2x 10 Gb links in a LAG (src and dest IP hash) on the host and a port channel across two Nexus 6k switches for vMotion and LAN, each on separate VLANs (vmnic 5 and 6)
    – 2x 10 Gb links in a LAG (src and dest IP hash) on the host and a port channel across two Nexus 6k switches for storage on its own VLAN (vmnic 4 and 7)
    – vmk1 for vMotion on the vMotion VLAN
    – vmk2 for storage on the storage VLAN

    In testing I have two or three different VMs running on a single host. Two have their DB drives on one datastore, and the other has its drives on the second datastore on the same subnet.

    I have been unable to get LACP to balance traffic across two different datastores/exports on the same VLAN. Each time I simulate traffic on the box, it always sends out over the same NIC in the LAG. I would expect it to sometimes choose the second NIC?

    Is this because I don’t have enough datastores for the algorithm to start sending traffic to one NIC or the other, or could something else be going on?

    I did also try creating a separate datastore on a separate subnet and put another vmk on that subnet on the host. Then I added another port group on the host and associated the storage LAG with that as well. When this is done it will send over the second NIC, but this seems to defeat the purpose, and at that point LBT may be better anyway.

    Any insight would be helpful.
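    As a purely illustrative aside (reusing the simplified XOR-style hash from the sketch further up; the addresses are invented and the real ESXi/Nexus hash options may differ), with a single vmkernel source IP and only two export IPs there are only two flows to place, and nothing stops both from hashing to the same LAG member:

    ```python
    import ipaddress

    def ip_hash_uplink(src_ip: str, dst_ip: str, num_uplinks: int = 2) -> int:
        # Same simplified source/destination IP hash as in the earlier sketch.
        return (int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))) % num_uplinks

    vmk_storage = "192.168.50.11"                   # hypothetical storage vmkernel IP
    exports = ["192.168.50.100", "192.168.50.102"]  # two datastores, same VLAN

    for dst in exports:
        print(dst, "-> LAG member", ip_hash_uplink(vmk_storage, dst))
    # With this particular pair of addresses both exports hash to the same
    # member, so every test sends traffic out the same vmnic. More datastores
    # (or more source/destination pairs) give the hash more chances to spread.
    ```

    In other words, the behavior described above may simply reflect how few distinct source/destination pairs the hash has to work with.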

