The use of a Link Aggregation Group (LAG) with Link Aggregation Control Protocol (LACP) is fairly standard for converged infrastructure northbound uplinks. It adds link redundancy and minimizes the interruption caused by a single link failure, and when coupled with a virtual port channel (vPC) it can also protect against switch failure. However, I have found that a LAG, often referred to as a port channel, can cause some confusion when configuring the vSphere switch side of the equation. Nearly all documentation in the wild focuses on the need to use an IP Hash teaming policy whenever a LAG is present.
Does this also mean that you have to use IP Hash for vSphere switches inside of converged infrastructure, such as Cisco UCS or HP Virtual Connect?
With a traditional rack-mount server design, a port channel is created between the upstream switch and the vSphere host itself. Two or more NICs that live inside the hypervisor become member ports in the port channel. In this case, yes, you would need to use the IP Hash teaming policy. This is because the vSphere switch itself is one end of the port channel and is responsible for distributing traffic across its member links.
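To make the load distribution idea concrete, here is a minimal sketch of how an IP Hash style policy could pick an uplink. It follows VMware's published high-level description (XOR the last octet of the source and destination IP addresses, then reduce the result by the number of uplinks in the team). The function name and the exact arithmetic are my own simplification for illustration, not the actual vSphere implementation.

```python
import ipaddress


def ip_hash_uplink(src_ip: str, dst_ip: str, uplink_count: int) -> int:
    """Return the index of the uplink a flow would use (simplified model).

    Hypothetical helper: it models the documented idea of hashing on the
    source and destination IP addresses, not the real vSphere code path.
    """
    if uplink_count < 1:
        raise ValueError("uplink_count must be at least 1")
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    # XOR the two addresses, keep only the last octet, then wrap the
    # result around the number of uplinks in the team.
    return ((src ^ dst) & 0xFF) % uplink_count


if __name__ == "__main__":
    for dst in ("10.0.0.20", "10.0.0.21"):
        print(dst, "-> uplink", ip_hash_uplink("10.0.0.10", dst, 2))
```

With two uplinks, traffic from 10.0.0.10 to 10.0.0.20 lands on uplink 0 and traffic to 10.0.0.21 lands on uplink 1. A single source/destination pair always hashes to the same uplink, so one conversation never spreads across multiple links.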
Converged Infrastructure Differences
In most configurations of converged infrastructure, the need for IP Hash is eliminated. The blade switch or interconnect switch handles the northbound connectivity out of the system. Here’s a visual I’ve created for Cisco UCS. In this case, the LAG exists between the upstream switch (not shown) and the uplink ports on the Fabric Interconnects. LACP sends LACPDUs no further than the Fabric Interconnect to form the LAG (port channel). The underlying vNICs on the vSphere switch and hypervisor are unaware that they are ultimately using a LAG for northbound traffic forwarding. The vNICs are connected to a vEth port (in the case of Cisco UCS), which the Fabric Interconnects dynamically pin to the uplink or uplink port channel of their choosing.
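As a rough mental model of that pinning step, the sketch below assigns each vEth port to one uplink (or uplink port channel) in turn. The round-robin choice, the function name, and the port names are all assumptions for illustration; the Fabric Interconnect makes the real decision, and nothing here is taken from UCS itself.

```python
from itertools import cycle
from typing import Dict, List


def pin_veths(veths: List[str], uplinks: List[str]) -> Dict[str, str]:
    """Map each vEth port to a single uplink or uplink port channel.

    Hypothetical model: round-robin is assumed purely to show the idea
    that every vEth ends up pinned to exactly one northbound path.
    """
    if not uplinks:
        raise ValueError("at least one uplink is required")
    rotation = cycle(uplinks)
    return {veth: next(rotation) for veth in veths}


if __name__ == "__main__":
    # Made-up names: three vEth ports, one port channel, one single uplink.
    print(pin_veths(["veth-1", "veth-2", "veth-3"], ["Po1", "Eth1/20"]))
```

The point of the model is that the host side only ever sees its vNIC and the vEth it attaches to. Whether the pinned target is a single uplink or a port channel such as the made-up "Po1" is invisible to the virtual switch.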
The virtual switch (shown in green) can use whatever teaming policy you wish to give it (although I don’t recommend IP Hash). If your virtual switch is a Distributed Switch (VDS), you can also choose Load Based Teaming (LBT), which appears as the “Route based on physical NIC load” teaming policy.
It’s really up to you what makes the most sense based on your network topology; you may even want to pin traffic (here’s a vMotion example). The takeaway here is that you do not need to use IP Hash, because the LAG does not touch the virtual switch.
Not All Created Equal
If you are using some sort of direct pass-thru device, which exposes the hypervisor directly to the northbound (upstream) networking infrastructure, you would then indeed need to use IP Hash. There is no “middle man” switch performing the dynamic pinning of vNICs from the host to the converged infrastructure switch. An example of this would be an Ethernet Pass-Thru Module for an HP c7000.
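The rule of thumb so far can be condensed into a small decision helper. This is only a sketch of the reasoning in this post: the function and constant names are mine, and the policy strings are the values I understand the vSphere API to use for NIC teaming, so verify them against the API reference before relying on them.

```python
from typing import List

# Teaming policy identifiers (assumed to match the vSphere API values).
IP_HASH = "loadbalance_ip"
SRC_PORT_ID = "loadbalance_srcid"
SRC_MAC = "loadbalance_srcmac"
EXPLICIT_FAILOVER = "failover_explicit"
LOAD_BASED = "loadbalance_loadbased"  # LBT, Distributed Switch only


def allowed_teaming_policies(host_nics_in_lag: bool,
                             distributed_switch: bool) -> List[str]:
    """List the teaming policies that fit the topology (hypothetical helper).

    host_nics_in_lag: True when the hypervisor NICs are themselves member
        ports of the port channel (rack-mount server or pass-thru module).
    distributed_switch: True when the virtual switch is a VDS.
    """
    if host_nics_in_lag:
        # The virtual switch is one end of the LAG, so IP Hash is required.
        return [IP_HASH]
    # The LAG terminates on the interconnect; anything except IP Hash works.
    policies = [SRC_PORT_ID, SRC_MAC, EXPLICIT_FAILOVER]
    if distributed_switch:
        policies.append(LOAD_BASED)
    return policies


if __name__ == "__main__":
    print(allowed_teaming_policies(host_nics_in_lag=False, distributed_switch=True))
```

A UCS blade behind Fabric Interconnects would call this with host_nics_in_lag=False, while a blade behind a pass-thru module would pass True and get back only IP Hash.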
I definitely advise using LAGs on your converged infrastructure deployments for several reasons: redundancy, throughput aggregation, better failure handling, etc. Brad Hedlund has some older, but still quite awesome, videos on Cisco UCS on his website here that go deeper into the product. If you have an HP Virtual Connect deployment, you can also find a pile of technical gold on VC 4.X here.