I stumbled upon an interesting conflict in the MTU size reported for a distributed switch (VDS) in my Wahl Network lab environment. The switch in question carries all LAN-related services, such as management, vMotion, and virtual machine traffic. Using the vSphere Client, I set the VDS to an MTU size of 9000 so that I could support the use of jumbo frames on specific vmkernel ports.
The goal was to test some vMotion speeds over 1GbE with jumbo frames and also to do some vCloud-related adventures. However, each attempt to pass jumbo frames on one specific host (ESX0) was met with a loss of connection.
Curious as to the cause of the traffic loss, I started to troubleshoot.
The first step was to verify that the MTU on VDS-LAN was in fact set to 9000. Looking at the properties of the VDS in the vSphere Client, everything appeared to be configured properly.
I then tried a vmkping with a jumbo frame on a subnet whose vmkernel ports on both sides were configured for MTU 9000. The correct command to do this from the ESXi Shell is:
vmkping -s 8972 -d 10.0.253.52
Note: The “-s 8972” tells vmkping to send an ICMP packet with 8972 bytes of payload, and the “-d” sets the don't-fragment bit so the packet cannot be fragmented along the way. The other 28 bytes are consumed by the ICMP header (8 bytes) and the IP header (20 bytes). 8972 bytes payload + 8 bytes ICMP header + 20 bytes IP header = 9000 bytes.
I double-checked that both sets of vmkernel ports were set to an MTU of 9000.
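The same arithmetic works for any MTU, so here is a minimal sketch of it as a shell helper. The function name icmp_payload_for_mtu is my own invention for illustration and is not part of vmkping or ESXi. It also assumes a plain IPv4 header with no options (20 bytes).

```shell
# Hypothetical helper (not an ESXi command): work out the largest ICMP
# payload that fits in a given MTU without fragmentation.
# Assumes a 20-byte IPv4 header (no options) and an 8-byte ICMP header.
icmp_payload_for_mtu() {
    mtu=$1
    ip_header=20
    icmp_header=8
    echo $((mtu - ip_header - icmp_header))
}

# Build the vmkping command line for a 9000-byte MTU.
payload=$(icmp_payload_for_mtu 9000)
echo "vmkping -s $payload -d 10.0.253.52"
```

For an MTU of 9000 this prints the same vmkping command shown above. Feeding it a standard MTU of 1500 gives a payload of 1472, which is handy for confirming that normal-sized frames still pass when jumbo frames do not.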
On a whim, I pulled up a list of vSwitch information. This is when I noticed that the ESXi Shell was reporting an MTU of 1500 for the VDS-LAN switch.
Kind of weird, eh?
Since jumbo frames were working on two of my lab hosts, ESX1 and ESX2, but not on ESX0, I figured there had to be something goofed up on ESX0. I removed the host from the VDS and used a temporary vSwitch, then rebooted the host for good measure. When it came back up, I attempted to migrate ESX0 back onto the VDS-LAN switch via the vSphere Web Client.
Interestingly enough, the vSphere Web Client was smart enough to catch the problem.
ESX0 has a Broadcom NetXtreme BCM5722 LOM (LAN on Motherboard) NIC, whereas all of the other hosts have only Intel NICs. I did some digging and found that the Broadcom card does not support jumbo frames. My guess is that when the VDS tried to apply an MTU of 9000 on ESX0, the card rejected the change, leaving the host at 1500 even though the VDS properties showed 9000.
I removed the offending Broadcom NIC and applied the MTU change again with successful results.
Score one for the vSphere Web Client! 🙂