Resolving VLAN Tagging Errors on HP ProLiant Blades using Virtual Connect and ESXi 5

I ran across a curious bug that was a bit difficult to track down because of the misleading errors it displayed, and I wanted to share the troubleshooting path and resolution.

Details

I was working on a BladeSystem (C7000) populated with ProLiant BL460c G7 blades along with HP Virtual Connect. The HP Virtual Connect Ethernet networks were configured to tunnel VLAN traffic (pass the VLAN tags along without stripping them off) so that the distributed switches in vSphere could handle the VLAN tags. Installation of ESXi 5 on the blades went just fine. I used the “test network” option in the DCUI successfully: the blades pinged their DNS servers and resolved their DNS names without a hiccup. However, I could not attach them to vCenter, nor could I use the vSphere Client to connect to them directly.

The error I received was:

The server ‘name.of.server’ could not interpret the client’s request (the remote server returned an error: (404) not found)
Error stack:
Call “ServiceInstance.RetrieveContent” for object “ServiceInstance” on Server “address.of.server” failed.

Googling this error turns up a lot of incorrect resolutions, such as adding https to the front of the server name, or ensuring that the resolv.conf file contains valid DNS entries. It was obvious that DNS was working, because I could ping by FQDN from the console of the blade.

At this point I was a little stumped. I tried a different approach. Using ethtool in the DCUI, I grabbed the firmware and driver versions and began looking around for KB articles discussing any bugs. The shipping driver version is 4.0.88.0. I managed to find this Advisory published by HP and updated on January 19th (yesterday).
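
If you have a lot of blades to check, the driver lookup is easy to script. Here is a minimal sketch in Python that parses text in the "key: value" layout that ethtool -i prints for a NIC and flags the shipping driver version mentioned above. The sample text, its placeholder firmware value, and the function names are my own illustrative assumptions, not output captured from these blades.

```python
# Driver name and shipping (inbox) version discussed in this post.
AFFECTED_DRIVER = "be2net"
AFFECTED_VERSION = "4.0.88.0"


def parse_ethtool_info(text):
    """Turn the 'key: value' lines printed by `ethtool -i` into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as PCI
        # addresses (which contain colons) stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def is_affected(info):
    """Return True when the NIC reports be2net at version 4.0.88.0."""
    return (info.get("driver") == AFFECTED_DRIVER
            and info.get("version") == AFFECTED_VERSION)


# Illustrative sample only: the firmware and bus values are placeholders.
SAMPLE = """driver: be2net
version: 4.0.88.0
firmware-version: (placeholder)
bus-info: 0000:02:00.0
"""

if __name__ == "__main__":
    print(is_affected(parse_ethtool_info(SAMPLE)))
```

Feeding it the saved output from each host gives a quick yes/no on whether that host is still running the shipping driver.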

Advisory: (Revision) HP BladeSystem – Emulex be2net Inbox Driver Version 4.0.88.0 Does Not Support Flex-10 or Flex Fabric Adapters on VMware ESXi 5.0 in a Virtual Connect Environment

It specifically calls out the issue:

If a VLAN tag is added to a network (for example, from the VM Portgroup, the network in VC, the external switch, etc.), then the vSphere client cannot connect to the host.

Ah ha! That was exactly the case.

Resolution

The resolution is pretty straightforward: update to the latest drivers.

The async be2net driver version 4.0.355.1 is available on the VMware website at the following URL:

http://downloads.vmware.com/d/details/dt_esxi50_emulex_be2net_403551/dHRAYnQqQHBiZHAlJQ
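
Once a host is updated, it is worth confirming that it reports 4.0.355.1 or later. A small sketch of that comparison in Python follows. It assumes the version strings are purely numeric dotted fields, and treating any later version as fixed is my assumption (the advisory only names 4.0.355.1). The function names are hypothetical. The comparison has to be numeric, because a plain string comparison would rank "4.0.88.0" above "4.0.355.1".

```python
# Async driver version named as the fix in this post.
FIXED_VERSION = "4.0.355.1"


def version_tuple(version):
    """Convert '4.0.355.1' into (4, 0, 355, 1) for numeric comparison."""
    return tuple(int(part) for part in version.split("."))


def has_fixed_driver(reported):
    """Return True when the reported version is FIXED_VERSION or newer."""
    return version_tuple(reported) >= version_tuple(FIXED_VERSION)


if __name__ == "__main__":
    for reported in ("4.0.88.0", "4.0.355.1"):
        print(reported, has_fixed_driver(reported))
```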

My method was to temporarily use EST (External Switch Tagging) on the Virtual Connect side, get the hosts connected to vCenter, and push out the updated driver using VMware Update Manager (VUM). Once updated, the blades could go back to using VST (Virtual Switch Tagging) without any issues. Because vSphere 5 has multi-host remediation with VUM, it doesn’t take long to patch a bunch of blades, as I set it up to just do all of them simultaneously. :)

I hope this helps anyone else searching Google for the answer!