Potential vSphere Host Crash When Upgrading To vCenter 5

VMware gets a nod from me for shipping so many updates and features in such short development windows. I rarely encounter a show-stopping bug. However, this latest issue with hosts crashing after an upgrade to vCenter 5 is a bit of a worry, and I’m hoping to shine some light on it so that others can avoid it. Kudos to Mike Laskowski, who pointed this issue out to me to share with you. Also, Ron Singler over at the Ruptured Monkey shares another tale of woe.

The Dreaded PSOD

For those not aware, PSOD is the Purple Screen of Death, a cousin to the Blue Screen of Death (BSOD) seen in the world of Windows. It appears when a vSphere host hits a condition that causes it to dump crash details in a specific format on a pink/purple background.

A sample PSOD screenshot

Update (3/6/2012): vCenter 5 Causes PSOD with ESX / ESXi 4.1

In this specific scenario, it has been documented that hosts running ESX / ESXi 4.1 without any updates (Build 260247) can suffer a deadlock and PSOD when the controlling vCenter is upgraded from 4.1 to 5. There is already a KB (2009586) published that outlines the issue:

After upgrading from vCenter Server 4.1 to 5.0, the ESX/ESXi 4.1 host fails with the PSOD error: Spin count exceeded (VASpace) – possible deadlock with PCPU 2

A brief summation of the error in the PSOD is:

@BlueScreen: Spin count exceeded (VASpace) – possible deadlock with PCPU 2
2:19:58:08.201 cpu4:428426)Code start: 0x418039000000 VMK uptime: 2:19:58:08.201
2:19:58:08.209 cpu4:428426)Saved backtrace from: pcpu 2 SpinLock spin out NMI

The problem I have with this is that the compatibility matrix clearly showed ESX / ESXi 4.1 (Build 260247) as compatible with vCenter 5. I think there should be an asterisk linking to the KB, or the green check mark for vCenter 5.0 should simply be removed. Thanks to John Troyer, who routed the details internally, the matrix has been updated!

The power of the Twitterverse!

The updated compatibility matrix now shows ESX / ESXi 4.1 as not compatible with vCenter 5.0

How To Determine What Update You Have

Use the host build number to determine your update level. I’ll show you two simple methods on my home lab, which is running ESXi 5.0 build 469512. I often refer to this ESXi Wikipedia page to translate a build number into an update level (not the most high-tech method, but it works and is easy to access).

The first method is by connecting to vCenter or your ESX / ESXi host with the vSphere Client. The build number is part of the host description above the summary tab.

The other method is to use the command line from the host DCUI or an SSH session:

vmware -v

This prints the product name, version, and build number on a single line. On my lab host, that is ESXi 5.0.0 with build 469512.

Resolution

Ensure that your vSphere hosts are running a minimum of ESX / ESXi 4.1 update 1 before upgrading vCenter. Realistically, you should already be running 4.1 update 1 anyway (or perhaps even update 2), but I understand that isn’t always possible for all environments.
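
If you have more than a handful of hosts, this check is easy to script. Here is a minimal Python sketch, not an official VMware tool: it takes the line printed by vmware -v and flags the two builds called out in this post (260247 and 261974). The regular expression assumes output shaped like "VMware ESXi 5.0.0 build-469512", and parse_version_line and upgrade_warning are hypothetical helper names of my own.

```python
import re

# Builds named in this post as crashing after the vCenter 5.0 upgrade.
# This is not a complete list; always confirm against the compatibility matrix.
AFFECTED_BUILDS = {
    260247: "ESX / ESXi 4.1 with no updates (KB 2009586)",
    261974: "ESXi 4.0 update 2 (KB 2007269)",
}

# Assumed shape of `vmware -v` output, e.g. "VMware ESXi 5.0.0 build-469512".
VERSION_LINE = re.compile(r"VMware (ESXi?) (\d+(?:\.\d+)+) build-(\d+)")


def parse_version_line(line):
    """Return (product, version, build) parsed from a `vmware -v` line."""
    match = VERSION_LINE.search(line)
    if match is None:
        raise ValueError(f"unrecognized version line: {line!r}")
    product, version, build = match.groups()
    return product, version, int(build)


def upgrade_warning(line):
    """Return a description if the build is on the list above, else None."""
    _, _, build = parse_version_line(line)
    return AFFECTED_BUILDS.get(build)


if __name__ == "__main__":
    samples = (
        "VMware ESXi 4.1.0 build-260247",
        "VMware ESXi 5.0.0 build-469512",
    )
    for sample in samples:
        warning = upgrade_warning(sample)
        print(sample, "->", warning or "not on the known-affected list")
```

A host that is not flagged is only absent from this short list, so it is still worth checking the compatibility matrix before upgrading vCenter.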

Update (3/4/12): vCenter 5 Not Compatible with ESXi 4.0 update 2

Sean Duffy points out in the comments section below that ESXi 4.0 update 2 (Build 261974) is also not compatible with vCenter 5, but that fact is reflected in the matrix. The KB (2007269) can be found here (thanks Rotem Agmon!):

ESXi 4.0 hosts may experience a purple screen after vCenter Server is upgraded to 5.0

Here is the matrix showing that 4.0 update 2 is not compatible with vCenter 5: