Balance Me!

Thoughts on vSphere DRS and Data Center Load Balancing

I’m a fan of vSphere DRS (Distributed Resource Scheduler) for a number of reasons – it keeps an eyeball on VM entitlements, is very handy for moving around VMs during cluster patch remediation, and generally does a good job at keeping my workloads happy. However, I think there are a few areas where it could use a bit of sprucing up. Take this scenario in my lab, for example:

DRS Resource Distribution

Host esx0 is running nothing (zero VMs), while esx1 and esx2 have about 10 VMs each. All of the VMs are getting the resources they are entitled to, so this is considered a balanced cluster. If all I cared about was host utilization, I’d likely agree, but there are many other dependencies beyond my hosts that are not taken into consideration.

Later, I decide to power on a few more virtual machines.

  1. My Server 2012 template is powered on. DRS chooses esx0, which is running nothing. Good choice, DRS!
  2. I then power on my Server 2008 R2 template. DRS chooses host esx1, which is already running low on resources. Host esx0 is still nearly empty and is only running a single VM. Sad panda πŸ™
  3. Host esx1 is low on resources, so DRS moves my vCenter VM off host esx1 and onto host esx0.
DRS Being Silly

I find the activities above to be inefficient.

Concerns Beyond Host Demands

A few data center concerns:

  • ToR / EoR switch port utilization.
  • Storage network saturation (SAN or IP Storage) across the upstream fabric.
  • Blade enclosure saturation to the upstream network aggregation layer (such as a Cisco IO “FEX” Module, Cisco Fabric Interconnect, or HP Flex-10 module).
  • Front end ports on a storage array, such as WWPNs for FC or LACP hash distribution across NAS interfaces.
  • Quantity of VMs I am comfortable with on a single host for potential failure scenarios (a quick way to check this is sketched below).
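
For that last bullet, here’s one quick way to eyeball VM counts per host with PowerCLI – a throwaway one-liner, assuming the same ‘Lab’ cluster I use in my scripts below:

# Count the VMs currently registered on each host in the cluster
Get-Cluster 'Lab' | Get-VMHost |
    Select-Object Name, @{Name='VMs'; Expression={($_ | Get-VM).Count}}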

There aren’t many ways to address these concerns unless you turn to third party solutions, such as VMTurbo (which I wrote about in this post), Cirba, or the Proactive DRS fling. Or, perhaps, something in PowerCLI?

Balancing by VM Utilization

For giggles, I wrote a rather simple script that balances a cluster based on memory utilization. I did this because I find the “empty host” scenario to be quite frequent in my lab and wanted a way to sort things out. I then tossed it over to vCO (vCenter Orchestrator) as a workflow so that I wouldn’t have to dig around for the script.

I was very happy to see that the Masters of DRS tossed over some very insightful and technical comments as well.

Duncan is right about DRS – it’s making sure that the VMs are receiving the resources they demand from the hosts they run upon, and thus doing what it was designed to do. It’s not meant to balance VMs by any other arbitrary value.

My concern here is that the rest of the data center is largely ignored – see my bullet list in the previous section. Note that I’m not even including other “soft” metrics in my list, such as licensing for guest workloads, or the ability to intelligently place VMs into distinct failure domains without the use of complex rules and groups. Aren’t these other data center concerns also important, beyond whether a host can deliver the resources its workloads demand?

Frank is spot on in that simply balancing VMs based on memory utilization is a very brutish way to do things. It doesn’t take into account a number of metrics that would provide some smart placement suggestions. So, why did I choose memory utilization in my script? It was easy to slap together code that looks at memory utilization, and I’ve found that memory utilization is often the simplest way to eyeball vSphere cluster usage. That’s the perk of having a lab; I can choose to run arbitrary code to meet my specific needs. πŸ˜‰

Distributed Memory Fairness

Another way to peek into fairness is a pair of QuickStats values (here are the data object details): DistributedCpuFairness and DistributedMemoryFairness, which are “represented in units with relative values, meaning they are evaluated relative to the scores of other hosts.”

I’ve pulled them from host esx0 below; both CPU and memory show a raw value of 10000 (divide that by 1000 for a score of 10). As a reminder, this is the host running zero VMs. A raw value of 1000 (an actual score of 1, because 1000/1000 = 1) is considered fair; anything above or below that is less fair.

ESX0 QuickStats

The other hosts, esx1 and esx2, have memory fairness values of 243 (0.243) and 254 (0.254), respectively. Because 10 is much further from 1 than 0.243 or 0.254, host esx0 is much more unfair with its memory utilization than my other hosts are.

Interested in seeing your fairness values? You can pull them out using a small PowerCLI script, such as this one, which populates an array with a series of hashtables.

# Grab the vSphere API view for every host in the 'Lab' cluster
$hostview = Get-VMHost -Location (Get-Cluster 'Lab') | Get-View
$hostall = @()

# Build a hashtable of name, memory usage, fairness, and VM count per host
$hostview | ForEach-Object {
    $hostinfo = @{}
    $hostinfo.Add("Name", $_.Name)
    $hostinfo.Add("MemUsage", $_.Summary.QuickStats.OverallMemoryUsage)
    $hostinfo.Add("MemFairness", $_.Summary.QuickStats.DistributedMemoryFairness)
    $hostinfo.Add("VMs", $_.Vm.Count)
    $hostall += $hostinfo
}
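
From there it’s easy to eyeball the results – here’s a quick (hypothetical) way to dump the array to the console, normalizing the raw fairness values with that divide-by-1000 rule:

$hostall | ForEach-Object {
    # Raw QuickStats fairness values are 1000x the actual score;
    # a score near 1 is fair, anything far above or below it is not
    $fairness = $_.MemFairness / 1000
    "$($_.Name): $($_.VMs) VMs, $($_.MemUsage) MB in use, fairness $fairness"
}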

If you want to hack on my source code for the Balance VMs by RAM script, it’s available on my GitHub page. It’s very rough and tough; it just finds the least loaded ESXi host and then shuffles the smallest VM over to it – more of a proof of concept than anything else. 🙂
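
If you’d rather not dig through the repo, the core idea boils down to something like this sketch – a rough approximation rather than the actual script, assuming the same ‘Lab’ cluster and using host memory usage as the only metric:

# Sort hosts by memory usage: first is the least loaded, last is the busiest
$vmhosts = Get-Cluster 'Lab' | Get-VMHost | Sort-Object MemoryUsageGB
$target  = $vmhosts | Select-Object -First 1
$source  = $vmhosts | Select-Object -Last 1

# Grab the smallest powered-on VM from the busiest host and shuffle it over
$vm = Get-VM -Location $source |
    Where-Object { $_.PowerState -eq 'PoweredOn' } |
    Sort-Object MemoryGB |
    Select-Object -First 1

if ($vm) { Move-VM -VM $vm -Destination $target }

Move-VM handles the vMotion itself, provided the usual requirements (shared storage and compatible networking between the hosts) are in place.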

Thoughts

Keep in mind that hosts only update their QuickStats every 20 seconds or so, so it’s not real-time data. Still, I think it can be of some use for eyeballing the balance of workloads across a cluster, and using memory tends to work fine in a small lab to keep my workloads evenly spread across hosts.

Balance Me!

I think that DRS could stand a few more improvements that wouldn’t necessarily eat the lunch of third party solution providers – namely around the rules and groups functionality, which is very rigid in design and operation, and around the ability to define cluster-level startup automation. Perhaps the DRS team could also add some more insight into what DRS is thinking and expose the reasons behind its decisions to end users. I often find that DRS will show an imbalance for a cluster, refuse to move a workload around to fix the imbalance, and give no output as to what is imbalanced or why no movement is taking place – and that is frustrating.