One of the neat features of VMware vSphere DRS (Distributed Resource Scheduler) is that it can effectively defragment the workloads on your hosts to accommodate an HA failover event. If a failed host was running some very large VMs that need to be restarted, DRS can shuffle smaller VMs around to make room for them.
In this post, I’ll demonstrate a scenario that includes the use of DRS affinity rules and show how DRS takes care of moving workloads around so that the rule is adhered to during an HA failover. There are many real-world applications for this configuration, such as a cluster that uses DRS affinity rules as a constraint for soft partitioning, or a security mandate to keep specific workloads apart from each other.
The Lab Configuration
For this demonstration, I’ve configured a DRS “must” rule to keep Production VMs on Production Hosts. This means that my Production VMs are only allowed to run on hosts in the Production Hosts group; if none of those hosts are available, the VMs will not be powered on.
For this particular exercise I have:
- Production VMs: One VM named “vCenter Server Appliance” with 8 GB of RAM.
- Production Hosts: Two ESXi hosts named ESX1 and ESX2.
- The VM is currently running on host ESX2.
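For reference, the same groups and “must” rule can also be created programmatically. The sketch below uses pyVmomi; only the group and rule names come from this lab, while the vCenter address, credentials, cluster name, and the get_obj() helper are illustrative assumptions (in the lab itself, this was configured through the vSphere client).

```python
# Minimal sketch: create the DRS groups and a mandatory ("must") VM-Host rule.
# Connection details, cluster name, and get_obj() are assumptions for illustration.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

def get_obj(content, vimtype, name):
    """Return the first managed object of the given type with the given name."""
    view = content.viewManager.CreateContainerView(content.rootFolder, vimtype, True)
    return next(obj for obj in view.view if obj.name == name)

ctx = ssl._create_unverified_context()  # lab only; validate certificates in production
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
content = si.RetrieveContent()

cluster = get_obj(content, [vim.ClusterComputeResource], "Lab Cluster")
vcsa = get_obj(content, [vim.VirtualMachine], "vCenter Server Appliance")
prod_hosts = [get_obj(content, [vim.HostSystem], h) for h in ("ESX1", "ESX2")]

# DRS groups referenced by the rule
vm_group = vim.cluster.VmGroup(name="Production VMs", vm=[vcsa])
host_group = vim.cluster.HostGroup(name="Production Hosts", host=prod_hosts)

# mandatory=True makes this a "must" rule: the VMs may only run on group members
rule = vim.cluster.VmHostRuleInfo(name="Production VMs on Production Hosts",
                                  enabled=True,
                                  mandatory=True,
                                  vmGroupName="Production VMs",
                                  affineHostGroupName="Production Hosts")

spec = vim.cluster.ConfigSpecEx(
    groupSpec=[vim.cluster.GroupSpec(info=vm_group, operation="add"),
               vim.cluster.GroupSpec(info=host_group, operation="add")],
    rulesSpec=[vim.cluster.RuleSpec(info=rule, operation="add")])

cluster.ReconfigureComputeResource_Task(spec, modify=True)
Disconnect(si)
```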
Taking a look at the hosts, it’s clear that if ESX2 goes down, ESX1 will not have enough available RAM to power the VM back on.
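A quick way to sanity-check that headroom is to compare each host’s free memory against the VM’s configured memory. This sketch continues in the same Python session as the previous snippet (reusing content, get_obj(), and vcsa), and it only looks at raw quickStats usage, not reservations or HA admission control.

```python
# Continuing in the same session: compare each host's free memory (quickStats)
# against the VM's configured memory. Rough check only; ignores reservations.
vm_mem_mb = vcsa.config.hardware.memoryMB

for name in ("ESX1", "ESX2"):
    host = get_obj(content, [vim.HostSystem], name)
    total_mb = host.hardware.memorySize // (1024 * 1024)
    free_mb = total_mb - host.summary.quickStats.overallMemoryUsage  # both in MB
    print(f"{name}: {free_mb} MB free of {total_mb} MB (VM needs {vm_mem_mb} MB)")
```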
Host ESX2 Failure
At this point I’m going to pull power to ESX2 to simulate a crashed host. A number of activities are going to take place:
- ESX2 will be marked as failed by vSphere HA.
- vSphere HA will attempt to restart the failed VM.
- The DRS affinity “must” rule requires that the VM run on a Production Host; with ESX2 down, ESX1 is the only remaining host in the group.
- ESX1 does not have resources available to power on the failed VM.
- vSphere DRS will be invoked to migrate VMs off ESX1 to make room for the failed VM.
- vSphere HA verifies that ESX1 has the room necessary to restart the failed VM and powers it on.
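One way to confirm this chain of events after the fact is to pull the cluster’s recent events from vCenter, where both the DRS migrations and the HA restart show up. The sketch below continues in the same Python session as the earlier snippets; the 15-minute lookback window is an arbitrary assumption.

```python
# Continuing in the same session: list the cluster's recent events so the DRS
# migrations and the HA restart are visible. Lookback window is arbitrary.
from datetime import datetime, timedelta, timezone

event_filter = vim.event.EventFilterSpec(
    entity=vim.event.EventFilterSpec.ByEntity(
        entity=cluster,
        recursion=vim.event.EventFilterSpec.RecursionOption.all),
    time=vim.event.EventFilterSpec.ByTime(
        beginTime=datetime.now(timezone.utc) - timedelta(minutes=15)))

for event in content.eventManager.QueryEvents(event_filter):
    print(event.createdTime, type(event).__name__, event.fullFormattedMessage)
```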
Live Lab Video
The video below is a live lab example of the actions that occur.