vSphere 5.5 Improvements Part 4 - Virtual SAN (VSAN)

Now that I’ve given you the scoop on all the fancy new vSphere storage improvements, it’s time to move on to the Virtual SAN, commonly referred to as VSAN. To begin with, let’s talk about what the VSAN isn’t. It’s definitely not a repackaging of the whimpering flop that was the VSA (Virtual Storage Appliance), nor is it simply a rebranding of Virsto. VMware has based VSAN on the idea of creating an abstracted pool of storage that looks like a single volume of storage but acts like a virtual hybrid-flash object storage array.

The idea is to populate somewhere between 3 and 8 hosts with at least one locally mounted SSD and one HDD each, so as to create an aggregated datastore across your vSphere hosts. For now, you’re limited to 8 hosts that can participate in the creation of a VSAN, with more hosts on the roadmap for the future.

The neat thing about VSAN is that hosts without any local storage can still consume the VSAN; they just can’t participate in creating it. So you could theoretically have a 6 host cluster where only 5 of the hosts are contributing local disk to the VSAN, while the remaining host has no local storage at all.

There are, of course, some requirements to get this puppy up and running:

Either a dedicated 1 Gb NIC for VSAN traffic, or a shared 10 Gb NIC.
Replication traffic is handled by a new dedicated VSAN vmkernel port, much like you’d assign a vmkernel port to be for vMotion or Fault Tolerance.
The vSphere host’s SAS / SATA controller must support either passthrough or HBA mode – this is where it passes along visibility and control of the disks directly to the hypervisor, rather than masking them off as a logical volume.
At least one SSD and one HDD per host that will contribute to the VSAN, preferably the same mixture on each host (just like you should try to keep your vSphere hosts homogeneous within a cluster).
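
To make the requirements above concrete, here is a minimal Python sketch of an eligibility check. Everything in it (the Host class, its field names, check_cluster) is made up for illustration and is not a VMware API. It only models the disk and controller requirements plus the 3 to 8 contributing-host range; the networking requirements are left out.

```python
from dataclasses import dataclass

# Contributing-host limits as quoted in this post (assumption: they apply
# to hosts that contribute disk, not to diskless consumers).
MIN_HOSTS, MAX_HOSTS = 3, 8


@dataclass
class Host:
    """Hypothetical description of a vSphere host, for illustration only."""
    name: str
    ssds: int = 0
    hdds: int = 0
    passthrough_controller: bool = False  # controller in passthrough / HBA mode

    def can_contribute(self) -> bool:
        # At least one SSD, one HDD, and a controller that exposes raw disks.
        return self.ssds >= 1 and self.hdds >= 1 and self.passthrough_controller


def check_cluster(hosts):
    """Return (ok, contributors, consumers) for a proposed VSAN cluster."""
    contributors = [h.name for h in hosts if h.can_contribute()]
    consumers = [h.name for h in hosts if not h.can_contribute()]
    ok = MIN_HOSTS <= len(contributors) <= MAX_HOSTS
    return ok, contributors, consumers


# The 6 host example from earlier: 5 hosts contribute disk, 1 has none.
hosts = [Host(f"esx{i}", ssds=1, hdds=4, passthrough_controller=True)
         for i in range(1, 6)]
hosts.append(Host("esx6"))
ok, contributors, consumers = check_cluster(hosts)
print(ok, contributors, consumers)
```

The diskless host simply lands in the consumers list, which mirrors the point that it can use the VSAN without helping to create it.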

Creating a VSAN

There are two modes used to create a VSAN: Automatic or Manual. In Automatic mode, any empty (unused) local disk will be claimed by the VSAN. In Manual mode, you pick the disks to add to the VSAN. This is pretty straightforward. You can then turn on the VSAN and perform configuration / monitoring in the vSphere Web Client. Every host that lives in the VSAN cluster has the ability to be a VASA provider, with one host taking on the responsibility at a time. If that host fails, another host becomes the active VASA provider.

It’s a checkbox! I like checkboxes. 🙂

Virtual Machine I/O and Availability

Writes to the VSAN are first handled by the SSD and then flushed down to the HDDs in a serialized manner. Reads from VSAN are pulled from the SSD cache, if available, or come from the HDD if there is a cache miss on the SSD. Thus, the SSD acts as a read cache and write buffer, and does not contribute towards the total VSAN capacity. This is also why I consider it to be a virtual hybrid-flash array – there is a combination of performance (SSD) and capacity (HDD) at work here, just in a distributed and logically virtual configuration.
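
Here is a toy Python model of that I/O path, just to show the flow. It is a sketch under my own assumptions, not how VSAN is implemented: the class and method names are invented, blocks are plain dictionary keys, and populating the SSD after a read miss is my guess at sensible cache behavior.

```python
class HybridDiskGroup:
    """Toy model of one SSD in front of a set of HDDs (illustrative only)."""

    def __init__(self, hdd_capacities_gb):
        self.ssd = {}    # block -> data currently held on flash
        self.hdd = {}    # block -> data persisted on magnetic disk
        self.dirty = []  # blocks written to SSD but not yet flushed, in order
        # Only the HDDs count towards capacity; the SSD adds none.
        self.capacity_gb = sum(hdd_capacities_gb)

    def write(self, block, data):
        # Writes land on the SSD first.
        self.ssd[block] = data
        if block not in self.dirty:
            self.dirty.append(block)

    def flush(self):
        # Later, dirty blocks are flushed down to the HDDs in order.
        for block in self.dirty:
            self.hdd[block] = self.ssd[block]
        self.dirty.clear()

    def evict(self, block):
        # Only clean blocks may leave the SSD, otherwise data would be lost.
        if block not in self.dirty:
            self.ssd.pop(block, None)

    def read(self, block):
        # Serve from SSD when possible, otherwise fall back to the HDD.
        if block in self.ssd:
            return self.ssd[block], "ssd"
        data = self.hdd[block]
        self.ssd[block] = data  # assumption: a miss warms the cache
        return data, "hdd"


group = HybridDiskGroup([1000, 1000])
group.write("b1", "hello")
print(group.read("b1"))   # served from SSD
group.flush()
group.evict("b1")
print(group.read("b1"))   # cache miss, served from HDD
```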

VSAN doesn’t bother with a bunch of clunky RAID beyond striping (and only if necessary), and instead focuses on RAIN (Redundant Array of Independent Nodes). I say good riddance to RAID. Virtual machines can be striped for performance (the data is written to multiple HDDs) and protected by replicas (the data is written to multiple vSphere hosts in the VSAN).

The number of stripes and replicas is controlled via a policy engine, called VM Storage Policies, and is compared against the VSAN capabilities to ensure that the policy can be met. By default, the VSAN works to protect all of the VMDKs by making sure there is a replica on another node. You could also create your own policies. For example, if you create a policy that requires 4 stripes of data, but there are only 2 HDDs in each VSAN host, there’s no way that policy can be met. Here’s a brief list of the supported VSAN capabilities:

Number of disk stripes per object – the quantity of HDDs to stripe data across (for HDD performance).
Number of failures to tolerate – the quantity of hosts, network, and/or disk failures tolerated.
Object space reservation – the percentage of the logical size of a storage object (including snapshots) that should be reserved by thick provisioning.
Flash read cache reservation – the percentage of flash capacity reserved as read cache for the storage object.
Force provisioning – an override to force provisioning even if the policy requirements cannot be satisfied by the VSAN.
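
As a rough illustration of how a policy gets compared against what the VSAN can offer, here is a small Python sketch. The StoragePolicy class and can_satisfy function are hypothetical names of my own, and the rules are deliberately simplified: stripes are checked against the HDDs in a host (as in the 4 stripe example above), and each tolerated failure is assumed to need one extra replica on a separate host. The real placement logic is more involved than this.

```python
from dataclasses import dataclass


@dataclass
class StoragePolicy:
    """Hypothetical subset of the capabilities listed above."""
    stripes_per_object: int = 1     # assumed default
    failures_to_tolerate: int = 1   # default behavior: one replica elsewhere
    force_provisioning: bool = False


def can_satisfy(policy, hdds_per_host, contributing_hosts):
    """Return (provision, problems) under the simplified rules above."""
    problems = []
    if policy.stripes_per_object > hdds_per_host:
        problems.append(
            f"{policy.stripes_per_object} stripes requested, "
            f"only {hdds_per_host} HDDs per host")
    replicas = policy.failures_to_tolerate + 1
    if replicas > contributing_hosts:
        problems.append(
            f"{replicas} replicas needed, "
            f"only {contributing_hosts} contributing hosts")
    # Force provisioning overrides unmet requirements.
    provision = not problems or policy.force_provisioning
    return provision, problems


# The example from the text: 4 stripes requested, 2 HDDs in each host.
print(can_satisfy(StoragePolicy(stripes_per_object=4), 2, 3))
```

Setting force_provisioning=True on the same policy flips the first value to True while still reporting the unmet requirement.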

Host Maintenance Mode

Assuming you’re happy with your VSAN and things are chugging away nicely, you’ll still have to perform host maintenance from time to time. At the very least, this means installing host patches and extensions or upgrading the version of ESXi. There is a new addition to the Maintenance Mode confirmation box that asks how you wish to handle migration of VSAN data. There are three options available:

Ensure accessibility (default) – this will make sure that any VMs that would be made inaccessible by removing the host are relocated to another VSAN host node.
Full data migration – move all data from the VSAN host node to other nodes, which will be handy if you wish to decommission the host.
No data migration – no data is migrated, and if any VMs are only available on this VSAN host node, they will become inaccessible.
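
The difference between the three options is easiest to see with a small example. The Python sketch below is purely illustrative: the mode strings, the plan_evacuation function, and the placement map (object name to the set of hosts holding a copy) are all my own inventions, not anything exposed by vSphere.

```python
def plan_evacuation(mode, host, placement):
    """Work out what happens to VSAN data when `host` enters maintenance mode.

    placement maps an object name to the set of hosts holding a copy of it.
    Returns the objects to move off the host and those left inaccessible.
    """
    on_host = {obj for obj, hosts in placement.items() if host in hosts}
    only_here = {obj for obj in on_host if placement[obj] == {host}}

    if mode == "ensure_accessibility":
        # Move only what would otherwise become unreachable.
        return {"move": only_here, "inaccessible": set()}
    if mode == "full_data_migration":
        # Move everything, e.g. before decommissioning the host.
        return {"move": on_host, "inaccessible": set()}
    if mode == "no_data_migration":
        # Move nothing; objects that live only here go offline.
        return {"move": set(), "inaccessible": only_here}
    raise ValueError(f"unknown mode: {mode}")


placement = {
    "vm1": {"esx1", "esx2"},   # has a replica elsewhere
    "vm2": {"esx1"},           # lives only on esx1
    "vm3": {"esx2", "esx3"},   # not on esx1 at all
}
for mode in ("ensure_accessibility", "full_data_migration", "no_data_migration"):
    print(mode, plan_evacuation(mode, "esx1", placement))
```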

VSAN Support

So the sad news is that VSAN does not support those monster-sized vSphere 5.5 62 TB VMDKs, vCloud, or Horizon View today. Sad panda! But I would imagine that those looking to use any of those three features today probably have a more robust set of hardware to rely upon anyway.

Here’s the full cheat-sheet of supported features.