In Which I Muse Over VMware’s Virtual SAN Architecture

Back at the VMworld 2013 conference in San Francisco, I took part in an “Ask the Expert VCDX” panel that allowed participants to pepper me and four other VCDX holders with questions. And while I do see the humor in the redundancy of the word expert (the “X” in VCDX is short for expert), there was nothing funny about the awesome questions we received. One of them was about Virtual SAN – or VSAN, as it is called – and where it stood to play in the data center space. Or, in fact, if it was more of an SMB / SME play to fulfill smaller, niche deployments.

It’s difficult to condense my thoughts on that question into the 60-second time frame I had back then, so I wanted to expand on them in a blog post. I encourage you to share your thoughts on the matter, as I am not trying to be “right” but rather to share my personal ideas.

What’s In a Name?

To clear the air, I really dislike the name “Virtual SAN” for many reasons. Both the name “Virtual SAN” and the acronym VSAN are already taken and quite well known to anyone who works with Cisco SAN equipment (and most likely many who do not). We now have a Cisco VSAN as a way to logically carve up a SAN fabric and a VMware VSAN that provides server-side storage. Confusion is bad, and now all the search queries I use on Google to reference the old VSAN are clobbered with page upon page of VMware VSAN links. (I have the same beef with VMware SRM and EMC SRM.)

The term “SAN” is often abused to mean “the storage array” – so much so, in fact, that I tend to keep my trap shut when it’s used that way. But a SAN is a Storage Area Network that provides SCSI access to a storage array. It is not the storage array itself.

VMware’s Virtual SAN is a redundant array of storage nodes housed inside the compute layer, where the network is mainly there to replicate bits of data from one node to another. The whole point is to provide local access to meaningful data for the workload running on the server as often as possible. The workload should have a copy of its data located directly inside the host and should only wait for data from another host when it is not available locally. I would argue that is very un-SAN like, but I will admit we’re just playing a semantics game at this point.
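
To make that idea concrete, here is a toy model of the read path in Python – a sketch of the behavior described above, not VMware’s actual implementation (the object name and host names are hypothetical):

```python
# A toy model of the read path described above; not VMware's implementation.
# Each object has replicas on a set of hosts, and a read is served locally
# when a replica lives on the same host as the workload.

replica_map = {"vm-disk-01": {"esx01", "esx03"}}  # object -> hosts holding a copy

def read(obj: str, running_on: str) -> str:
    hosts = replica_map[obj]
    if running_on in hosts:
        return f"served locally on {running_on}"  # no network hop needed
    remote = sorted(hosts)[0]
    return f"fetched over the network from {remote}"  # fall back to a remote copy

print(read("vm-disk-01", "esx01"))  # served locally on esx01
print(read("vm-disk-01", "esx02"))  # fetched over the network from esx01
```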

[symple_box color="yellow" text_align="left" width="100%" float="none"]
If I were in charge, it would be the SpongeBob Storage System or “S3” for short – er, wait, that’s taken too, isn’t it? This is hard. 🙁
[/symple_box]

The Death of VSA

The VMware Virtual Storage Appliance (VSA) was a bit of a flop in the market, as it was woefully expensive for its intended target audience (smaller organizations) and required three hosts. Most of the small shops I work with run two hosts, and even larger organizations in the SME (Small / Medium Enterprise) space often run only two at their remote offices. It was almost always cheaper to just acquire a DAS solution, like the HP MSA, and use SAS connections for a cheap shared storage layer. Or to acquire a multi-node system with a shared backplane and front-side disk bays (think the Dell VRTX, but older and a bit clunkier).

Hearing at VMworld US that VSAN would ultimately replace VSA is a good thing. But it sent the wrong message: it made people think VSAN = VSA = for small business.

The idea of using VSAN for SMB environments could fulfill some niche use cases, but I doubt that is the end game – namely because of the same 3-host requirement (and 4 hosts if you ask Duncan Epping), which stems from needing two copies of the data plus a witness, each on its own host, to survive a failure. I worked for a handful of SMBs, and we typically had 2 hosts for HA (one place did have 3 hosts due to growth).
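
The back-of-the-napkin math behind that requirement is easy to sketch. A minimal example, assuming the 2n+1 placement rule (n+1 data replicas plus n witnesses, each on its own host) that a mirroring approach like VSAN’s implies:

```python
# Minimum host count for VSAN-style mirroring; illustrative only.
# Assumes tolerating n host failures takes n+1 data replicas plus
# n witness components, each placed on a distinct host (the 2n+1 rule).

def minimum_hosts(failures_to_tolerate: int) -> int:
    return 2 * failures_to_tolerate + 1

for ftt in (1, 2, 3):
    print(f"FTT={ftt}: {minimum_hosts(ftt)} hosts minimum")
# FTT=1 -> 3 hosts, which already rules out the typical 2-host SMB shop.
# Add a spare host on top if you want to rebuild redundancy after a
# failure, hence the "4 hosts if you ask Duncan Epping" advice.
```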

There’s a truckload of documentation coming out on VMware VSAN – and what beleaguered SMB admin wearing 17 hats will have time to read any of it? Aren’t there easier ways to satisfy the storage needs of the SMB market – from the lower-end HP and Dell arrays to the upper-end Synology and QNAP arrays – or are we writing these solutions off as “too difficult” for the jack-of-all-trades admin to implement?

[symple_box color="yellow" text_align="left" width="100%" float="none"]
Note: I was this guy at one point, so I tend to waffle on how difficult shared storage is to design and consume.
[/symple_box]

Forward Thinking Use Cases

As I discussed at the VMworld panel, the idea of distributed server-side storage is nothing new. I recall visiting the Nutanix office years ago to hear about their product, which accomplishes a similar goal at a high level (and via their own hardware platform). SimpliVity ships a similar architecture with its own twists. Scale Computing is another contender, aimed squarely at the SMB and mid-market space, that has been on the market for quite some time and doesn’t use vSphere. If anything, the introduction of VSAN validates a growing appetite among SME and enterprise consumers for an architecture with no external storage array for virtual workloads to run on. And I’ve talked with other vendors, still in stealth, who are going to continue to push the envelope.

[symple_box color="yellow" text_align="left" width="100%" float="none"]
Note: I still absolutely feel that many consumers, if not all of them, will still have some sort of external array for archive, backup, file servers, or other “big stuff that is cold” storage.
[/symple_box]

One major advantage that VMware VSAN brings to the table today is the ability to let consumers choose their own compute platform. You’re still limited to VMware vSphere – which in my mind isn’t necessarily a bad thing, though I’m sure other vendors will tout multi-hypervisor compatibility as a checkbox item. It gives you the choice to design with just about any rackmount compute platform you own or desire to purchase. I don’t really see this as a play for the blade server market due to the very limited number of drive bays, but people may end up proving me wrong.

I would also give VMware VSAN the nod for being relatively easy to deploy and consume in the right scenario. Having worked with it, I can say there are definitely scenarios where it can get complex in a hurry (having existing disks in use, using manual assignments, picking the right balance of HDD to SSD, and so on). But there isn’t much “carving” to do on the disk pool, which is nice.
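
On that HDD-to-SSD balance point, a rough sanity check is easy to script. A minimal sketch, assuming the commonly cited guideline of sizing flash at roughly 10% of anticipated consumed capacity – verify against the current VSAN documentation before leaning on it:

```python
# Illustrative disk group sizing check, not an official VMware tool.
# Assumes the oft-quoted guideline that flash should be ~10% of the
# anticipated consumed capacity the disk group will serve.

def flash_ratio(ssd_gb: float, consumed_hdd_gb: float) -> float:
    """Ratio of SSD capacity to anticipated consumed HDD capacity."""
    return ssd_gb / consumed_hdd_gb

ssd, hdd = 200.0, 2000.0  # hypothetical: one 200 GB SSD fronting 2 TB consumed
ratio = flash_ratio(ssd, hdd)
print(f"{ratio:.0%} flash-to-consumed ratio")  # 10%, right at the guideline
if ratio < 0.10:
    print("Consider a larger SSD or fewer/smaller HDDs in this disk group.")
```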

Thoughts

Much like the back-and-forth swings between client-server and mainframe, it seems we’re once again swinging the pendulum of technology. Having an external shared storage array was once lauded as the revolution that would solve all of our problems – until virtualization shone a pretty strong light on its weak spots.

Does an entirely new architecture need to dominate the rest, or can we have some data centers that go with server-side storage and others with shared array storage? Or a data center with a hybrid mixture of the two? I’d say there are enough workloads that play well in both designs that it’s too early to tell, but I certainly like seeing all the innovation around solving storage bottlenecks and all the fun places to stick spinning rust, flash, and DRAM. Perhaps the location of the storage (server-side vs. external array) will become the new tiering?

[symple_box color="red" text_align="left" width="100%" float="none"]
Do you have big plans for VMware VSAN once it exits the beta stage?
[/symple_box]