There’s been a lot of interesting discussion floating throughout the web on PernixData, a recently “unstealthed” startup (read some great day zero posts by Duncan Epping, Frank Denneman, or Howard Marks) that promises to deliver some very interesting results using server-side SSD drives and PCIe cards. I was curious to see how this technology worked and wanted to share with you all here, so I requested a technical demonstration at VMware Partner Exchange (PEX) with two sharp Systems Engineers on the PernixData team, Andy Daniel and Charles Gautreaux, to answer questions and show a live environment that used their technology.
My main concern was over data protection in the flash tier, as the challenge for most designs revolves around the loss of the local hardware, but also over how the solution was presented to the vSphere hosts. Have no fear; the team has some very creative ways of squashing my concerns, so let’s just dig right in.
What Makes This Technology Different?
To begin, let’s discuss the technology behind PernixData’s FVP (Flash Virtualization Platform). The idea is to virtualize the server-side, local flash storage and treat it as a virtual storage pool across the physical hosts for use by data in motion. As flash devices are discovered in the vSphere hosts, FVP can expand the virtual flash pool to incorporate new devices or new hosts regardless of the make or model, making it a solid scale-out technology. From an administrative perspective, the virtual flash pool does not appear as a datastore to the vSphere hosts – frankly, that would be messy to try and manage.
Instead, you get to choose which traditional datastores will be accelerated by the virtual flash pool and FVP handles the rest.
Write-Through vs Write-Back
To make things a bit more awesome, the virtual flash pool is not just for caching reads. It also handles writes in two modes: write-through and write-back. If you’ve not dealt with these terms before, the main difference is where the acknowledgement comes in for a write.
With write-through, a virtual machine writes to the flash, which then sends the writes back to the hard disk and has to wait for confirmation that the write was successful. This is very common for any standalone flash device because the device will fail at some point, and unflushed data could otherwise be stranded on the card. It’s also why your typical RAID controller card will have a battery backup – to make sure that a power interruption won’t destroy the unacknowledged writes before they can be flushed to disk. Write-through mode is much slower: having to wait for the block to go to hard disk means you’re still operating at hard disk latency.
FVP has the ability to do write-back mode, which means that the flash device will acknowledge the write as being successful before it is ever sent to the hard disk. PernixData can do this because all writes done to a write-back accelerated workload will be replicated to two or more peers in the virtual storage pool. This protects the data from loss in the event that any single flash device fails.
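To make the difference concrete, here is a minimal Python sketch of the two acknowledgement paths. It is a toy model and not PernixData's implementation: the latency constants, the `FlashDevice` class, and the function names are all made up for illustration, and the peer writes are assumed to happen in parallel.

```python
from dataclasses import dataclass, field

# Illustrative latencies in milliseconds; assumed values, not measurements.
FLASH_MS = 0.1
NETWORK_MS = 0.2
DISK_MS = 8.0


@dataclass
class FlashDevice:
    """Hypothetical stand-in for one host's flash device."""
    name: str
    blocks: dict = field(default_factory=dict)


def write_through(local, disk, key, value):
    """Write to flash and disk; acknowledge only after the disk confirms."""
    local.blocks[key] = value
    disk[key] = value
    return FLASH_MS + DISK_MS  # the VM waits at hard disk latency


def write_back(local, peers, pending, key, value):
    """Write to local and peer flash, acknowledge, and destage later."""
    local.blocks[key] = value
    for peer in peers:
        peer.blocks[key] = value  # replica protects against a device failure
    pending.append((key, value))  # not yet on disk
    # Assumption: peers are written in parallel, so one hop plus one flash write.
    replication_ms = NETWORK_MS + FLASH_MS if peers else 0.0
    return FLASH_MS + replication_ms


def destage(pending, disk):
    """Flush acknowledged writes to the backing disk in arrival order."""
    while pending:
        key, value = pending.pop(0)
        disk[key] = value
```

In this model the write-back acknowledgement never includes `DISK_MS`, which is the whole point: the VM sees flash and network latency, and the disk catches up when `destage` runs.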
I’ve drawn an example graphic below showing a simple pair of ESXi hosts in a cluster with FVP installed. Both have a single SSD flash drive installed and participating in the virtual flash pool.
To make things easy, let’s look specifically at VM2. As data moves between VM2 and the LUN (shown with gold arrows), the FVP layer acts as a read and write-back cache while also mirroring the write-back data to another host node (shown with a white arrow). This is done over the vMotion network by default, but you can change this to use a different network, or make a dedicated network, if desired.
If the virtualization administrator or DRS migrates VM2 from host ESXi A to B with a vMotion, the flash will already contain the hot write-back data and performance should remain relatively high. In the case of write-through data, FVP will allow the host to read from a peer host’s flash tier to avoid going back to the SAN until the local flash has warmed up.
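The read behavior after a vMotion can be sketched the same way. This is my own simplified picture of the lookup order described above (local flash, then a peer's flash, then the SAN), using plain dictionaries as hypothetical stand-ins for each tier. Copying a peer hit into local flash is my assumption about how the local cache warms up, not something confirmed in the demo.

```python
def read_block(key, local_flash, peer_flashes, san):
    """Return (value, source), checking the cheapest tier first."""
    if key in local_flash:
        return local_flash[key], "local flash"
    for peer in peer_flashes:
        if key in peer:
            value = peer[key]
            local_flash[key] = value  # assumption: a peer hit warms the local cache
            return value, "peer flash"
    value = san[key]  # last resort: go back to the array
    local_flash[key] = value
    return value, "SAN"


# VM2 has just moved from host A to host B, whose flash is still cold.
san = {"blk": b"data"}
host_a_flash = {"blk": b"data"}
host_b_flash = {}
first = read_block("blk", host_b_flash, [host_a_flash], san)
second = read_block("blk", host_b_flash, [host_a_flash], san)
```

Here the first read on host B is served from host A's flash and the second from host B's own flash, so the SAN is never touched for that block.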
One of my favorite things about PernixData is how incredibly simple it is to install and configure. vSphere Admins will rejoice to learn that it boils down to pushing a VIB file via VMware Update Manager (VUM). This can be done while the host is online and without a reboot. Upgrades should be a straightforward set of steps: put the host in maintenance mode, push out the new VIB, upgrade, and leave maintenance mode. I brought up the idea of trying to do this with a live host during a conversation with Raj Jethnani, VMware SE in Chicago, but we all agreed it would be risky at best to try and fiddle with a live storage system with active workloads. 🙂
The post-installation configuration is also very easy to do. While the end-state GUI is not yet finalized, it was essentially a process of checking the box next to each SSD and/or PCIe card to add it to the flash pool and then choosing which datastores to accelerate. You can then easily see an array of statistics, broken out by the FVP flash tier, the SAN, and a total, including IOPS, latency, throughput, and active VMs.
Here’s a sneak peek at the IOPS counter during my live demonstration. All of the reads and writes from the IOMeter test workload were being served from local flash (orange) with only a trickle hitting the back-end disk (light purple line along the X axis). There’s obviously a bit of warm-up time for the cache, and your IOPS value will fluctuate depending on how powerful your flash device is.
I’m impressed with how far this company has come after just a brief demonstration at VMware PEX. I especially admire how simple the solution is for an end user. There was obviously a lot of thought put into the process and input from those who have to manage a vSphere environment. We also gave it a pounding with the VMware I/O Analyzer tool and saw some impressive numbers, and I’m looking forward to taking that to the next step and putting some realistic workloads on it to see where it shines.
I’d also like to mention that PernixData will be presenting at the upcoming Storage Field Day 3, April 24 – 26, in Denver, Colorado. Make sure to tune into the live stream and follow the information that will be pouring out of the fingertips of all the delegate bloggers (myself included!).