The server-side caching space is an area I've been paying close attention to for accelerating virtualized workloads. Beyond the cool factor, accelerating IO at the server layer seems like a smart play for any IT shop looking to extend its capital investment in compute and storage, while also reducing the operational expense and overhead of chasing the performance needle-in-a-haystack that comes with many net-new workload types. It reminds me of the days when virtualization was still new and fresh, and consolidation became an alluring proposal for reasons much like those above.
Based on reader feedback asking me to chat with Proximal Data, I spent some time a few weeks back talking with the Proximal Data team about AutoCache 1.1 (released in early May). The main focus of our conversation was to educate myself (and by proxy, you) on the offering, while also asking the hard questions about the evolving server-side cache market and how they plan to differentiate from other offerings.
The AutoCache Magic
Today, Proximal Data is focused on being a strong caching solution in the hypervisor layer. Much like other solutions I’ve written about, AutoCache can leverage just about any PCIe or SATA based SSD device inside of a server. However, the company has established a Hardware Compatibility List (HCL) of devices it recommends based on performance. This makes sense, as not all flash devices are created equal, and some will deliver peppier performance numbers than others. Proximal Data works off the premise of a “do no harm” methodology toward the workload, and strives to ensure that “it does not add to administrative load, does not negatively impact performance when it cannot accelerate it, and does not consume significant amounts of the very compute resources it is trying to make available for further consolidation” – I’ll go deeper into this later, but suffice it to say that providing read cache, write-through cache, and write-around cache all supports this strategy.
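To make the three cache modes concrete, here’s a minimal sketch in Python of how read caching, write-through, and write-around behave. This is my own illustration of the general techniques, not Proximal Data’s implementation.

```python
# A minimal sketch (not Proximal Data's code) of the three cache policies
# described above: read caching, write-through, and write-around.

class FlashCache:
    def __init__(self):
        self.cache = {}  # block id -> data held on the flash device

    def read(self, block, backing_store):
        # Read cache: serve hot blocks from flash, fall back to the array.
        if block in self.cache:
            return self.cache[block]      # cache hit
        data = backing_store[block]       # cache miss -> array read
        self.cache[block] = data          # warm the cache for next time
        return data

    def write_through(self, block, data, backing_store):
        # Write-through: the write lands on the array AND the cache, and is
        # only acknowledged once the array has it -- "do no harm" to the data.
        backing_store[block] = data
        self.cache[block] = data

    def write_around(self, block, data, backing_store):
        # Write-around: the write bypasses the cache entirely; any stale
        # cached copy is invalidated so later reads fetch fresh data.
        backing_store[block] = data
        self.cache.pop(block, None)
```

The key point for “do no harm” is that in both write modes the backing array always receives the write, so the cache never holds the only copy of the data.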
AutoCache integrates rather seamlessly with VMware vCenter and supports all of your favorite VMware features – file (NFS) and block storage, vMotion, HA, DRS, ESXi 4.1 through 5.1, etc. – which has become table stakes for these sorts of solutions. A new tab, named AutoCache, appears in the vSphere Client. It provides a large volume of very helpful data – such as cache usage, the amount of reads and writes entering the system, cache hit percentage, and much more. This is great for giving the IT folks performance details and showing savings and value to the business folks.
Who’s Hot, and Who’s Not
Flash inside of a server is a limited and precious resource, and it’s really up to the caching software to determine which blocks are hot and need to be cached. Unlike a simple cache that relies on most recently used (MRU) or most frequently used (MFU) tracking, AutoCache has a number of different indexing algorithms at its disposal, because the team feels no one algorithm is the best all the time. The metadata index is kept entirely in server RAM, with an advertised footprint of roughly 0.1% of the managed flash device capacity – about 500 MB of RAM for a 512 GB flash device. That’s an incredibly small footprint for any virtualization server to worry about.
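As a quick sanity check on that footprint claim, the back-of-the-envelope arithmetic looks like this (my own math, using the roughly 0.1% ratio rather than any vendor-provided formula):

```python
# Back-of-the-envelope check on the RAM footprint claim above.

def index_ram_mb(flash_gb, ratio=0.001):
    """RAM consumed by the metadata index, assuming ~0.1% of flash capacity."""
    return flash_gb * 1024 * ratio  # GB -> MB, then apply the ratio

print(index_ram_mb(512))  # ~524 MB for a 512 GB device
```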
There are some rather large advantages to keeping the index in RAM rather than on the flash device itself. First, RAM access is typically an order of magnitude or more faster. Second, not having to hit the flash device for metadata queries means less overhead on the device, and thus more performance headroom available for workloads. As an added bonus, avoiding extra writes to the flash device extends its lifespan.
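The design can be sketched as a plain in-memory hash table: a metadata lookup is a pure RAM operation, and the flash device is only touched when cached data is actually read. Again, this is a generic illustration of the concept, not Proximal Data’s actual index structure.

```python
# Generic sketch of a RAM-resident cache index: metadata lookups never
# touch the flash device, so flash I/O is spent only on cached data.

class RamIndex:
    def __init__(self):
        self.map = {}       # logical block -> offset on the flash device
        self.flash_ios = 0  # count of I/Os that actually hit flash

    def lookup(self, block):
        # Pure RAM operation: no flash read, no flash wear.
        return self.map.get(block)

    def read_cached(self, block, flash_read):
        offset = self.lookup(block)
        if offset is None:
            return None       # not cached; caller goes to the array
        self.flash_ios += 1   # only the data read touches the device
        return flash_read(offset)
```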
If a workload feels the need to migrate via vMotion, AutoCache will initiate a pre-warming activity that prepares the target server for the virtual machine. The source host will invalidate the hot data once the virtual machine has successfully moved over, which immediately frees up capacity on the source device for any other workloads that may benefit. I was given a preview of just how the pre-warm vMotion sausage is made, though I’m not sure the details are public knowledge. Let’s just say that I approve of the method and find it rather clever and safe. 🙂
But Why Not Write-Back Caching?
This was my million-dollar question. Fortunately, I had the opportunity to ask the CEO, Rory Bolt, this exact question during our “Considerations for Caching with Flash in the Hypervisor” chat. It sounds like the largest challenge is properly protecting the workload’s data and performance – back to the “do no harm” ideal from earlier in this post. Write-intensive workloads will ultimately have to flush to disk at some point or, at the very least, be acknowledged by a mirror host holding an off-box copy of the data elsewhere on the network. There are penalties here – the round-trip time (latency) of acknowledging with that off-box host and the bandwidth needed to transmit the bits.
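A hypothetical example with made-up numbers shows why that off-box acknowledgment matters: even if the local flash write is fast, the acknowledged-write latency is bounded by the slower of the local write and the network round trip to the mirror host.

```python
# Rough illustration (hypothetical numbers, not vendor measurements) of the
# write-back penalty: a write is not safe until a mirror host on the network
# acknowledges its copy, so the network round trip sets a floor on latency.

def ack_latency_us(local_flash_us, network_rtt_us):
    # The local flash write and the mirror copy can proceed in parallel;
    # the ack goes back only once the slower of the two completes.
    return max(local_flash_us, network_rtt_us)

# A 100 us flash write still costs 500 us when the mirror RTT is 500 us.
print(ack_latency_us(local_flash_us=100, network_rtt_us=500))  # 500
```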
From a data perspective, write-back caching introduces a few caveats when trying to perform backups or replication. Because the storage array does not yet have the dirty pages waiting to be flushed, it is out of sync with the reality on the virtual machine. Any solution that sits between the hypervisor and the storage array would need to be in the loop on any storage snapshot or LUN replication jobs. Even a seemingly simple workaround, such as scripting a temporary swap into write-through mode during a backup or snapshot operation and back again afterward, can be non-trivial.
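That scripted workaround might look something like the following sketch. Note that `set_cache_mode` is a hypothetical management call I’ve invented purely for illustration; it is not a documented API of AutoCache or any other product.

```python
# Sketch of the backup-time mode swap described above, using a hypothetical
# set_cache_mode() management call (invented for illustration).

from contextlib import contextmanager

@contextmanager
def write_through_during_backup(set_cache_mode):
    # Drop to write-through so the array is in sync, take the snapshot,
    # then restore the original policy even if the backup fails.
    set_cache_mode("write-through")
    try:
        yield
    finally:
        set_cache_mode("write-back")

# Usage:
# with write_through_during_backup(set_cache_mode):
#     run_array_snapshot()
```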
Mr. Bolt did say that write-back caching is ultimately on the road map and will be offered by Proximal Data at some point – he wanted to make everyone aware of the trade-offs first.
I had a great discussion with the folks at Proximal Data. Their use of several indexing algorithms for handling the metadata, along with the decision to keep that metadata in RAM at such a small size, strikes me as a sharp idea. They also offer a very competitive price point that should make adding them to most commercial and enterprise designs, at least for a subset of hosts, relatively trivial once the OpEx and CapEx benefits are weighed. I look forward to seeing more announcements from this team as future builds are released.