Transparent Page Sharing Vulnerable, Yet Largely Irrelevant

There’s been a good deal of fuss over the new Transparent Page Sharing (TPS) vulnerability that researchers have found, and for good reason. Under the correct circumstances, data can be stolen, which is sad panda. Eric Siebert’s detailed post on the how and why is worth a read. However, I lean towards disagreeing that this is a “big deal” for the vast majority of vSphere shops.

While the vulnerability itself is certainly something to take notice of, the reality is that TPS died off quite a long time ago with the use of large memory pages in the hypervisor. Specifically:

In hardware-assisted memory virtualization systems, ESX will preferentially back guest physical pages with large host physical pages (2MB contiguous memory region instead of 4KB for regular pages) for better performance. If there is not a sufficient 2MB contiguous memory region in the host (for example, due to memory overcommitment or fragmentation), ESX will still back guest memory using small pages (4KB). ESX will not share large physical pages because:

  • The probability of finding two large pages that are identical is very low.
  • The overhead of performing a bit-by-bit comparison for a 2MB page is much higher than for a 4KB page.

However, ESX still generates hashes for the 4KB pages within each large page during page scanning. (source)
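To make the quoted behavior concrete, here is a minimal sketch of hash-based page matching in general: carve a 2MB large page into 4KB pages, hash each one, and only treat pages as shareable after a full byte comparison confirms the match. This is an illustration, not ESX's implementation. The SHA-1 hash and the names `split_small_pages` and `count_shareable` are assumptions made up for this example.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List

SMALL_PAGE = 4 * 1024          # 4KB regular page
LARGE_PAGE = 2 * 1024 * 1024   # 2MB large page


def split_small_pages(large_page: bytes) -> List[bytes]:
    """Carve one 2MB large page into its 512 constituent 4KB pages."""
    if len(large_page) != LARGE_PAGE:
        raise ValueError("expected exactly one 2MB large page")
    return [large_page[i:i + SMALL_PAGE]
            for i in range(0, LARGE_PAGE, SMALL_PAGE)]


def count_shareable(pages: List[bytes]) -> int:
    """Count pages that duplicate an earlier page and could be collapsed.

    The hash is only a cheap hint (SHA-1 is an assumption for this sketch);
    a full byte comparison confirms the match before a page is counted.
    """
    buckets: Dict[bytes, List[bytes]] = defaultdict(list)
    shareable = 0
    for page in pages:
        digest = hashlib.sha1(page).digest()
        candidates = buckets[digest]
        if any(page == seen for seen in candidates):
            shareable += 1           # identical copy already tracked
        else:
            candidates.append(page)  # first time this content is seen
    return shareable


# A zero-filled large page: every 4KB page after the first is a duplicate.
zero_pages = split_small_pages(bytes(LARGE_PAGE))

# A large page where every 4KB page has different content: nothing to share.
unique_pages = split_small_pages(
    b"".join(i.to_bytes(2, "big") * (SMALL_PAGE // 2) for i in range(512))
)
```

Comparing 4KB chunks is cheap and matches are common (zeroed pages, for instance), while two entire 2MB regions being identical is far less likely, which is the trade-off the quoted passage describes.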

In reality, most everything released in the past decade supports (and encourages) large memory pages – heck, here’s an issue of MSDN Magazine discussing it for Windows Server back in 2001. This is simple to view in your vSphere environment – just look at the shared common memory value in any ESXi host’s advanced performance chart for memory. It will likely be an incredibly small and irrelevant number with a seemingly flat line. Or, read this vSphere 5.5 pub doc that talks more about the various memory usage values.

Here’s an example on an ESXi host running 11 VMs in the lab. Of the 2.59 GB (2,721,304 KB) of active memory, about 35 MB (36,508 KB) of it is shared via TPS. Turning off TPS would result in needing about 1% more ESXi host memory for the guests to munch on. Not a huge deal by most standards.
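If you want to run the same back-of-the-envelope math on your own hosts, a quick sketch follows. The two KB values are the ones from the lab host above; the variable names are just placeholders for this example, and the conversion assumes 1024-based units.

```python
# Values read off the lab host's memory chart, in KB.
active_kb = 2_721_304   # active memory
shared_kb = 36_508      # shared common memory (TPS savings)

KB_PER_MB = 1024
KB_PER_GB = 1024 * 1024

active_gb = active_kb / KB_PER_GB
shared_mb = shared_kb / KB_PER_MB

# Extra memory needed if TPS were turned off, as a share of active memory.
extra_pct = shared_kb / active_kb * 100

print(f"active: {active_gb:.3f} GB, shared: {shared_mb:.1f} MB, "
      f"extra without TPS: {extra_pct:.2f}%")
```

For this host the result lands a little over 1%, which is where the "about 1%" figure comes from. Swap in the numbers from your own performance chart to see where you stand.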

Shared Common Memory

But, For Memory Over-Commitment …

If you’ve bet the farm on memory over-commitment, or have disabled large pages for deployments within a highly-shared memory environment, then you’re due for a risk versus reward inspection. Weigh the risk of leaving TPS enabled against the CapEx investment needed to add memory to your hosts. And, as always, watch out when you patch. 🙂

Both Josh Odgers and Frank Denneman have written their thoughts on their blogs, which I encourage reading.