A Discussion on Boot From SAN for vSphere

Boot from SAN is the practice of booting a server from your SAN environment instead of local disks. It’s often lauded as a great feature for a server environment, as the most common failure point in any server is typically the hard drive(s). By shifting that risk to a SAN, which is already highly redundant, monitored, and cared for, the thought is that you better protect your server infrastructure. An additional benefit for converged blade infrastructure (such as HP BladeSystem or Cisco UCS) is the ability to make a blade stateless – the blade can be replaced while retaining its host identity, as it simply reaches out to the SAN to boot up.

While these are all great benefits, what about the costs, complexity, and risks? Also, how realistic are these benefits for a VMware environment? Is Boot from SAN really the best way to go for a vSphere host? In this post, we’ll examine these concerns for a vSphere host and see just what makes sense.

Note: I’m specifically leaving out VMware’s Auto Deploy technology in this discussion, as I feel it is worth its own post.

No Free Lunch

Booting from SAN isn’t a check box that you click. As with most anything in the IT world, it requires a decent amount of architecture and design work. Here are some things to consider:

  • Is your SAN currently configured for the zoning necessary to provide each host access to the boot LUN?
  • Have you come up with a design on how you want boot LUNs to be accessed over your SAN? You want to make sure paths will not be saturated.
  • Do you have storage that has the capacity and throughput to be shared for boot LUNs, or available disks to create an isolated storage pool for your boot LUNs?
  • Have you designed your storage array to properly present the boot LUNs to each host? Typically, LUN 0 is presented as the boot LUN, so it’s wise not to present any other LUN as 0 unless you are going to pick a different LUN number (assuming your booting HBA supports this).
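To make the zoning consideration a bit more concrete, here is a sketch of what a single-initiator zone for one host’s boot path might look like on a Cisco MDS fabric. The zone name, VSAN number, and both WWPNs are hypothetical placeholders – your fabric’s naming convention and addressing will differ:

```
! Hypothetical single-initiator zone: one host HBA port, one array target port
zone name ESX01_HBA1_ARRAY_SPA vsan 10
  ! host HBA initiator port (placeholder WWPN)
  member pwwn 20:00:00:25:b5:aa:00:01
  ! array target port (placeholder WWPN)
  member pwwn 50:06:01:60:3c:e0:12:34
```

Multiply this by every HBA port on every host (plus the redundant fabric), and the per-host configuration burden of boot from SAN starts to become visible.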

This isn’t the end-all-be-all of considerations, but it’s a start. Assuming you’re up to the challenge of properly presenting storage to each host, you’re in the clear. The costs (time, effort, equipment) and complexity (additional layer of configuration and management) of the above are all introduced with boot from SAN.

The major risk with booting from SAN is that you place your entire set of hypervisors into the hands of the array. Depending on your perspective, this can be a good thing (you’re a SAN ninja) or a bad thing (your SAN is a scary place that you don’t like to visit). If the array suffers storage contention on your boot LUNs, or someone goofs up the fabric, you can potentially lose your entire vSphere environment. As a counterpoint, I’m going to assume that your virtual machines are also running on the array, so having your hypervisors crash may be of little consequence if all of your VMs crash with them.


Level of Effort

One of the most tangible results of booting from SAN is the ability to quickly recover from catastrophic hardware failure. You can slap a new host into the environment and it will boot up and appear as the old vSphere host. But then again, we’re talking about a vSphere host – how unique is that when you really look at it? The level of effort to re-install ESXi onto a server, and then give it an IP, is relatively low. With the use of host profiles, distributed switches, and/or kickstart scripts, a lot of the remaining configuration work is already done.
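As a rough illustration of just how little unique state an ESXi host carries, here is a minimal scripted-install kickstart file of the kind mentioned above. All IPs, the hostname, and the password are hypothetical placeholders, and a real file would be tailored to your environment:

```
# ks.cfg — minimal ESXi scripted install (all values are placeholders)
vmaccepteula
rootpw VMware123!
# Install to the first local disk, overwriting any existing VMFS volume
install --firstdisk --overwritevmfs
# Static management network identity for this host
network --bootproto=static --ip=192.168.10.21 --netmask=255.255.255.0 --gateway=192.168.10.1 --nameserver=192.168.10.5 --hostname=esx01.lab.local
reboot
```

Pair a file like this with a host profile or a distributed switch, and most of what makes the host “itself” is restored automatically after a rebuild.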

Ask yourself a few questions:

  • Is it really worth going through all that zoning, storage provisioning, fabric design, and complexity for every host you use?
  • How much time will you actually save, compared to reinstalling and configuring ESXi, when a host fails?
  • In a blade environment, is stateless computing something that you require?

It’s a lot of effort … is it worth it for you?

For other “bare metal” hosts, where you are installing Windows or Linux, there is a greater return on your investment, as the installation and configuration require a lot of time and effort. It’s harder to argue the same for ESXi.

Thoughts

Boot from SAN is something that should be evaluated from all angles, and is not by default a huge benefit to the typical vSphere environment. It adds a decent bit of complexity and time investment while offering narrowly defined returns. While there isn’t anything technically wrong with using boot from SAN in a vSphere environment, it’s important to understand that vSphere is already designed to be lightweight and easily replaced. Additionally, modern servers can easily boot from a pair of mirrored SD cards or SSDs, reducing reliance on spinning disk.

I also have a post up on the Ahead blog with an alternate perspective on boot from SAN that focuses on the positive aspects.