Exocompute – Granular File Recovery for AWS Resources

Rubrik’s Polaris SaaS platform is designed to protect and recover a wide range of resources in the cloud. For Amazon Web Services (AWS), this often means Elastic Compute Cloud (EC2) instances and Elastic Block Store (EBS) volumes. Recovery options range from entire resource replacement, to in-region or cross-region exports (clones), to file and folder recovery.

This post will look deeper into the technology behind file indexing and file recovery using a unique Rubrik framework called Exocompute that is now Generally Available (GA) for AWS!

Protecting Cloud Native Applications

Polaris uses AWS regional API endpoints to protect cloud native applications, calling actions such as CreateSnapshot or CreateImage. Those familiar with VMware’s vSphere Storage APIs for Data Protection, formerly known as the vSphere APIs for Data Protection (VADP), will find the concept similar: workload data is ingested and stored for later restoration by whatever service is using the API. The difference, however, is that these regional API endpoints are highly available, fault tolerant, secured, and independent from other AWS regions, aligning nicely with the AWS Well-Architected Framework.
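For readers who have not called these endpoints directly, here is a minimal sketch of what a CreateSnapshot request can look like through the boto3 SDK. This is not how Polaris is implemented. The helper name `snapshot_volume`, the tag key, and the example volume ID are assumptions made up for illustration. The client is passed in so the choice of region (and therefore regional endpoint) stays explicit.

```python
from typing import Any


def snapshot_volume(ec2_client: Any, volume_id: str, description: str) -> str:
    """Request an EBS snapshot and return the new snapshot's ID.

    `ec2_client` is expected to behave like a boto3 EC2 client, for example
    boto3.client("ec2", region_name="us-east-1"), which talks to that
    region's API endpoint.
    """
    response = ec2_client.create_snapshot(
        VolumeId=volume_id,
        Description=description,
        TagSpecifications=[
            {
                "ResourceType": "snapshot",
                # Hypothetical tag for illustration; not a tag Polaris is known to apply.
                "Tags": [{"Key": "example:managed-by", "Value": "backup-sketch"}],
            }
        ],
    )
    return response["SnapshotId"]
```

With real credentials, this would be called as `snapshot_volume(boto3.client("ec2", region_name="us-east-1"), "vol-0123456789abcdef0", "nightly")`, where the volume ID is a placeholder.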

New to Polaris for Cloud Native Applications? Check out my How to Protect and Recover AWS EC2 Instances and EBS Volumes with Rubrik video!

The resulting resource snapshot data is retained in the customer’s AWS accounts. Lifecycle management of this snapshot data is handled by Polaris. The end result is that all of the heavy lifting – expiration of old snapshots, replication of snapshots to target regions, and so forth – is handled automatically by a highly available service designed for this purpose.
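To make the expiration part of that lifecycle concrete, here is a small sketch of the kind of retention check such a service has to perform. It is an illustration only and does not reflect Polaris internals. The `Snapshot` type, the `expired_snapshots` function, and the idea of a single fixed retention window are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: str
    created_at: datetime


def expired_snapshots(
    snapshots: Iterable[Snapshot], retention: timedelta, now: datetime
) -> List[Snapshot]:
    """Return the snapshots older than the retention window, oldest first."""
    cutoff = now - retention
    stale = [s for s in snapshots if s.created_at < cutoff]
    return sorted(stale, key=lambda s: s.created_at)
```

Passing `now` in, rather than reading the clock inside the function, keeps the check deterministic and easy to test.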

Introducing Exocompute

Customers regularly have a requirement to provide file level recovery for protected AWS resources. This requirement is met by the Polaris Exocompute framework, designed to provide indexing and file recovery features to cloud native workloads. By using an ephemeral cluster of containers to attach, scan, read, and store index metadata, Polaris is able to offer granular file and folder recovery for a variety of workloads in AWS.

An overview of this design from a networking resource perspective is shown below:

In AWS, Exocompute uses an EKS Cluster whose worker nodes are launched through an Auto Scaling Group. These nodes process snapshots to acquire indexing metadata when required. Upon the completion of all indexing tasks, the EKS Cluster is terminated to save on resource costs. Because the cluster exists only while there is indexing work to do, Exocompute is truly built for the cloud.

All customer data remains in the customer’s AWS account(s). Polaris acts as the control plane and is provided only the filesystem index (metadata) to present future recovery operations to the customer.
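The split between data and metadata can be pictured with a short sketch. Assuming a snapshot's filesystem has been attached and mounted at some path, an indexer only needs to record facts about each file, not its contents. The function name `build_index` and the chosen fields are hypothetical, and the real Exocompute index format is not described in this post.

```python
import os
from typing import Dict, List


def build_index(mount_point: str) -> List[Dict[str, object]]:
    """Collect per-file metadata under mount_point without reading file contents."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(mount_point):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            info = os.lstat(full_path)  # lstat: describe symlinks, do not follow them
            entries.append(
                {
                    "path": os.path.relpath(full_path, mount_point),
                    "size_bytes": info.st_size,
                    "modified": info.st_mtime,
                }
            )
    return sorted(entries, key=lambda entry: entry["path"])
```

Only this list of paths, sizes, and timestamps would need to leave the account in such a design. The file contents stay on the mounted volume.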

Exocompute Deployment into AWS

Each AWS Region where Exocompute should be deployed requires a VPC and two subnet values. In the example below, the customer is using a dedicated VPC and two internal (private) subnets that will use an AWS NAT Gateway to reach the AWS Internet Gateway (IGW). I’ve also created an example HashiCorp Terraform plan here.

Note: If multi-AZ resiliency is required, the customer can configure a NAT Gateway for each private subnet with unique route tables.

The customer provides the above VPC and subnet information to Exocompute for each AWS region required. Below is an example of what this looks like for US East 1:

Once the VPC and subnet values are supplied, the customer is able to enable file recovery for any instances or volumes that require this solution in the configured AWS region(s). The default value is Disabled.
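The VPC and subnet values described in this section can be sanity-checked before they are supplied. The sketch below uses Python's standard `ipaddress` module to confirm that two subnets were given, that each sits inside the VPC range, and that they do not overlap. The function name and the IPv4 CIDR ranges in the usage note are assumptions for illustration, not values from the Polaris UI.

```python
import ipaddress
from typing import List, Sequence


def validate_exocompute_network(vpc_cidr: str, subnet_cidrs: Sequence[str]) -> List[str]:
    """Return a list of problems; an empty list means the layout looks sane."""
    problems = []
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]

    if len(subnets) != 2:
        problems.append(f"expected 2 subnets, got {len(subnets)}")

    # Every subnet must be carved out of the VPC's address range.
    for subnet in subnets:
        if not subnet.subnet_of(vpc):
            problems.append(f"{subnet} is outside VPC range {vpc}")

    # Subnets must not share addresses with each other.
    for i, first in enumerate(subnets):
        for second in subnets[i + 1:]:
            if first.overlaps(second):
                problems.append(f"{first} overlaps {second}")

    return problems
```

For example, `validate_exocompute_network("10.0.0.0/16", ["10.0.1.0/24", "10.0.2.0/24"])` returns an empty list.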

Index Metadata Creation

Polaris will periodically poll for eligible snapshots that require indexing in the configured AWS region(s). When eligible snapshots are found, Exocompute will temporarily create a new EKS Cluster using the configuration values from earlier. Each EKS Cluster is configured with a randomized suffix and requires zero customer effort to set up, maintain, or terminate.

Note: If an active Exocompute EKS Cluster already exists, Exocompute will use it to complete additional indexing jobs.
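The poll, create, and reuse behavior described above can be modeled in a few lines. This is a toy model to show the decision logic, not Exocompute's actual code. The `Region` and `Cluster` types, the `exocompute-` name prefix, and the 8-character hex suffix are all assumptions.

```python
import secrets
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Cluster:
    name: str


@dataclass
class Region:
    name: str
    active_cluster: Optional[Cluster] = None
    pending_snapshots: List[str] = field(default_factory=list)


def poll_region(region: Region) -> Optional[Cluster]:
    """One polling pass: return the cluster that will index pending snapshots."""
    if not region.pending_snapshots:
        return None  # nothing eligible, so no cluster is needed
    if region.active_cluster is None:
        # Hypothetical naming scheme; the real prefix and suffix format may differ.
        suffix = secrets.token_hex(4)
        region.active_cluster = Cluster(name=f"exocompute-{suffix}")
    # An existing cluster is reused for additional indexing jobs.
    return region.active_cluster


def finish_indexing(region: Region) -> None:
    """Mark all work as done and drop the cluster so it stops costing money."""
    region.pending_snapshots.clear()
    region.active_cluster = None
```

Calling `poll_region` twice while work is pending returns the same cluster object, which mirrors the reuse note above.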

For each AWS resource that has snapshots eligible for indexing, associated events will appear in the Event Log under the Index event type. The event status will remain at "Waiting for an Exocompute cluster to be ready" until Exocompute marks the EKS Cluster as ready. The waiting index job is then associated with the available EKS Cluster, and indexing work proceeds with details shown in the Event Log.

The snapshot data now has associated index metadata that can be used for global search and granular file and folder recovery.

Summary

Polaris Exocompute is an elegant solution for file indexing and file recovery for protected AWS resources. There is no heavy lifting or management required by customers. The service uses highly available and fault-tolerant API endpoints that provide a quick, efficient, and cost-effective process for customers to meet granular recovery design requirements. Check it out!

Next Steps

Please accept a crisp high five for reaching this point in the post!

If you’d like to learn more about Cloud Architecture, or other modern technology approaches, head over to the Guided Learning page.

If there’s anything I missed, please reach out to me on Twitter. Cheers! 🙂