Terraform Plans, Modules, and Remote State

Embracing infrastructure as code is a journey. It starts with learning how to work with a tool as a single user deploying a handful of resources. Once that is mastered, reusable code in a collaborative setting becomes the priority. One of my favorite infrastructure as code tools, as mentioned in The Home Lab Gets a Home post, is Terraform.

New to Terraform? Rebecca Fitzhugh just published a great post entitled Hello, Terraform.

In this post, I will walk you through the creation of a simple Terraform plan to deploy an Amazon Web Services (AWS) Simple Storage Service (S3) bucket. I then change the code structure to create a module and show you how the Terraform workflow is modified. Finally, I’ll describe how to set up a remote backend to store state for collaboration.

Simple Terraform Configuration

When learning a new technology, it is best to start simple. I began using Terraform with a single, simple plan to configure a handful of resources. This type of plan usually aligns to an infrastructure component or a desired use case, such as building a bucket in AWS S3 to store files and artifacts related to other projects.

I’ve explored various ways to store Terraform files. My preferred method is to create a folder for each component or use case that I want to control. Within that folder sits a source folder to store my Terraform files. I repeat this structure across all projects for consistency.

When invoked, Terraform will look in the source folder and merge all of the configuration files together. The purpose of splitting the Terraform configuration across multiple files is to make reusable chunks of code that are easier to work with, error check, and iterate upon.

Note: It is possible to store all of the Terraform configuration in a single main.tf file. Splitting it across several files simply makes it easier for humans to work with.

Below is how I think of the files:

  • What do I want to build? The main.tf file contains resources, such as aws_s3_bucket, that I am declaring for Terraform to configure. The file should be as stateless as possible, meaning no hard-coded information, secrets, or addresses.
  • What details will I need for building? The vars.tf file lists any variables I wish to declare, including optional default values, that will be used to customize my specific resources.
  • Who can build my resources? The provider.tf file contains information on the provider of my resources. In this case, the provider is AWS.
  • What was built? The terraform.tfstate file stores information on all resources that have been built by Terraform, including id values. This is the “critical receipt” from my provider listing everything I constructed using Terraform.
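As a rough illustration of how the first three files might look for the S3 bucket example, here is a minimal sketch. The variable names match the ones passed to the module later in this post. The resource label "bucket" and the description strings are my own assumptions for illustration, and the syntax assumes Terraform 0.12 or later. The comments mark which file each block would live in.

```hcl
# main.tf: what do I want to build?
resource "aws_s3_bucket" "bucket" {
  # No hard-coded name; the value comes from a variable
  bucket = var.aws_bucket_name
}

# vars.tf: what details will I need for building?
variable "aws_access_key" {
  type        = string
  description = "AWS access key ID used by the provider"
}

variable "aws_secret_key" {
  type        = string
  description = "AWS secret access key used by the provider"
}

variable "aws_region" {
  type        = string
  description = "AWS region in which to build the bucket"
}

variable "aws_bucket_name" {
  type        = string
  description = "Globally unique name of the S3 bucket"
}

# provider.tf: who can build my resources?
provider "aws" {
  access_key = var.aws_access_key
  secret_key = var.aws_secret_key
  region     = var.aws_region
}
```

None of the variables has a default value in this sketch, which is why Terraform has to obtain the values from somewhere else before it can build anything.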

Below is a visual representation of the relationships between these files:

At this point, Terraform does not know the values of the defined variables. If I run terraform apply, the console will interactively request values for the variables. This is fine for a quick test or demonstration, but is not a viable solution for automation.

Variable values can be saved into a terraform.tfvars file (not shown above) and placed in the same directory as the other files. Terraform automatically loads this file and assigns the values it finds to the matching variables declared in the vars.tf file. Information in the terraform.tfvars file should be considered sensitive and protected accordingly.
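A terraform.tfvars file for the prod bucket could look something like the sketch below. The variable names are the ones used elsewhere in this post. The two key values are obvious placeholders rather than real credentials, and the region is assumed to be the same one used for the other examples.

```hcl
# terraform.tfvars: one "name = value" line per declared variable
aws_access_key  = "1234567890" # placeholder, not a real key
aws_secret_key  = "ABCDEFGHIJ" # placeholder, not a real key
aws_region      = "us-east-1"
aws_bucket_name = "wahlnetwork-bucket-prod"
```

Because this file holds credentials, it is a good candidate for exclusion from version control.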

Here is a visual representation of the terraform.tfvars values being supplied to the variables declared in vars.tf:

Terraform uses all of the information from the main.tf, provider.tf, vars.tf, and terraform.tfvars files to build the declared resource(s) in AWS. In the example above, I have stated that the value for aws_bucket_name should be wahlnetwork-bucket-prod.

When I again invoke terraform apply, the console no longer asks for input. Instead, Terraform has used the value from the terraform.tfvars file on my behalf. I have highlighted the values below in blue:

I now have an S3 bucket constructed and maintained by Terraform. While this scenario is simple, it is not viable if I wish to scale the number of resources being deployed and managed. I would have to copy the code to a new folder, update the terraform.tfvars file with information about the new bucket, and create silos of code for each bucket.

This is where Terraform modules step in.

Terraform Modules

The code found in main.tf, vars.tf, and provider.tf is valuable and can be reused across multiple resources. In this scenario, I desire the creation of several different S3 buckets with unique names to meet my prod, test, and dev needs. Using a Terraform module allows for code reuse and is extremely flexible in nature, although it can be a little confusing at first look.

In order to make the previous code into a module, I have created a new environment folder that is a peer to the source folder. Inside the environment folder is a folder for each environment: prod, test, and dev.

Below is an example of the topology:

Within each environment folder is a single main.tf file and the terraform.tfstate file. In order to reference the source folder as a module, I used a module block in the main.tf file, as shown below for the dev environment. The access key and secret key values are sensitive in nature and should be handled in a secure manner or given to a secrets manager to maintain.

module "s3-bucket" {
  # Relative path from this environment folder to the shared module code
  source = "../../source"

  # Placeholder credentials; never commit real keys
  aws_access_key  = "1234567890"
  aws_secret_key  = "ABCDEFGHIJ"
  aws_region      = "us-east-1"
  aws_bucket_name = "wahlnetwork-bucket-dev"
}

Using the module block in the file tells Terraform to look in the source location, which is written as ../../source because the source folder is reached by going up two directories from the present working directory. Terraform will use the configuration found in the source folder to construct the dev bucket.

Below is an expanded visual showing the new relationship between the main.tf file (dev environment version) and the module being called:

Terraform will hand over the variable values to the module and the plan will execute as usual. Note the blue highlight where the dev version of the main.tf file has supplied a value for the aws_bucket_name variable.

Creating additional S3 buckets is now scalable. I have placed a main.tf file inside of the dev, test, and prod folders with information on each bucket. If more buckets are required in the future, more folders and main.tf files can be generated with ease.
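For example, the main.tf file in the test folder could be nearly identical to the dev version. This is a sketch only: the bucket name wahlnetwork-bucket-test is my assumption based on the naming pattern of the other buckets, and the credentials are placeholders.

```hcl
# environment/test/main.tf (hypothetical path following the layout above)
module "s3-bucket" {
  # Same shared module code as the dev environment
  source = "../../source"

  aws_access_key  = "1234567890" # placeholder, not a real key
  aws_secret_key  = "ABCDEFGHIJ" # placeholder, not a real key
  aws_region      = "us-east-1"
  aws_bucket_name = "wahlnetwork-bucket-test" # assumed name
}
```

Only the bucket name differs between environments, which is the point: the logic lives once in the source folder, and each environment supplies its own values.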

One challenge remains: storing state. Recall that terraform.tfstate is my “critical receipt” for objects created by the provider. Terraform needs this file to understand the current state of objects under management. The file is currently local and stored on my laptop.

The ideal solution is to use a remote storage location for the state file.

Remote Backend for State

Terraform can use a remote storage location, called a remote backend, for state. This has several advantages over a local state file: collaboration with peers, high availability, and version control to name a few. In this scenario, I’ve chosen to use an AWS S3 bucket to store remote state due to the high availability of AWS services and the 99.999999999% durability that S3 is designed to provide. Note that this durability figure is a design target rather than an SLA.

The topology now looks as shown below. I’ve simplified the diagram by not showing all of the environmental folders; however, they are still there.

Here is an explanation of what is going on with this structure:

  • The terraform.tfstate file is stored in (and retrieved from) an S3 bucket. When setting up a backend for a configuration that already has a state file, Terraform will prompt to migrate state data.
  • A DynamoDB table is provisioned to store a lock. The lock is active when someone has checked out the state file and is in the process of making changes to the Terraform configuration. I will not be going deeper into locking in this post.
  • An S3 backend credentials file is used to supply AWS with my user information so that I can write to the backend S3 bucket. Do not confuse this with the provider.tf file; they are different! See the “Create a Shared Credentials File” page from AWS for more details on how to generate credentials.

In order for Terraform to use S3 as a backend, I used Terraform to create a new S3 bucket named wahlnetwork-bucket-tfstate for storing Terraform state files. Wild, right? 🙂
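A configuration for that supporting infrastructure might resemble the sketch below. Only the bucket name comes from this post. The resource labels and the DynamoDB table name are hypothetical, and the separate aws_s3_bucket_versioning resource assumes version 4 or later of the AWS provider. The one hard requirement I am aware of for the lock table is a string partition key named LockID.

```hcl
# Bucket that will hold the Terraform state files
resource "aws_s3_bucket" "tfstate" {
  bucket = "wahlnetwork-bucket-tfstate"
}

# Keep prior versions of each state file so a bad write can be rolled back
resource "aws_s3_bucket_versioning" "tfstate" {
  bucket = aws_s3_bucket.tfstate.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Table used by the S3 backend to hold the state lock
resource "aws_dynamodb_table" "tfstate_lock" {
  name         = "wahlnetwork-tfstate-lock" # hypothetical table name
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID" # the backend expects this exact key name

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

The state for this small configuration has to live somewhere too. Keeping it local at first, then migrating it into the new bucket once the bucket exists, is one way to handle the chicken-and-egg problem.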

With this done, I have added the following code to my main.tf file for each environment. I saved the file and ran terraform init to set up my new backend.

terraform {
  backend "s3" {
    bucket = "wahlnetwork-bucket-tfstate"
    key    = "dev/terraform.tfstate"
    region = "us-east-1"
  }
}

The message when running terraform init is shown below:

Here is information on each key/value pair being supplied to the backend:

  • Bucket is the name of the S3 bucket.
  • Key is the path to where I store the state file. In this scenario, I’ve configured Terraform to store the terraform.tfstate file under the dev/ prefix, which the S3 console displays as a folder named dev.
  • Region is the region where I created the S3 bucket.
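The backend block shown earlier is the minimum needed. If you also want the state locking and named credentials described in the topology, a fuller version might look like this sketch. The table name and profile name are hypothetical, and the available arguments differ between Terraform releases (newer versions offer alternatives to dynamodb_table), so check the S3 backend documentation for the version you run.

```hcl
terraform {
  backend "s3" {
    bucket = "wahlnetwork-bucket-tfstate"
    key    = "dev/terraform.tfstate"
    region = "us-east-1"

    # Encrypt the state file at rest in S3
    encrypt = true

    # DynamoDB table holding the state lock (hypothetical name)
    dynamodb_table = "wahlnetwork-tfstate-lock"

    # Named profile from the AWS shared credentials file (assumed name)
    profile = "default"
  }
}
```

Backend blocks cannot reference variables, so each environment's main.tf carries its own literal key, such as test/terraform.tfstate for the test environment. After any change to the backend block, terraform init must be run again.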

My laptop will now retrieve state from S3 prior to running a Terraform plan and upload any changes to the state file after the plan completes.

Summary

In this post, I walked you through the creation of a simple Terraform plan to deploy an AWS S3 bucket. I then changed the code structure to create a module and showed you how the Terraform workflow is modified. Finally, I described how to set up a remote backend to store state for collaboration.

Next Steps

Please accept a crisp high five for reaching this point in the post!

If you’d like to learn more about Infrastructure as Code, or other modern technology approaches, head over to the Guided Learning page.