Reconfigure a Terraform Backend for Rotated AWS Access Keys

I use several different Amazon Web Services (AWS) regions for my lab environment resources. Each region has a region-scoped IAM account with programmatic access keys. This limits the attack surface to a single region should one of the access keys leak. HashiCorp Terraform uses these regional access keys to maintain a declarative state by way of numerous Terraform plans.

I rotate the regional access keys at least every 90 days and store the new keys in HashiCorp Vault. After the most recent rotation, a few Terraform plans produced the error shown below:

Initializing the backend...
Error: error using credentials to get account ID: error calling sts:GetCallerIdentity: InvalidClientTokenId: The security token included in the request is invalid.
        status code: 403, request id: 1234567890

This error indicates that the state file stored in Amazon Simple Storage Service (S3) could not be retrieved. Without valid credentials, Terraform fails gracefully because it cannot read and validate the state file before refreshing resource information. In short: the plan could not check the previous state of my cloud resources in order to see what had changed.

Troubleshooting the Terraform Backend

Terraform provides a simple mechanism for enabling logging: set the session’s TF_LOG environment variable to trace. In a PowerShell 7 console running on Windows 10, that means running the command below:

$env:TF_LOG='trace'
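For anyone who is not in a PowerShell session, the same idea can be sketched in Python. This is a minimal example under my own assumptions: the helper names trace_env and run_init_with_trace are hypothetical, and the second one expects the terraform binary to be on the PATH.

```python
import os
import subprocess


def trace_env(base=None):
    """Return a copy of an environment mapping with TF_LOG set to trace."""
    env = dict(os.environ if base is None else base)
    env["TF_LOG"] = "trace"  # same value as the PowerShell command above
    return env


def run_init_with_trace(workdir="."):
    """Run `terraform init` in workdir with trace logging; return the exit code."""
    result = subprocess.run(["terraform", "init"], cwd=workdir, env=trace_env())
    return result.returncode
```

Because trace_env works on a copy, the calling session's environment is left untouched and only the child terraform process sees the extra variable.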

Once enabled, I ran terraform init to see why the credentials were invalid. Note that <<AWS_ACCESS_KEY>> has been redacted.

---[ REQUEST POST-SIGN ]-----------------------------
POST / HTTP/1.1
Host: sts.amazonaws.com
User-Agent: aws-sdk-go/1.25.3 (go1.12.13; windows; amd64) APN/1.0 HashiCorp/1.0 Terraform/0.12.24
Content-Length: 43
Authorization: AWS4-HMAC-SHA256 Credential=<<AWS_ACCESS_KEY>>/20200501/us-east-1/sts/aws4_request, SignedHeaders=content-length;content-type;host;x-amz-date, Signature=x
Content-Type: application/x-www-form-urlencoded; charset=utf-8
X-Amz-Date: 20200501T143723Z
Accept-Encoding: gzip
Action=GetCallerIdentity&Version=2011-06-15
-----------------------------------------------------
2020/05/01 09:37:24 [DEBUG] [aws-sdk-go] DEBUG: Response sts/GetCallerIdentity Details:
---[ RESPONSE ]--------------------------------------
HTTP/1.1 403 Forbidden
Connection: close
Content-Length: 306
Content-Type: text/xml
Date: Fri, 01 May 2020 14:37:22 GMT
X-Amzn-Requestid: 1234567890
-----------------------------------------------------
2020/05/01 09:37:24 [DEBUG] [aws-sdk-go] <ErrorResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/">
  <Error>
    <Type>Sender</Type>
    <Code>InvalidClientTokenId</Code>
    <Message>The security token included in the request is invalid.</Message>
  </Error>
  <RequestId>1234567890</RequestId>
</ErrorResponse>
2020/05/01 09:37:24 [DEBUG] [aws-sdk-go] DEBUG: Validate Response sts/GetCallerIdentity failed, attempt 0/5, error InvalidClientTokenId: The security token included in the request is invalid.
        status code: 403, request id: 1234567890

For some reason, the credentials were declined by AWS, as revealed by the status code “403 Forbidden” coupled with “The security token included in the request is invalid.”
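The Authorization header in the trace is worth a closer look, because its Credential field starts with the access key ID that Terraform actually signed the request with. The sketch below pulls that field apart so the key ID can be compared against the one that was just rotated in. The function name parse_credential_scope and the sample key ID are hypothetical; the field layout (key ID, date, region, service, aws4_request) follows the header shown in the trace above.

```python
import re

# Matches the Credential field of an AWS Signature Version 4 Authorization header.
CREDENTIAL_RE = re.compile(
    r"Credential=(?P<key>[^/]+)/(?P<date>\d{8})/"
    r"(?P<region>[^/]+)/(?P<service>[^/]+)/aws4_request"
)


def parse_credential_scope(authorization):
    """Return the key ID, date, region, and service as a dict, or None if absent."""
    match = CREDENTIAL_RE.search(authorization)
    if match is None:
        return None
    return match.groupdict()


# Sample header value with a made-up access key ID.
header = (
    "AWS4-HMAC-SHA256 Credential=AKIAEXAMPLEOLDKEY/20200501/us-east-1/sts/"
    "aws4_request, SignedHeaders=content-length;content-type;host;x-amz-date, "
    "Signature=x"
)
scope = parse_credential_scope(header)
print(scope["key"], scope["region"], scope["service"])
```

If the key ID printed here differs from the one in the current credentials file, Terraform is reading its credentials from somewhere else.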

I manually tested the new access keys with awscli without any issues. My next idea was to look at my source code and see if state was being retained. I soon realized that the backend state was, in fact, being stored locally!

This particular Terraform plan was first run prior to setting up an S3 backend. For some reason, my local state file persisted alongside the Terraform backend block. Inside the state file were the old access keys. My next thought was to look for a method to clear those values.
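A quick way to confirm this kind of leftover is to scan the local file for credential fields. The sketch below is an illustration under assumptions: it expects the file at .terraform/terraform.tfstate with the backend settings under a backend.config object, which is my understanding of the Terraform 0.12 layout and may differ in other versions. The function names are hypothetical. It reports only which fields are populated, never their values.

```python
import json

# Backend settings that would hold cached credentials (assumed field names).
SENSITIVE_KEYS = ("access_key", "secret_key", "token")


def cached_backend_credentials(state):
    """List the credential fields that hold a value in the cached backend config."""
    backend = state.get("backend") or {}
    config = backend.get("config") or {}
    return [key for key in SENSITIVE_KEYS if config.get(key)]


def load_state(path=".terraform/terraform.tfstate"):
    """Load the local state file as a dict (path is an assumed default)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Calling cached_backend_credentials(load_state()) from the plan's directory returns an empty list when nothing is cached, and a list such as access_key and secret_key when old values are still present.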

Terraform Init with Reconfigure

It turns out that -reconfigure is the option that cleans up my backend configuration. It will “reconfigure the backend, ignoring any saved configuration.”

I ran terraform init -reconfigure and noticed the local state file change in git: the serial, access_key, and secret_key values were all modified.

With the access keys cleared from the local state file, Terraform once more looked to my .aws credentials to gather the current (and valid) access keys. Success!
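To double-check which key Terraform will now pick up, the shared credentials file can be read directly. This is a rough sketch, assuming the standard INI layout of the AWS credentials file with an aws_access_key_id entry per profile; the function name access_key_id and the sample values are hypothetical.

```python
import configparser


def access_key_id(credentials_text, profile="default"):
    """Return the access key ID for a profile, or None if the profile is missing."""
    parser = configparser.ConfigParser()
    parser.read_string(credentials_text)
    if not parser.has_section(profile):
        return None
    return parser.get(profile, "aws_access_key_id", fallback=None)


# Made-up file contents standing in for the real credentials file.
sample = """\
[default]
aws_access_key_id = AKIAEXAMPLENEWKEY
aws_secret_access_key = not-a-real-secret
"""
print(access_key_id(sample))
```

In practice the text would come from the credentials file under the .aws folder in the user's home directory, and the printed key ID should match the one seen in a fresh trace log.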

Next Steps

Please accept a crisp high five for reaching this point in the post!

If you’d like to learn more about Infrastructure as Code, or other modern technology approaches, head over to the Guided Learning page.