Testing Infrastructure as Code in 2026: from Terraform validation to Terratest integration

IaC testing is no longer optional. Between configuration drift, resource ordering, and state file corruption, production infrastructure demands the same rigor as application code.

3/13/2026 · 6 min read · Dev tools

Last updated: 3/13/2026

Introduction: IaC without testing is technical debt

The industry has largely accepted that Infrastructure as Code (IaC) is the standard for cloud infrastructure. Teams write Terraform configurations, store them in Git, and treat them like code. But there is a glaring omission: most organizations apply minimal testing discipline to their IaC, despite the production impact being significantly higher than a buggy application feature.

When a bug in application code causes a 500 error, users see an error page and developers investigate. When a bug in Terraform code causes a production incident, the entire platform can become unavailable — and rollback might require manual intervention in the AWS console because the state file itself is corrupted.

In 2026, treating IaC as code without applying the same testing rigor is indefensible. This post covers a practical IaC testing strategy for teams using Terraform, from simple validation scripts to full Terratest integration.

The IaC testing pyramid

Just like application testing, IaC testing should follow a pyramid with distinct layers:

| Testing level | Purpose | Tools | Frequency |
| --- | --- | --- | --- |
| Static analysis | Catch syntax errors, security issues, and best practice violations | terraform fmt, tflint, tfsec, checkov | Every commit (pre-commit) |
| Unit testing | Validate that Terraform configuration produces the expected state | terraform plan -out, terraform validate | Every commit |
| Integration testing | Verify that infrastructure provisions correctly in a sandbox environment | terratest, kitchen-terraform, terraform-aws | Every merge to main |
| End-to-end testing | Validate that the provisioned infrastructure works for the intended use case | terratest + application deployment | Every release |

The goal is to catch issues as early as possible. A typo caught by tflint costs nothing to fix. A resource dependency error caught during terraform plan costs a few minutes. A resource creation error caught in a staging environment costs an hour. A production outage costs your company revenue and reputation.

Static analysis: the first line of defense

Static analysis tools run in seconds and catch the most common IaC errors before they reach a pull request.

Terraform fmt and validate

These are the basics. Every repository should enforce them via pre-commit hooks:

#!/bin/bash
# pre-commit hook for IaC validation
set -euo pipefail

# Check formatting without rewriting files
if ! terraform fmt -recursive -check; then
  echo "Terraform files are not formatted. Run 'terraform fmt -recursive' to fix."
  exit 1
fi

# Validate every directory that contains .tf files; -backend=false avoids touching remote state
find . -name "*.tf" -not -path "*/.terraform/*" -exec dirname {} \; | sort -u | while read -r dir; do
  terraform -chdir="$dir" init -backend=false -input=false > /dev/null
  terraform -chdir="$dir" validate
done

TFLint

TFLint is a Terraform linter that finds errors that terraform validate misses:

# Example error caught by TFLint (requires the AWS ruleset plugin)
variable "instance_type" {
  default = "invalid_type" # Not a real instance type: fails at apply time, but TFLint flags it now
}

resource "aws_instance" "example" {
  instance_type = var.instance_type
  # ami and other arguments omitted for brevity
}

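TFLint's rules cover invalid argument values such as nonexistent instance types, deprecated syntax, and unused declarations. Provider-specific checks, like verifying whether AMI IDs match the configured region, come from ruleset plugins (for AWS, tflint-ruleset-aws) that must be enabled in .tflint.hcl.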

Security scanning: tfsec and Checkov

Security scanners catch misconfigurations before they reach production:

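tfsec is a Go-based security scanner that focuses on AWS, Azure, and GCP best practices. Its maintainer, Aqua Security, has folded its checks into Trivy, so new projects may prefer trivy config; the tfsec CLI is shown here because many existing pipelines still use it: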

tfsec ./terraform/

It catches issues like unencrypted S3 buckets, overly permissive security groups, and missing logging configurations.

Checkov is a more comprehensive scanner that supports multiple IaC formats (Terraform, CloudFormation, Kubernetes manifests) and includes policy-as-code capabilities:

checkov -d ./terraform/ --framework terraform

Both tools can be integrated into pre-commit hooks and CI/CD pipelines, automatically failing the pipeline if critical security issues are detected.
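As a rough illustration of "fail the pipeline on critical issues", the sketch below gates on a Checkov JSON report. It assumes the report was produced with `checkov -o json` and contains a `results.failed_checks` list whose entries carry `check_id` and `resource` fields. The `ALLOWED` set, the function name, and the sample check IDs are hypothetical.

```python
import json

# Hypothetical allowlist of check IDs the team has explicitly accepted.
ALLOWED = {"CKV_AWS_18"}


def blocking_failures(report, allowed=ALLOWED):
    """Return (check_id, resource) pairs for failed checks not in the allowlist.

    Accepts a single report dict or a list of them (one per framework).
    """
    reports = report if isinstance(report, list) else [report]
    failures = []
    for r in reports:
        for check in r.get("results", {}).get("failed_checks", []):
            if check.get("check_id") not in allowed:
                failures.append((check.get("check_id"), check.get("resource")))
    return failures


# Inline sample standing in for a real report file.
sample = json.loads("""
{"check_type": "terraform",
 "results": {"failed_checks": [
   {"check_id": "CKV_AWS_18", "resource": "aws_s3_bucket.logs"},
   {"check_id": "CKV_AWS_24", "resource": "aws_security_group.web"}]}}
""")

for check_id, resource in blocking_failures(sample):
    print(f"BLOCKING {check_id}: {resource}")
```

In CI, a wrapper would exit non-zero whenever the returned list is non-empty. Keeping the allowlist in version control makes every accepted exception reviewable.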

Unit testing: validating Terraform output

Terraform's terraform plan command is effectively a unit test for your infrastructure. It shows what will change without actually making changes.

The "plan as test" pattern

A robust workflow publishes the plan output as part of the PR review process:

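#!/bin/bash
# CI job: generate plan output
set -euo pipefail

terraform init -input=false
terraform plan -input=false -out=tfplan
# Convert the binary plan to JSON for review and automated checks
terraform show -json tfplan > tfplan.json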

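The tfplan.json file is uploaded as a CI artifact so reviewers can see exactly which resources will be created, modified, or destroyed. This helps prevent "oops, I deleted the production database" incidents. The JSON plan can include sensitive values in plain text, so restrict who can download the artifact.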
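A raw JSON plan is hard to review by eye, so a short summary can help. The sketch below counts planned changes by action and lists the addresses that would be destroyed. It relies on the `resource_changes[].change.actions` and `address` fields of Terraform's JSON plan format; the function name and the sample plan are made up for illustration.

```python
from collections import Counter


def summarize_plan(plan):
    """Count planned changes by action and list addresses that would be destroyed."""
    counts = Counter()
    destroyed = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        if actions in (["no-op"], ["read"]):
            continue  # nothing changes for these entries
        if "delete" in actions and "create" in actions:
            kind = "replace"  # destroy-and-recreate, in either order
        else:
            kind = actions[0] if actions else "unknown"
        counts[kind] += 1
        if "delete" in actions:
            destroyed.append(rc.get("address"))
    return dict(counts), destroyed


# Made-up plan fragment with one entry per kind of change.
sample_plan = {"resource_changes": [
    {"address": "aws_instance.web", "change": {"actions": ["create"]}},
    {"address": "aws_security_group.web", "change": {"actions": ["update"]}},
    {"address": "aws_db_instance.main", "change": {"actions": ["delete", "create"]}},
    {"address": "aws_vpc.main", "change": {"actions": ["no-op"]}},
    {"address": "aws_instance.old", "change": {"actions": ["delete"]}},
]}

counts, destroyed = summarize_plan(sample_plan)
print(counts)
print("Destroyed:", destroyed)
```

Posting this summary as a PR comment puts destructive changes in front of reviewers without requiring them to download the artifact.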

Expected output validation

For critical infrastructure, you can validate that the plan output matches expectations using Python or Go scripts:

# validate_plan.py
import json
import sys

with open('tfplan.json') as f:
    plan = json.load(f)

changes = plan.get('resource_changes', [])


def count_action(action):
    """Count resources whose planned actions include the given action."""
    return sum(1 for r in changes if action in r.get('change', {}).get('actions', []))


# Ensure no resources are destroyed. Replacements (['delete', 'create'] or
# ['create', 'delete']) also destroy the existing resource, so they count too.
destroy_count = count_action('delete')
if destroy_count > 0:
    print(f"Error: Plan would destroy {destroy_count} resources")
    sys.exit(1)

# Ensure at least one resource is created
create_count = count_action('create')
if create_count == 0:
    print("Error: Plan does not create any resources")
    sys.exit(1)

print("Plan validation passed")
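Blocking every deletion is often too strict for day-to-day work. A narrower variant, sketched below, blocks only deletions of resource types the team treats as stateful. The `PROTECTED_TYPES` set and the function name are hypothetical; the `type`, `address`, and `change.actions` fields come from Terraform's JSON plan format.

```python
# Hypothetical set of resource types that must never be destroyed by a routine PR.
PROTECTED_TYPES = {"aws_db_instance", "aws_s3_bucket"}


def protected_deletions(plan, protected=PROTECTED_TYPES):
    """Return addresses of protected resources the plan would delete or replace."""
    hits = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        if "delete" in actions and rc.get("type") in protected:
            hits.append(rc.get("address"))
    return hits


# Made-up plan: the database is replaced, a throwaway instance is deleted.
example_plan = {"resource_changes": [
    {"address": "aws_db_instance.main", "type": "aws_db_instance",
     "change": {"actions": ["delete", "create"]}},
    {"address": "aws_instance.scratch", "type": "aws_instance",
     "change": {"actions": ["delete"]}},
]}

print(protected_deletions(example_plan))
```

Only the database is reported here. The instance deletion passes because `aws_instance` is not in the protected set.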

Integration testing with Terratest

Terratest is a Go library for testing infrastructure code. It provides utilities to deploy real infrastructure in a test environment, run tests against it, and then destroy the infrastructure.

Why Terratest instead of just running Terraform?

Running terraform apply in a staging environment is better than nothing, but it doesn't validate that the infrastructure actually works. Terratest allows you to:

  1. Deploy infrastructure to a test environment
  2. Wait for resources to be ready
  3. Run actual tests against the deployed infrastructure
  4. Tear down everything automatically
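The four steps above boil down to an "apply, check, always destroy" pattern. The sketch below shows that pattern in Python rather than with Terratest itself. The `run` callable is an assumption: in real use it would be a thin wrapper around `subprocess.run(cmd, check=True)`, while the demo uses a fake that only records calls.

```python
def run_lifecycle(run, checks, workdir="../terraform"):
    """Init, apply, run the checks, and destroy even if something fails."""
    base = ["terraform", f"-chdir={workdir}"]
    run(base + ["init", "-input=false"])
    try:
        run(base + ["apply", "-auto-approve", "-input=false"])
        for check in checks:
            check()  # each check raises on failure
    finally:
        # Runs whether apply or a check raised, like a deferred destroy.
        run(base + ["destroy", "-auto-approve", "-input=false"])


# Demo with a fake runner that records the Terraform subcommand.
calls = []


def fake_run(cmd):
    calls.append(cmd[2])


def failing_check():
    raise AssertionError("instance not reachable")


try:
    run_lifecycle(fake_run, [failing_check])
except AssertionError:
    pass

print(calls)  # ['init', 'apply', 'destroy']
```

Destroy still runs after the failing check, which is the property that keeps test accounts clean.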

Terratest example: testing an EC2 instance

package test

import (
  "testing"

  "github.com/gruntwork-io/terratest/modules/aws"
  "github.com/gruntwork-io/terratest/modules/terraform"
  "github.com/stretchr/testify/assert"
)

func TestEC2Instance(t *testing.T) {
  t.Parallel()

  // Region the module deploys to; must match the provider configuration
  awsRegion := "us-east-1"

  // Configure Terraform options
  terraformOptions := &terraform.Options{
    TerraformDir: "../terraform",

    Vars: map[string]interface{}{
      "instance_type": "t3.micro",
      "environment":   "test",
    },
  }

  // At the end of the test, run `terraform destroy`
  defer terraform.Destroy(t, terraformOptions)

  // Run `terraform init` and `terraform apply`
  terraform.InitAndApply(t, terraformOptions)

  // Get the instance ID from Terraform outputs
  instanceID := terraform.Output(t, terraformOptions, "instance_id")

  // Verify the instance is running. Assumes the module also exposes the
  // aws_instance `instance_state` attribute as an output named "instance_state".
  instanceState := terraform.Output(t, terraformOptions, "instance_state")
  assert.Equal(t, "running", instanceState)

  // Verify the instance has the expected tags (region comes before the instance ID)
  tags := aws.GetTagsForEc2Instance(t, awsRegion, instanceID)
  assert.Equal(t, "test", tags["Environment"])
}

This test provisions a real EC2 instance in AWS (terraform apply returns once the instance is running), then validates its state and tags. Because terraform.Destroy is deferred, the resources are destroyed whether the assertions pass or fail.

Terratest example: testing a Kubernetes deployment

package test

import (
  "fmt"
  "testing"
  "time"

  http_helper "github.com/gruntwork-io/terratest/modules/http-helper"
  "github.com/gruntwork-io/terratest/modules/terraform"
)

func TestKubernetesDeployment(t *testing.T) {
  t.Parallel()

  terraformOptions := &terraform.Options{
    TerraformDir: "../terraform/kubernetes",
    Vars: map[string]interface{}{
      "namespace": "test",
    },
  }
  defer terraform.Destroy(t, terraformOptions)
  terraform.InitAndApply(t, terraformOptions)

  // Get the application endpoint from Terraform outputs
  endpoint := terraform.Output(t, terraformOptions, "endpoint")

  // Verify the application is responding; retry while it starts up
  retries := 10
  sleepBetweenRetries := 30 * time.Second

  // Passes only when the response is HTTP 200 with the exact body "OK"
  url := fmt.Sprintf("http://%s/health", endpoint)
  http_helper.HttpGetWithRetry(t, url, nil, 200, "OK", retries, sleepBetweenRetries)
}

This test provisions Kubernetes resources and validates that the application is actually responding to HTTP requests. This catches issues like incorrect service configurations, missing ingress rules, or application startup failures.
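The retry loop is what makes this kind of check reliable, because a freshly provisioned endpoint usually refuses connections for a while. The sketch below shows the same idea in plain Python. It is not Terratest's implementation: `fetch` is a hypothetical callable returning a `(status, body)` tuple, and `sleep` is injectable so the demo runs instantly.

```python
import time


def get_with_retry(fetch, expected_status=200, expected_body="OK",
                   retries=10, sleep_seconds=30, sleep=time.sleep):
    """Call fetch() until it returns the expected (status, body) or retries run out.

    Returns the number of attempts used; raises TimeoutError when exhausted.
    """
    last = None
    for attempt in range(1, retries + 1):
        try:
            last = fetch()
            if last == (expected_status, expected_body):
                return attempt
        except OSError as err:  # connection refused, DNS not ready, timeouts
            last = err
        if attempt < retries:
            sleep(sleep_seconds)
    raise TimeoutError(f"not healthy after {retries} attempts; last result: {last!r}")


# Demo: the endpoint refuses, then reports 503, then becomes healthy.
responses = iter([ConnectionRefusedError("refused"), (503, "starting"), (200, "OK")])


def fake_fetch():
    item = next(responses)
    if isinstance(item, Exception):
        raise item
    return item


attempts = get_with_retry(fake_fetch, sleep=lambda seconds: None)
print(attempts)  # 3
```

Treating connection errors as "not ready yet" rather than as failures is the key design choice. Only exhausting the retry budget fails the test.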

Pre-commit hooks: catching errors locally

The most effective IaC testing strategy catches errors before they ever leave a developer's machine. Pre-commit hooks run automatically on git commit:

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/antonbabenko/pre-commit-terraform
    rev: v1.88.4
    hooks:
      - id: terraform_fmt
      - id: terraform_validate
      - id: terraform_tflint
      - id: terraform_tfsec
      - id: terraform_checkov
        # pre-commit-terraform forwards tool flags through --args=
        args: ['--args=--framework terraform', '--args=--compact', '--args=--quiet']
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: check-merge-conflict
      - id: trailing-whitespace
      - id: end-of-file-fixer

Installing these hooks is straightforward:

pip install pre-commit
pre-commit install

Now every commit automatically runs Terraform formatting, validation, linting, and security scanning. The commit fails if any issues are found, so developers fix them before pushing. Hooks can be bypassed with git commit --no-verify, which is why the same checks must also run in CI.

CI/CD integration: automated testing at scale

Once code passes local validation, the CI/CD pipeline should run comprehensive tests:

GitHub Actions example

name: IaC Tests

on: [pull_request]

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
      - name: Terraform Format Check
        run: terraform fmt -recursive -check
      - name: Terraform Init
        # Validation needs no backend access or cloud credentials
        run: terraform init -backend=false
      - name: Terraform Validate
        run: terraform validate
      - name: Setup TFLint
        uses: terraform-linters/setup-tflint@v4
      - name: Run TFLint
        run: tflint --recursive
      - name: Run tfsec
        uses: aquasecurity/tfsec-action@v1.0.0
        with:
          working_directory: ./terraform

  plan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          # The wrapper script can interfere with redirected JSON output
          terraform_wrapper: false
      - name: Terraform Plan
        # Requires backend and provider credentials (not shown)
        run: |
          terraform init -input=false
          terraform plan -input=false -out=tfplan
          terraform show -json tfplan > tfplan.json
      - name: Save Plan Output
        uses: actions/upload-artifact@v4
        with:
          name: tfplan
          path: tfplan.json

Testing in ephemeral environments

For comprehensive integration testing, provision ephemeral environments that are created for each PR and destroyed after the tests complete:

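  integration-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.21'
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          # Terratest parses raw Terraform output, so disable the wrapper script
          terraform_wrapper: false
      - name: Run Terratest
        # Requires credentials for a sandbox AWS account (not shown)
        run: go test -v -timeout 30m ./test/
        env:
          AWS_REGION: us-east-1
          TF_VAR_environment: ci-${{ github.event.pull_request.number }}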

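The TF_VAR_environment variable gives each PR a unique environment name, which prevents conflicts between concurrent test runs. This only works if the test does not set the same variable itself: a value passed through Terratest's Vars map becomes a -var flag, and -var flags take precedence over TF_VAR_ environment variables.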
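Environment names often end up inside resource names, which tend to have character and length limits. The sketch below builds a safe per-PR name. The function name, the `ci` prefix default, and the 24-character limit are assumptions for illustration, not rules from any provider.

```python
import re


def ephemeral_env_name(pr_ref, prefix="ci", max_len=24):
    """Build a lowercase, hyphen-only environment name for one pull request."""
    raw = f"{prefix}-{pr_ref}".lower()
    # Replace anything outside [a-z0-9-] with a hyphen, then trim stray hyphens.
    cleaned = re.sub(r"[^a-z0-9-]+", "-", raw).strip("-")
    return cleaned[:max_len].rstrip("-")


print(ephemeral_env_name(482))                # ci-482
print(ephemeral_env_name("Feature/Add_VPC"))  # ci-feature-add-vpc
```

Deriving the name from the PR number rather than a random suffix makes leftover resources easy to trace back to the pull request that created them.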

Common pitfalls and how to avoid them

Pitfall 1: Testing infrastructure in production

Running IaC tests against production infrastructure is risky and unnecessary. Tests should use dedicated test environments or sandbox accounts.

Pitfall 2: Ignoring test cleanup

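Integration tests that don't clean up resources create cloud cost bloat. Terratest's defer terraform.Destroy() pattern runs cleanup even if an assertion fails, but not if the test process is killed, for example when go test hits its -timeout. Tag test resources and sweep the sandbox account on a schedule to catch leftovers.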
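Deferred cleanup can still miss resources when a CI job is cancelled, so a periodic sweep is a useful backstop. The sketch below shows only the selection logic. The inventory shape (`id`, `tags`, `created_at`) and the two-hour threshold are hypothetical; in practice the list would come from the cloud provider's API.

```python
from datetime import datetime, timedelta, timezone


def stale_test_resources(resources, now, max_age=timedelta(hours=2)):
    """Return IDs of test-tagged resources that have outlived max_age."""
    stale = []
    for res in resources:
        env = res.get("tags", {}).get("Environment", "")
        is_test = env == "test" or env.startswith("ci-")
        if is_test and now - res["created_at"] > max_age:
            stale.append(res["id"])
    return stale


# Made-up inventory: one leaked CI instance, one fresh one, one production host.
now = datetime(2026, 3, 13, 12, 0, tzinfo=timezone.utc)
inventory = [
    {"id": "i-0aaa", "tags": {"Environment": "ci-482"}, "created_at": now - timedelta(hours=5)},
    {"id": "i-0bbb", "tags": {"Environment": "ci-483"}, "created_at": now - timedelta(minutes=20)},
    {"id": "i-0ccc", "tags": {"Environment": "prod"}, "created_at": now - timedelta(days=30)},
]

print(stale_test_resources(inventory, now))  # ['i-0aaa']
```

Selecting on the Environment tag means the sweep can never touch production, as long as tests always tag what they create.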

Pitfall 3: Over-reliance on static analysis

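Static analysis tools inspect configuration text, not running infrastructure. They cannot catch errors that only surface at plan or apply time, such as incorrect resource dependencies or variable values that are syntactically valid but wrong for the target environment. They should be one layer of a comprehensive testing strategy.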

Pitfall 4: Testing only happy paths

Tests should also validate failure scenarios. For example, test that a security group correctly rejects unauthorized traffic, not just that it accepts authorized traffic.
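A negative test for a security group can be as simple as a TCP connection attempt that is expected to fail. The helper below is a minimal sketch using only the standard library. The usage lines in the comments assume a hypothetical `public_ip` value taken from a Terraform output.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


# Hypothetical usage inside an integration test:
#   assert port_is_open(public_ip, 443)      # authorized traffic is accepted
#   assert not port_is_open(public_ip, 22)   # SSH from the CI runner is rejected
```

A dropped packet shows up as a timeout rather than a refusal, so keep the timeout short enough that negative checks do not dominate test runtime.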

Conclusion: IaC testing as operational discipline

In 2026, treating IaC with the same rigor as application code is not optional; it is operational hygiene. A comprehensive IaC testing strategy includes:

  1. Static analysis via pre-commit hooks for immediate feedback
  2. Plan validation as part of every PR review process
  3. Integration testing with Terratest for comprehensive validation
  4. Ephemeral environments for safe testing without production impact

The investment in IaC testing pays dividends in reduced incident rates, faster deployments, and increased confidence in infrastructure changes. When infrastructure changes are as safe and predictable as application deployments, engineering teams can move faster without sacrificing reliability.

