Terraform Part 14 — Testing and Policy

Terraform Series (14/15)
Why Infrastructure Needs Testing Too

We take testing application code for granted. So why don’t we test infrastructure code? It’s because the “just deploy it and see” mindset persists. But Terraform code has bugs, security gaps, and policy violations too.

These issues should be caught before they’re applied. Discovering them after deployment is already too late.

flowchart LR
    Code["HCL Code"] --> L1["Stage 1:\nfmt / validate"]
    L1 --> L2["Stage 2:\nStatic security analysis\n(Checkov, tfsec)"]
    L2 --> L3["Stage 3:\nPolicy validation\n(OPA, Sentinel)"]
    L3 --> L4["Stage 4:\nIntegration tests\n(Terratest)"]
    L4 --> Apply["apply stage"]

These stages differ in cost and speed. Stages further left are faster and cheaper but narrower in scope; stages further right are slower and more expensive but verify actual behavior.

First Things First — fmt and validate

The fastest checks Terraform provides out of the box. Run them always, both locally and in CI.

terraform fmt — Code style formatting

# Auto-format all files
terraform fmt -recursive

# In CI, only "check if there are unformatted files"
terraform fmt -check -recursive

This keeps the entire team on the same style. It normalizes indentation, spacing, and the alignment of equals signs; it does not reorder attributes or blocks. -check returns a non-zero exit code if any files aren’t formatted, failing CI.

terraform validate — Syntax validation

terraform validate

Checks whether HCL syntax is correct, whether referenced variables and resources exist, and whether types match. Must be run after terraform init so the provider schema is available for validation.

# When the configuration is valid
Success! The configuration is valid.

# When a reference is wrong
Error: Reference to undeclared input variable

  on main.tf line 5, in resource "aws_instance" "app":
   5:   instance_type = var.size

It doesn’t catch everything. A value that is well-formed but wrong (a misspelled instance type inside a string, for example) and logic errors slip through. But it’s the fastest and easiest first gate.

Static Security Analysis — Checkov and tfsec

Tools that parse HCL and check “whether this code violates security rules.” They can run without applying and without state, making them great for continuous use during development.

Checkov

Made by Bridgecrew (now part of Palo Alto Networks). Supports not just Terraform but also CloudFormation, Kubernetes, Dockerfile, and various other formats.

# Install
pip install checkov

# Run
checkov -d .
checkov -d envs/prod --framework terraform

Results look like this.

Check: CKV_AWS_24: "Ensure no security groups allow ingress from 0.0.0.0:0 to port 22"
	FAILED for resource: aws_security_group.web
	File: /envs/prod/main.tf:45-60

		45 | resource "aws_security_group" "web" {
		...
		55 |   ingress {
		56 |     from_port   = 22
		57 |     to_port     = 22
		58 |     protocol    = "tcp"
		59 |     cidr_blocks = ["0.0.0.0/0"]
		60 |   }

		Guide: https://docs.bridgecrew.io/docs/networking_1

It tells you which rule was violated, which file and line number, and even provides a guide link.

When you need to skip a specific rule, leave a comment in the file.

resource "aws_security_group" "internal" {
  # The skip comment goes inside the block it applies to.
  # checkov:skip=CKV_AWS_24:Internal network, allowed
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/8"]
  }
}

Always leave a reason in the skip comment — that’s the principle. Overusing skip is the same as “checking nothing.”
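The "always leave a reason" rule can itself be checked mechanically. The following is a minimal sketch, not part of Checkov: a small Go program that scans Terraform source for checkov:skip comments with no reason after the check ID. The names skipsWithoutReason and Finding are hypothetical, and the regular expression assumes the comment format checkov:skip=<check_id>:<reason> shown above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// skipPattern matches an inline suppression of the form
//   checkov:skip=<check_id>:<reason>
// where the ":<reason>" part may be absent. That absence is what we audit.
var skipPattern = regexp.MustCompile(`checkov:skip=([A-Za-z0-9_]+)(?::(.*))?`)

// Finding describes one skip comment that has no reason attached.
type Finding struct {
	Line    int // 1-based line number in the scanned text
	CheckID string
}

// skipsWithoutReason returns every skip comment in src whose reason is
// missing or consists only of whitespace.
func skipsWithoutReason(src string) []Finding {
	var findings []Finding
	for i, line := range strings.Split(src, "\n") {
		m := skipPattern.FindStringSubmatch(line)
		if m == nil {
			continue // no skip comment on this line
		}
		if strings.TrimSpace(m[2]) == "" {
			findings = append(findings, Finding{Line: i + 1, CheckID: m[1]})
		}
	}
	return findings
}

func main() {
	// Illustrative input: the second skip has no reason.
	src := `resource "aws_security_group" "internal" {
  # checkov:skip=CKV_AWS_24:Internal network, allowed
  # checkov:skip=CKV_AWS_23
}`
	for _, f := range skipsWithoutReason(src) {
		fmt.Printf("line %d: %s skipped without a reason\n", f.Line, f.CheckID)
	}
}
```

Run over the repository in CI, a check like this turns the convention into something a reviewer no longer has to remember.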

tfsec

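Made by Aqua Security. Being Terraform-specific makes it a bit lighter. Aqua has since folded tfsec’s checks into Trivy and points new users there, though tfsec still runs as shown here.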

# Install (macOS)
brew install tfsec

# Run
tfsec .

Result #1 CRITICAL Security group rule allows ingress from public internet.
──────────────────────────────────────────
  envs/prod/main.tf:55
──────────────────────────────────────────
   53    ingress {
   54      from_port   = 22
   55  →   cidr_blocks = ["0.0.0.0/0"]
   56      protocol    = "tcp"
   57      to_port     = 22
──────────────────────────────────────────
  ID         AVD-AWS-0107
  Impact     Your port is exposed to the whole internet
  Resolution Set a more restrictive cidr range

Checkov and tfsec overlap considerably. Some teams use both, some use just one. Start light with one and add more if needed. Personally, tfsec feels cleaner in output since it’s Terraform-specialized.

Policy Validation — OPA

Checkov and tfsec check predefined rules. Organization-specific policies (e.g., “all resources must have an Owner tag”) need to be expressed as custom policies. This is where OPA (Open Policy Agent) comes in.

OPA is a general-purpose policy engine that uses a language called Rego. It can be used with Kubernetes, Envoy, Terraform, and more. For Terraform, it’s commonly used through a wrapper called conftest.

First, get the JSON from terraform plan.

terraform plan -out=tfplan
terraform show -json tfplan > plan.json

Write the policy in Rego.

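# policies/tags.rego
# Written with `import rego.v1`, which needs OPA 0.59 or later; older
# conftest releases need the legacy `deny[msg] { ... }` form instead.
package main

import rego.v1

required_tags := {"Owner", "Environment", "Service"}

deny contains msg if {
    some resource in input.resource_changes
    resource.type == "aws_instance"

    # Skip pure deletions: their "after" state is null, so every tag would look missing.
    resource.change.actions != ["delete"]

    some tag in required_tags
    not resource.change.after.tags[tag]

    msg := sprintf("Instance %v is missing required tag %v", [resource.address, tag])
}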

This policy means “all EC2 instances must have Owner, Environment, and Service tags.” Every message the deny rule produces is reported as a policy violation.

# Check
conftest test --policy policies/ plan.json

FAIL - plan.json - Instance aws_instance.app is missing required tag Environment
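To make explicit what the tag rule reads from the plan, here is the same check written as plain Go. This is not how OPA or conftest evaluate policies; it is only a sketch of the logic, assuming the plan JSON layout used above (resource_changes[].change.after.tags). The function name missingTagViolations and the sample plan are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// plan mirrors only the fields of `terraform show -json` output that the
// tag rule reads; every other field in the document is ignored.
type plan struct {
	ResourceChanges []struct {
		Address string `json:"address"`
		Type    string `json:"type"`
		Change  struct {
			After struct {
				Tags map[string]string `json:"tags"`
			} `json:"after"`
		} `json:"change"`
	} `json:"resource_changes"`
}

// requiredTags lists the tags every EC2 instance must carry.
var requiredTags = []string{"Owner", "Environment", "Service"}

// missingTagViolations returns one message per (instance, missing tag) pair.
func missingTagViolations(planJSON []byte) ([]string, error) {
	var p plan
	if err := json.Unmarshal(planJSON, &p); err != nil {
		return nil, fmt.Errorf("decode plan: %w", err)
	}
	var msgs []string
	for _, rc := range p.ResourceChanges {
		if rc.Type != "aws_instance" {
			continue // the rule only covers EC2 instances
		}
		for _, tag := range requiredTags {
			if _, ok := rc.Change.After.Tags[tag]; !ok {
				msgs = append(msgs, fmt.Sprintf(
					"Instance %s is missing required tag %s", rc.Address, tag))
			}
		}
	}
	return msgs, nil
}

func main() {
	// Hand-written stand-in for plan.json with one under-tagged instance.
	planJSON := []byte(`{"resource_changes":[{"address":"aws_instance.app","type":"aws_instance","change":{"after":{"tags":{"Owner":"team-a","Service":"web"}}}}]}`)
	msgs, err := missingTagViolations(planJSON)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, m := range msgs {
		fmt.Println("FAIL -", m)
	}
}
```

The Rego version stays preferable in practice: policies live as data beside the code, and conftest handles loading and reporting.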

More complex examples are possible too, like “only certain instance types are allowed in production.”

# Written with `import rego.v1`, which needs OPA 0.59 or later; older
# conftest releases need the legacy `deny[msg] { ... }` form instead.
package main

import rego.v1

allowed_prod_types := {"m5.large", "m5.xlarge", "r5.large"}

deny contains msg if {
    some resource in input.resource_changes
    resource.type == "aws_instance"
    resource.change.after.tags.Environment == "prod"

    instance_type := resource.change.after.instance_type
    not allowed_prod_types[instance_type]

    msg := sprintf(
        "Instance type %v is not allowed in production. Allowed: %v",
        [instance_type, allowed_prod_types]
    )
}

Codifying organizational policies lets anyone discover violations proactively. Questions like “What were the production rules again?” disappear.

Integration Testing — Terratest

The heaviest test. It actually creates cloud resources and verifies they work as intended. Written in Go.

// test/vpc_test.go
package test

import (
    "fmt"
    "testing"
    "time" // needed for time.Now() in the environment name below

    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/assert"
)

func TestVpcModule(t *testing.T) {
    t.Parallel()

    terraformOptions := &terraform.Options{
        TerraformDir: "../modules/vpc",

        Vars: map[string]interface{}{
            "cidr_block":  "10.99.0.0/16",
            "environment": fmt.Sprintf("test-%d", time.Now().Unix()),
        },
    }

    // Always clean up when test ends
    defer terraform.Destroy(t, terraformOptions)

    // Run apply
    terraform.InitAndApply(t, terraformOptions)

    // Verify outputs
    vpcId := terraform.Output(t, terraformOptions, "vpc_id")
    assert.Regexp(t, "^vpc-", vpcId)

    subnets := terraform.OutputList(t, terraformOptions, "public_subnet_ids")
    assert.Equal(t, 2, len(subnets))
}

This test actually creates a VPC and subnets, verifies outputs match expectations, and cleanly deletes everything at the end.

cd test
go test -v -timeout 30m
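One detail in the test is worth a closer look: the environment name is built from a Unix timestamp so that repeated runs don’t collide. With t.Parallel(), two tests that start within the same second would still get the same name. A minimal sketch of a more collision-resistant helper follows; uniqueName is a hypothetical function, not part of Terratest (which ships its own random.UniqueId helper for this purpose).

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// uniqueName returns prefix, a dash, and 8 random hex characters, so that
// parallel test runs get distinct resource names regardless of start time.
func uniqueName(prefix string) (string, error) {
	buf := make([]byte, 4) // 4 random bytes encode to 8 hex characters
	if _, err := rand.Read(buf); err != nil {
		return "", fmt.Errorf("read random bytes: %w", err)
	}
	return prefix + "-" + hex.EncodeToString(buf), nil
}

func main() {
	name, err := uniqueName("test")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The suffix differs on every run.
	fmt.Println(name)
}
```

In the test above, the result would take the place of the fmt.Sprintf call that builds the "environment" variable.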

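The trade-offs are extreme.

Advantages: it verifies actual behavior against real cloud resources, which no plan-only check can do.

Disadvantages: it is slow (a run can take tens of minutes), the resources it creates cost money, and the tests have to be written in Go.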

That’s why Terratest is applied only to frequently used core modules. Looking at how many Terratest suites official open-source modules maintain gives you a good benchmark. Applying it to all code has a poor cost-to-benefit ratio.

terraform test — Built-in Test Framework

Starting from Terraform 1.6, there’s a native test framework. You can write tests in HCL alone, without Go.

# tests/vpc.tftest.hcl
run "valid_cidr" {
  command = plan

  variables {
    cidr_block  = "10.99.0.0/16"
    environment = "test"
  }

  assert {
    condition     = aws_vpc.this.cidr_block == "10.99.0.0/16"
    error_message = "VPC CIDR must match input value"
  }
}

run "creates_two_subnets_by_default" {
  command = plan

  variables {
    cidr_block  = "10.99.0.0/16"
    environment = "test"
  }

  assert {
    condition     = length(aws_subnet.public) == 2
    error_message = "Should create 2 subnets for the default 2 AZs"
  }
}

Run like this.

terraform test

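With command = plan, the assertions are evaluated against the plan only; nothing is applied. No actual resources are created, so it’s fast and free. With command = apply (the default when command is omitted), Terraform creates real resources for the test and destroys them when the run finishes, so costs are incurred.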

This built-in framework is ideal for simple validations like “do outputs change as expected with inputs” and “does conditional logic behave correctly.” Complex integration tests still favor Terratest, but everyday module testing is well-served by the built-in framework.

pre-commit Hooks — Catch Issues Locally First

Running basic checks at commit time locally, before CI, gives much faster feedback. Use the pre-commit tool.

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/antonbabenko/pre-commit-terraform
    rev: v1.88.4
    hooks:
      - id: terraform_fmt
      - id: terraform_validate
      - id: terraform_tflint
        args:
          - --args=--only=terraform_required_version
          - --args=--only=terraform_required_providers
      - id: terraform_tfsec
      - id: terraform_docs
        args:
          - --args=--output-file README.md

Installation and activation.

# Install (macOS)
brew install pre-commit

# Activate hooks
pre-commit install

# Manual run (all files)
pre-commit run --all-files

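Now every git commit automatically runs fmt, validate, tflint, and tfsec, and regenerates the README with terraform-docs. The hooks call the underlying binaries, so terraform, tflint, tfsec, and terraform-docs need to be installed locally. If a check fails, the commit is rejected, so problems are caught on the developer’s laptop before they reach CI.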

flowchart LR
    Edit["Edit code"] --> Commit["git commit"]
    Commit --> Hook["pre-commit hook"]
    Hook --> Check{"Checks pass?"}
    Check -->|"Fail"| Fix["Fix and re-commit"]
    Check -->|"Pass"| Push["git push"]
    Push --> CI["CI Pipeline\n(heavier tests)"]

tflint — Linter

tflint is a Terraform-specific linter. It covers far more than terraform validate does, including provider-specific rules.

brew install tflint
tflint --init
tflint

You can enable AWS provider rulesets via .tflint.hcl configuration.

plugin "aws" {
  enabled = true
  version = "0.30.0"
  source  = "github.com/terraform-linters/tflint-ruleset-aws"
}

rule "terraform_unused_declarations" {
  enabled = true
}

rule "terraform_deprecated_interpolation" {
  enabled = true
}

tflint catches invalid instance types, unused variables, and more. With the AWS ruleset’s deep checking enabled (which needs AWS credentials), it can also flag AMI IDs that don’t exist.

Full Pipeline Configuration

Here’s an example integrating all the tools covered so far into CI.

# .github/workflows/terraform-ci.yml
name: Terraform CI

on:
  pull_request:
    paths: ['**/*.tf', '**/*.tftest.hcl']

jobs:
  checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.8.0

      - name: Format
        run: terraform fmt -check -recursive

      - name: Init (all envs/*)
        run: |
          for dir in envs/*; do
            (cd "$dir" && terraform init -backend=false)
          done

      - name: Validate
        run: |
          for dir in envs/*; do
            (cd "$dir" && terraform validate)
          done

      - name: Set up tflint
        uses: terraform-linters/setup-tflint@v4

      - name: tflint
        run: |
          tflint --init
          tflint --recursive

      - name: tfsec
        uses: aquasecurity/tfsec-action@v1.0.3

      - name: Checkov
        uses: bridgecrewio/checkov-action@master
        with:
          directory: .
          framework: terraform
          soft_fail: false

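      - name: OPA Policy Check
        # Assumes conftest is already on PATH and cloud credentials are configured.
        # plan needs a real backend, so init is re-run here without -backend=false.
        run: |
          for dir in envs/*; do
            (cd "$dir" && terraform init -input=false && terraform plan -input=false -out=tfplan && terraform show -json tfplan > plan.json)
            conftest test --policy policies/ "$dir/plan.json"
          done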

      - name: terraform test
        run: |
          for mod in modules/*; do
            if [ -d "$mod/tests" ]; then
              (cd "$mod" && terraform test)
            fi
          done

Each stage has different costs. fmt and validate are fastest (seconds), static analysis comes next (tens of seconds), then OPA (minutes, including plan). Terratest, which this workflow leaves out, is the heaviest (tens of minutes). Ordering the pipeline from cheapest to most expensive ensures fast failure and saves overall time.

Tool Selection Guide

Many tools have been covered. What should you use?

Purpose | Essential | Recommended | Optional
Syntax/style | terraform fmt, validate | tflint | -
Static security analysis | - | tfsec or Checkov | Both
Organization policy | - | OPA/conftest | Sentinel (Terraform Cloud)
Unit testing | - | terraform test | -
Integration testing | - | - | Terratest (core modules only)
Local integration | - | pre-commit | -

You don’t need to adopt everything at once. Build up incrementally.

  1. Start with fmt, validate, and pre-commit setup
  2. Add tfsec or Checkov to CI
  3. Add terraform test for frequently changing core modules
  4. Introduce OPA when organizational policies become clear
  5. Terratest only for truly critical modules

Infrastructure has quality standards too. “If it works, it’s fine” might pass once, but it won’t hold up at scale. Automated testing and policy validation form the foundation of trust in infrastructure code.

In the next part, we wrap up the series by compiling practical patterns and pitfalls. Directory structures, tagging strategies, common incidents, and large-scale migration — all in one sweep.

Part 15: Practical Patterns and Pitfalls

