Testing Infrastructure Code#
Infrastructure code has a unique testing challenge: the thing you are testing is expensive to instantiate. You cannot spin up a VPC, an RDS instance, and an EKS cluster for every pull request and tear them down five minutes later without significant cost and time. But you also cannot ship untested infrastructure changes to production without risk.
The solution is the same as in software engineering: a testing pyramid. Fast, cheap tests at the bottom catch most errors. Slower, expensive tests at the top catch the rest. The key is knowing what to test at which level.
The Infrastructure Testing Pyramid#
   ┌─────────────────────┐
   │ Integration         │  Real cloud resources
   │ (Terratest)         │  Expensive, slow (10-30 min)
   │ Run: nightly        │  Catches: actual API behavior
  ┌┴─────────────────────┴┐
  │ Plan-Based            │  Real plan output, no apply
  │ (Conftest/OPA)        │  Moderate (1-3 min)
  │ Run: every PR         │  Catches: policy violations
 ┌┴───────────────────────┴┐
 │ Cost Estimation         │  Plan output → cost analysis
 │ (Infracost)             │  Moderate (1-2 min)
 │ Run: every PR           │  Catches: budget overruns
┌┴─────────────────────────┴┐
│ Static Analysis           │  No cloud access needed
│ (tflint, checkov,         │  Fast (seconds)
│  terraform validate)      │  Catches: syntax, config errors
│ Run: every commit         │
└───────────────────────────┘

Each level catches different classes of errors. Skipping a level means those errors reach the next level (which is slower and more expensive to run) or reach production.
Level 1: Static Analysis (Seconds)#
Static analysis checks code without executing it or connecting to any cloud API. It runs on every commit in pre-commit hooks or early in CI.
terraform validate#
Checks HCL syntax and basic resource configuration:
terraform init -backend=false # initialize providers without backend
terraform validate # check syntax and resource references

Catches: missing required arguments, invalid resource types, broken references, type mismatches. Does not catch: values that are syntactically valid but logically wrong.
tflint#
Catches provider-specific errors that validate misses:
tflint --init # download provider-specific rulesets
tflint --recursive # lint all modules

# .tflint.hcl
plugin "aws" {
  enabled = true
  version = "0.30.0"
  source  = "github.com/terraform-linters/tflint-ruleset-aws"
}

rule "terraform_naming_convention" {
  enabled = true
  format  = "snake_case"
}

rule "terraform_documented_variables" {
  enabled = true
}

Catches: invalid instance types (t3.superxlarge does not exist), deprecated resource arguments, naming convention violations, variables without descriptions.
checkov#
Scans for security misconfigurations and compliance issues:
checkov -d . --framework terraform

Catches: unencrypted S3 buckets, public security groups, missing logging, databases without backups, KMS keys without rotation. Checkov has 2,500+ built-in policies covering CIS benchmarks, SOC2, PCI-DSS, and HIPAA.
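Not every built-in policy fits every codebase. Once a finding has been reviewed and accepted, it is better to suppress that specific check than to ignore the whole scan. A minimal sketch (the check IDs are examples only; use the IDs Checkov reports for your code):

# Skip specific, reviewed checks by ID rather than disabling the scan
checkov -d . --framework terraform --skip-check CKV_AWS_18,CKV_AWS_144

Checkov also supports inline skip comments placed next to the offending resource, which keeps the justification in the code under review.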
terraform fmt#
Not a test per se, but enforces consistent formatting:
terraform fmt -check -recursive -diff

Run this first in CI. If formatting fails, the PR has style issues that should be fixed before deeper analysis.
Static Analysis Pipeline#
#!/bin/bash
# pre-commit or CI script
set -e
echo "=== Format check ==="
terraform fmt -check -recursive -diff
echo "=== Validate ==="
terraform init -backend=false
terraform validate
echo "=== tflint ==="
tflint --init
tflint --recursive
echo "=== Checkov ==="
checkov -d . --framework terraform --quiet
echo "=== All static checks passed ==="

Total runtime: 5-30 seconds. No cloud credentials needed. No API calls.
Level 2: Cost Estimation (1-2 Minutes)#
Cost estimation runs terraform plan and analyzes the planned resources against pricing data. It catches budget surprises before they reach production.
Infracost#
# Generate plan
terraform plan -out=tfplan
terraform show -json tfplan > plan.json
# Estimate cost
infracost breakdown --path=plan.json --format=json --out-file=cost.json
infracost output --path=cost.json --format=table

Output example:
Project: infrastructure/compute

 Name                              Monthly Qty  Unit       Monthly Cost
──────────────────────────────────────────────────────────────────────
 aws_instance.app
 ├─ Instance usage (t3.large)             730  hours            $60.74
 ├─ root_block_device
 │  └─ Storage (gp3, 50 GB)                50  GB                 $4.00
 └─ ebs_block_device[0]
    └─ Storage (gp3, 200 GB)              200  GB                $16.00
 aws_rds_cluster.main
 ├─ Aurora capacity units                 730  ACU-hours         $87.60
 └─ Storage                                50  GB                 $5.00

 OVERALL TOTAL                                                  $173.34

Cost Guardrails#
Add policy checks for cost:
# Fail if monthly cost exceeds threshold
COST=$(jq '.totalMonthlyCost | tonumber' cost.json)
THRESHOLD=500
if (( $(echo "$COST > $THRESHOLD" | bc -l) )); then
  echo "ERROR: Estimated monthly cost \$$COST exceeds threshold \$$THRESHOLD"
  exit 1
fi

What Cost Estimation Catches#
| Issue | Example | Without Cost Check |
|---|---|---|
| Oversized instances | r5.4xlarge instead of t3.large | Discovered on first bill |
| Missing spot/reserved pricing | On-demand for always-on workloads | Overpaying by 40-70% |
| Storage accumulation | 500 GB EBS per instance × 20 instances | $800/mo in EBS alone |
| NAT gateway surprise | NAT per AZ + high throughput | $100-500/mo unplanned |
| Data transfer | Cross-region replication, internet egress | Largest surprise cost |
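A hard threshold catches the worst surprises, but for routine changes the more useful signal is the cost delta a pull request introduces. Infracost can diff a plan against a saved baseline; a minimal sketch, assuming CI stores a baseline JSON from the default branch (the file name is an assumption, and flag names should be checked against your Infracost version):

# On the default branch: save the current cost breakdown as a baseline
infracost breakdown --path=plan.json --format=json --out-file=infracost-base.json

# On a pull request: report the change in monthly cost relative to that baseline
infracost diff --path=plan.json --compare-to=infracost-base.json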
Level 3: Plan-Based Testing (1-3 Minutes)#
Plan-based testing runs terraform plan, converts the output to JSON, and evaluates it against policy rules. The plan is never applied — no resources are created.
Conftest with OPA#
# Generate plan JSON
terraform plan -out=tfplan
terraform show -json tfplan > plan.json
# Test against policies
conftest test plan.json --policy policies/

Policy examples:
# policies/tags.rego
package main
deny[msg] {
  resource := input.resource_changes[_]
  actions := resource.change.actions
  actions[_] == "create"

  # Check for required tags
  tags := resource.change.after.tags
  not tags.Environment

  msg := sprintf("Resource %s missing 'Environment' tag", [resource.address])
}

deny[msg] {
  resource := input.resource_changes[_]
  actions := resource.change.actions
  actions[_] == "create"

  tags := resource.change.after.tags
  not tags.ManagedBy

  msg := sprintf("Resource %s missing 'ManagedBy' tag", [resource.address])
}

# policies/security.rego
package main
deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "aws_security_group_rule"
  resource.change.after.cidr_blocks[_] == "0.0.0.0/0"
  resource.change.after.type == "ingress"
  resource.change.after.from_port != 443
  resource.change.after.from_port != 80

  msg := sprintf(
    "Security group rule %s allows 0.0.0.0/0 on port %d (only 80 and 443 allowed)",
    [resource.address, resource.change.after.from_port]
  )
}

# policies/cost.rego
package main
deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "aws_instance"
  instance_type := resource.change.after.instance_type

  expensive := {"r5.4xlarge", "r5.8xlarge", "m5.8xlarge", "c5.9xlarge"}
  expensive[instance_type]

  msg := sprintf(
    "Instance %s uses expensive type %s — requires approval",
    [resource.address, instance_type]
  )
}

What Plan-Based Testing Catches#
| Category | Examples |
|---|---|
| Missing tags | Resources created without required tags |
| Security violations | Open security groups, unencrypted resources, public access |
| Naming violations | Resources not matching naming conventions |
| Size constraints | Instances larger than approved sizes |
| Destructive changes | Resources being replaced or destroyed (flag for review) |
| Drift-related changes | Resources changing that were not in the code diff |
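The destructive-changes row deserves a concrete guard: a plan that replaces or destroys resources should never merge without a human look. A minimal sketch that counts delete actions straight from the plan JSON with jq (the exit behavior is an assumption to adapt to your CI):

# Count planned actions that delete a resource (covers destroys and replacements)
DESTROYS=$(jq '[.resource_changes[] | select(.change.actions | index("delete"))] | length' plan.json)
if [ "$DESTROYS" -gt 0 ]; then
  echo "WARNING: plan destroys or replaces $DESTROYS resource(s); flagging for manual review"
  exit 1
fi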
Level 4: Integration Testing (10-30 Minutes)#
Integration testing creates real infrastructure, validates it works, and tears it down. This is expensive in time and money — reserve it for nightly runs, pre-release validation, or module certification.
Terratest#
package test

import (
    "testing"

    "github.com/gruntwork-io/terratest/modules/aws"
    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/assert"
)

func TestNetworkingModule(t *testing.T) {
    t.Parallel()

    opts := &terraform.Options{
        TerraformDir: "../infrastructure/networking",
        Vars: map[string]interface{}{
            "environment": "test",
            "vpc_cidr":    "10.99.0.0/16",
        },
    }

    // Destroy runs even if the test fails partway through
    defer terraform.Destroy(t, opts)
    terraform.InitAndApply(t, opts)

    // Verify the VPC was created
    vpcId := terraform.Output(t, opts, "vpc_id")
    assert.Contains(t, vpcId, "vpc-")

    // Verify the subnets exist and live in the correct VPC
    subnetIds := terraform.OutputList(t, opts, "private_subnet_ids")
    assert.Equal(t, 2, len(subnetIds))
    for _, subnetId := range subnetIds {
        subnet := aws.GetSubnet(t, subnetId, "us-east-1")
        assert.Equal(t, vpcId, subnet.VpcId)
    }

    // Verify DNS hostnames are enabled on the VPC
    vpc := aws.GetVpcById(t, vpcId, "us-east-1")
    assert.True(t, vpc.EnableDnsHostnames)
}

When to Run Integration Tests#
| Trigger | What to Test | Why |
|---|---|---|
| Nightly scheduled run | All modules | Catch provider API changes, drift in AMI IDs, expired certificates |
| Before tagging a module release | The module being released | Verify it works against real APIs before consumers adopt it |
| After a major provider upgrade | All modules using that provider | Verify compatibility with new API behaviors |
| After a significant refactoring | The refactored module | Verify the refactoring did not break functionality |
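Whichever trigger applies, Terratest runs through the standard Go test tooling. A typical invocation looks like the following sketch; the 45-minute timeout and the test directory are assumptions, chosen because the default go test timeout of 10 minutes is usually too short for an apply-and-destroy cycle:

cd test
go test -v -timeout 45m -run TestNetworkingModule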
Integration Test Cost Management#
- Run in a dedicated test account with billing alerts
- Use the smallest viable resource sizes (t3.micro, db.t3.micro)
- Set aggressive timeouts: defer terraform.Destroy() ensures cleanup even on failure
- Tag all test resources with Environment = "test" and a TTL tag
- Run a nightly sweeper that destroys any resources older than 24 hours in the test account (a sketch follows this list)
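The sweeper can be as simple as a scheduled script. A minimal sketch for EC2, assuming GNU date, test instances tagged Environment = "test", and credentials scoped to the test account; the same pattern extends to RDS, load balancers, and other resource types:

#!/bin/bash
# Nightly sweeper sketch: terminate EC2 instances tagged Environment=test
# that have been running for more than 24 hours in the test account.
# ISO-8601 timestamps compare correctly as strings.
set -euo pipefail

CUTOFF=$(date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%S)

aws ec2 describe-instances \
  --filters "Name=tag:Environment,Values=test" "Name=instance-state-name,Values=running" \
  --query 'Reservations[].Instances[].{Id:InstanceId,Launch:LaunchTime}' \
  --output json \
  | jq -r --arg cutoff "$CUTOFF" '.[] | select(.Launch < $cutoff) | .Id' \
  | while read -r id; do
      echo "Terminating stale test instance $id"
      aws ec2 terminate-instances --instance-ids "$id"
    done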
Choosing What to Test Where#
| What You Want to Verify | Test Level | Tool | Cost |
|---|---|---|---|
| Valid HCL syntax | Static | terraform validate | Free, instant |
| Provider-specific config errors | Static | tflint | Free, instant |
| Security misconfigurations | Static | checkov | Free, instant |
| Required tags present | Plan-based | conftest | Free, 1-3 min |
| No open security groups | Plan-based | conftest | Free, 1-3 min |
| No accidental destroys | Plan-based | conftest | Free, 1-3 min |
| Monthly cost within budget | Plan-based | infracost | Free tier, 1-2 min |
| Resources actually work | Integration | terratest | Cloud costs, 10-30 min |
| Cross-resource connectivity | Integration | terratest | Cloud costs, 10-30 min |
| Module output contracts | Integration | terratest | Cloud costs, 10-30 min |
The 80/20 rule: Static analysis and plan-based testing catch 80% of issues at 1% of the cost. Integration testing catches the remaining 20% at 99% of the cost. Invest heavily in levels 1-3 before spending on level 4.
The Agent Testing Workflow#
When an agent writes or modifies Terraform:
1. Write the changes
2. Run: terraform fmt (fix formatting)
3. Run: terraform validate (catch syntax errors)
4. Run: tflint (catch provider-specific issues)
5. Run: checkov (catch security issues)
─── Fix any errors found in steps 2-5 ───
6. Run: terraform plan -out=tfplan
7. Run: conftest test (policy checks on plan)
8. Run: infracost breakdown (cost estimate)
9. Present plan summary + cost estimate to human
10. WAIT for approval
11. On approval: terraform apply tfplan

Steps 2-5 are automated and self-correcting — the agent fixes issues it finds. Steps 6-8 produce information for the human. Steps 9 and 10 are the safety gate. Steps 2-8 together take 2-5 minutes and catch the vast majority of issues before a human ever sees the plan.
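A minimal sketch of steps 6-9 as one script an agent or CI job can run; it produces the plan, policy results, and cost estimate as review artifacts and deliberately stops short of apply (file names and the policies/ directory are assumptions):

#!/bin/bash
set -euo pipefail

terraform plan -out=tfplan                    # step 6: plan, never auto-applied
terraform show -json tfplan > plan.json

conftest test plan.json --policy policies/    # step 7: policy checks on the plan
infracost breakdown --path=plan.json --format=table | tee cost.txt   # step 8: cost estimate

echo "Review the plan and cost.txt; apply only after approval: terraform apply tfplan"   # step 9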