Terraform Code Quality#

Writing Terraform that works is easy. Writing Terraform that is safe, maintainable, and comprehensible to the next person (or agent) is harder. Most quality problems are not bugs — they are patterns that work today but create pain tomorrow: hardcoded IDs that break in a new account, missing lifecycle rules that cause accidental data loss, modules that are too big to understand or too small to justify their existence.

This article provides concrete heuristics for evaluating Terraform code quality — usable by agents reviewing their own output or reviewing existing code before modification.

Variables, Locals, and Hardcoded Values#

The decision of what to parameterize and what to hardcode is a judgment call. The two common mistakes are parameterizing everything and parameterizing nothing.

When to Use Each#

Pattern	Use When	Example
Hardcoded value	The value is a fixed property of the resource type or the specific deployment, and changing it would require understanding the code anyway	`protocol = "tcp"`, `engine = "postgres"`
Local value	The value is computed from other values or used in multiple places within the same module	`local.name_prefix = "${var.project}-${var.environment}"`
Input variable	The value differs between environments or is a meaningful choice the caller should make	`var.instance_type`, `var.environment`, `var.vpc_cidr`
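As a rough illustration of the table, the sketch below uses all three patterns in one security group. The variable names (`project`, `environment`, `vpc_id`, `allowed_cidr`) and the choice of port are assumptions made for this example, not part of any existing module.

```hcl
variable "project" {
  type        = string
  description = "Project name used in resource names."
}

variable "environment" {
  type        = string
  description = "Deployment environment, for example dev or prod."
}

variable "vpc_id" {
  type        = string
  description = "VPC to create the security group in (passed in, not hardcoded)."
}

variable "allowed_cidr" {
  type        = string
  description = "CIDR range allowed to reach the database."
}

locals {
  # Local: derived from inputs and reused for naming.
  name_prefix = "${var.project}-${var.environment}"
}

resource "aws_security_group" "db" {
  name_prefix = "${local.name_prefix}-db-"
  vpc_id      = var.vpc_id

  ingress {
    from_port   = 5432               # hardcoded: fixed by the choice of PostgreSQL
    to_port     = 5432
    protocol    = "tcp"              # hardcoded: a property of the protocol in use
    cidr_blocks = [var.allowed_cidr] # input: differs per environment
  }
}
```

Only the values a caller might reasonably change appear in the variable list; everything else is readable directly in the resource.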

Anti-Pattern: Over-Parameterization#

# Bad: everything is a variable, even things that never change
variable "protocol" {
  type    = string
  default = "tcp"
}

variable "engine_family" {
  type    = string
  default = "postgres"
}

variable "enable_dns" {
  type    = bool
  default = true
}

# These add 20 lines of variable declarations for values that
# are never overridden by any caller. They clutter variables.tf
# and create the illusion that they are configurable when they
# are not.

# Good: hardcode what never changes
resource "aws_vpc" "main" {
  cidr_block           = var.cidr            # genuinely varies per environment
  enable_dns_hostnames = true                 # always true for our use case
  enable_dns_support   = true                 # always true for our use case
}

resource "aws_db_instance" "main" {
  engine               = "postgres"           # fixed choice, not configurable
  engine_version       = var.engine_version   # varies for upgrade testing
  instance_class       = var.instance_class   # varies per environment
}

Heuristic: If a variable has a default that no caller ever overrides, consider hardcoding it. If every environment uses the same value, it is not a variable — it is a constant.

Anti-Pattern: Under-Parameterization#

# Bad: environment-specific values hardcoded
resource "aws_instance" "app" {
  ami           = "ami-0abc123def456"  # this AMI is us-east-1 specific
  instance_type = "t3.large"           # prod might need r5.xlarge
  subnet_id     = "subnet-0abc123"     # hardcoded subnet ID — breaks in any other VPC
}

Heuristic: Any AWS resource ID (ami-, subnet-, vpc-, sg-) hardcoded in a .tf file is almost always wrong. Use data sources to look them up dynamically or pass them as variables.
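One possible way to remove the hardcoded subnet ID is to look subnets up by VPC and tag. This is a sketch only: the `Tier = "private"` tag is an assumed tagging convention, and the `vpc_id`, `ami_id`, and `instance_type` variables are hypothetical inputs.

```hcl
variable "vpc_id" {
  type        = string
  description = "VPC whose subnets should be searched."
}

variable "ami_id" {
  type        = string
  description = "AMI to launch (or use a data source lookup instead)."
}

variable "instance_type" {
  type        = string
  description = "EC2 instance type, varies per environment."
}

# Find subnets in the given VPC that carry the assumed Tier tag.
data "aws_subnets" "private" {
  filter {
    name   = "vpc-id"
    values = [var.vpc_id]
  }

  tags = {
    Tier = "private" # assumption: subnets are tagged this way
  }
}

resource "aws_instance" "app" {
  ami           = var.ami_id
  instance_type = var.instance_type
  subnet_id     = data.aws_subnets.private.ids[0] # fails at plan time if no subnet matches
}
```

If the lookup criteria are not stable across accounts, passing the subnet ID in as a variable is the simpler option.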

Resource Naming and Tagging#

Naming Resources#

Resource names in Terraform (the label after the resource type) should be descriptive and consistent:

# Good: descriptive, intent is clear
resource "aws_subnet" "private_a" { ... }
resource "aws_subnet" "private_b" { ... }
resource "aws_subnet" "public_a" { ... }

# Bad: generic names
resource "aws_subnet" "this" { ... }    # which subnet?
resource "aws_subnet" "main" { ... }    # if there are 4 subnets, which is "main"?
resource "aws_subnet" "subnet1" { ... } # numbered names lose meaning

# Acceptable for singletons
resource "aws_vpc" "main" { ... }       # there is only one VPC, "main" is fine
resource "aws_eks_cluster" "main" { ... }

Heuristic: If a resource type appears multiple times, each instance needs a name that distinguishes it. If a resource type appears once, main or this is acceptable.

Tagging Strategy#

Every taggable resource should have a consistent set of tags:

locals {
  common_tags = {
    Environment = var.environment
    Project     = var.project
    ManagedBy   = "terraform"
    Owner       = var.owner
  }
}

resource "aws_vpc" "main" {
  cidr_block = var.cidr
  tags = merge(local.common_tags, {
    Name = "${var.project}-${var.environment}-vpc"
  })
}

Required tags (enforce with OPA or tflint):

Name — human-readable identifier
Environment — which environment this belongs to
ManagedBy — “terraform” (helps identify manually-created resources)
Project — which project or team owns this

Heuristic: If you see a taggable resource without a ManagedBy = "terraform" tag, flag it. Without this tag, someone looking at the AWS console or a cost report cannot tell Terraform-managed resources from manually created ones, which slows down drift investigations.
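The AWS provider also offers a `default_tags` block that applies tags to every resource created through that provider, which can reduce the number of `merge()` calls. The sketch below assumes a `region` variable in addition to the variables used above; it is an alternative to the `local.common_tags` pattern, not a replacement the article prescribes.

```hcl
provider "aws" {
  region = var.region # assumed variable, not defined in the article

  # Applied to every taggable resource managed through this provider.
  default_tags {
    tags = {
      Environment = var.environment
      Project     = var.project
      ManagedBy   = "terraform"
      Owner       = var.owner
    }
  }
}

resource "aws_vpc" "main" {
  cidr_block = var.cidr

  # Only the resource-specific tag is needed here.
  tags = {
    Name = "${var.project}-${var.environment}-vpc"
  }
}
```

A few resource types handle tags differently, so it is worth checking the plan output to confirm the tags land where expected.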

Lifecycle Rules#

Lifecycle rules prevent accidental destruction and control replacement behavior. Missing lifecycle rules on stateful resources are a common source of data loss in Terraform.

When to Use prevent_destroy#

resource "aws_db_instance" "main" {
  # ... configuration ...

  lifecycle {
    prevent_destroy = true
  }
}

Add prevent_destroy = true to:

Databases (RDS, Aurora, DynamoDB tables with data)
S3 buckets with important data
EFS filesystems
Encryption keys (KMS)
DNS zones with external references

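Heuristic: Any resource that stores data or state that cannot be recreated should have prevent_destroy = true. Terraform then rejects any plan that would destroy the resource, so a human has to remove the lifecycle rule explicitly first. One caveat: the protection lives in the resource block, so deleting the whole block from the configuration removes the safeguard along with it.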
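Because prevent_destroy is enforced only by Terraform, it can be paired with protection enforced by the service itself. The following is a sketch for RDS, assuming the `project`, `environment`, and `instance_class` variables used elsewhere in this article; the storage size and username are placeholder values.

```hcl
resource "aws_db_instance" "main" {
  identifier                  = "${var.project}-${var.environment}-db"
  engine                      = "postgres"
  instance_class              = var.instance_class
  allocated_storage           = 20    # placeholder size in GiB
  username                    = "app" # placeholder master username
  manage_master_user_password = true  # let RDS store the password in Secrets Manager

  # Enforced by the RDS API, even for deletions made outside Terraform.
  deletion_protection = true

  # If the instance is ever deleted deliberately, keep a final snapshot.
  skip_final_snapshot       = false
  final_snapshot_identifier = "${var.project}-${var.environment}-final"

  lifecycle {
    # Enforced by Terraform at plan time.
    prevent_destroy = true
  }
}
```

The two settings cover different failure modes: `prevent_destroy` stops a bad plan, while `deletion_protection` stops a delete call regardless of where it comes from.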

When to Use create_before_destroy#

resource "aws_security_group" "web" {
  # ... configuration ...

  lifecycle {
    create_before_destroy = true
  }
}

Use for resources that other resources depend on where downtime during replacement is unacceptable. Security groups, target groups, and launch templates are common candidates.
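With create_before_destroy, the old and new objects exist at the same time for a moment, so a fixed unique name would collide. A common way around this is `name_prefix`, sketched below; the `vpc_id` variable and the naming scheme are assumptions for the example.

```hcl
resource "aws_security_group" "web" {
  # name_prefix lets AWS generate a unique suffix, so the replacement
  # can be created while the old group still exists.
  name_prefix = "${var.project}-${var.environment}-web-"
  vpc_id      = var.vpc_id

  lifecycle {
    create_before_destroy = true
  }
}
```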

When to Use ignore_changes#

resource "aws_autoscaling_group" "main" {
  desired_capacity = 3  # initial value, but ASG scales this dynamically

  lifecycle {
    ignore_changes = [desired_capacity]
  }
}

Use for attributes that are managed outside of Terraform after creation — ASG scaling, ECS desired count, tags managed by AWS auto-tagging. Be cautious: ignoring changes means Terraform will never correct drift on that attribute.

Heuristic: If terraform plan repeatedly shows changes to an attribute you did not modify in code, it is a candidate for ignore_changes. But first understand why it is changing — the change might indicate a real problem.
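The same pattern applies to the ECS desired count mentioned above. This is a minimal sketch; `ecs_cluster_arn` and `task_definition_arn` are hypothetical variables standing in for resources defined elsewhere.

```hcl
resource "aws_ecs_service" "app" {
  name            = "${var.project}-${var.environment}-app"
  cluster         = var.ecs_cluster_arn     # hypothetical input
  task_definition = var.task_definition_arn # hypothetical input
  desired_count   = 2                       # initial value only

  lifecycle {
    # Service auto scaling adjusts this after creation; without the rule,
    # every apply would reset the count back to 2.
    ignore_changes = [desired_count]
  }
}
```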

Data Sources vs Hardcoded IDs#

# Bad: hardcoded AMI ID
resource "aws_instance" "app" {
  ami = "ami-0abc123def456"  # what is this? Is it still current? Does it exist in this region?
}

# Good: data source looks it up
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"]
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd-gp3/ubuntu-noble-24.04-amd64-server-*"]
  }
}

resource "aws_instance" "app" {
  ami = data.aws_ami.ubuntu.id
}

# Bad: hardcoded account ID
resource "aws_iam_role" "app" {
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { AWS = "arn:aws:iam::123456789012:root" }
      # ↑ this breaks in any other account
    }]
  })
}

# Good: data source for current account
data "aws_caller_identity" "current" {}

resource "aws_iam_role" "app" {
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { AWS = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root" }
    }]
  })
}

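Heuristic: Any literal string matching ami-*, subnet-*, vpc-*, sg-*, an account-specific arn:*, or a 12-digit account ID in a .tf file should usually be replaced with a data source or variable. ARNs of AWS-managed policies (arn:aws:iam::aws:policy/...) contain no account ID and are a reasonable exception.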
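The literal `arn:aws:` prefix is itself partition-specific (it differs in GovCloud and China regions). As a standalone alternative to the jsonencode approach, the sketch below builds the trust policy with the `aws_iam_policy_document` data source and looks up the partition as well; the `app-` role name prefix is an assumption.

```hcl
data "aws_caller_identity" "current" {}
data "aws_partition" "current" {}

# Trust policy allowing principals in the current account to assume the role.
data "aws_iam_policy_document" "assume_role" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRole"]

    principals {
      type = "AWS"
      identifiers = [
        "arn:${data.aws_partition.current.partition}:iam::${data.aws_caller_identity.current.account_id}:root"
      ]
    }
  }
}

resource "aws_iam_role" "app" {
  name_prefix        = "app-" # assumed naming
  assume_role_policy = data.aws_iam_policy_document.assume_role.json
}
```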

Provider Pinning#

terraform {
  required_version = ">= 1.5.0, < 2.0.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"     # allows 5.x, prevents 6.0
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.25"    # allows 2.25+, prevents 3.0
    }
  }
}

Rules:

Always pin the Terraform version range
Always pin provider versions with ~> (pessimistic constraint)
Never use unversioned providers in shared modules
Commit the .terraform.lock.hcl file — it pins exact versions for reproducible builds

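Heuristic: If required_providers is missing or has no version constraint, flag it. Without a constraint or a lock file, terraform init selects the newest available provider release, and terraform init -upgrade does so even when a lock file exists. Either path can pull in a breaking major version.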

Common Anti-Patterns#

The God Module#

A single module that creates everything:

module "everything" {
  source      = "./modules/platform"
  environment = var.environment
  # 50 variables passed in
}

Detection: A module with more than 30 input variables, more than 20 resources, or more than 2 levels of child modules.

Fix: Decompose into focused modules by concern (networking, compute, database) or flatten into explicit resources.
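A decomposed layout might look roughly like the following. The module paths, input names, and the `private_subnet_ids` output are hypothetical; the point is that each module takes a handful of inputs and the dependencies between concerns are visible in the root module.

```hcl
module "network" {
  source      = "./modules/network" # hypothetical path
  environment = var.environment
  cidr        = var.cidr
}

module "database" {
  source         = "./modules/database" # hypothetical path
  environment    = var.environment
  instance_class = var.instance_class
  subnet_ids     = module.network.private_subnet_ids # assumed output name
}

module "compute" {
  source      = "./modules/compute" # hypothetical path
  environment = var.environment
  subnet_ids  = module.network.private_subnet_ids
}
```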

The God State File#

All resources in one root module, one state file:

$ terraform state list | wc -l
147

Detection: More than 50 resources in a single state file.

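Fix: Decompose into separate root modules per concern. Note that moved blocks and a plain terraform state mv only rename addresses inside one state. To transfer a resource to another state without destroying it, remove it from the old state (terraform state rm, or a removed block with destroy = false) and import it into the new one (terraform import, or an import block).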
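Handing a resource from one root module to another can be expressed declaratively with `removed` and `import` blocks in Terraform versions that support them (`import` blocks arrived in 1.5, `removed` blocks in 1.7). The sketch below uses the database from earlier examples; the import ID is a made-up placeholder and the two blocks belong to different root modules.

```hcl
# --- In the OLD root module (Terraform 1.7 or later) ---
# Delete the resource block, then add this so Terraform forgets the
# database without destroying it.
removed {
  from = aws_db_instance.main

  lifecycle {
    destroy = false
  }
}

# --- In the NEW root module (Terraform 1.5 or later) ---
# Keep a matching resource "aws_db_instance" "main" block here, and
# adopt the existing database into this state on the next apply.
import {
  to = aws_db_instance.main
  id = "example-db-identifier" # placeholder: the real DB instance identifier
}
```

Applying the old root module first and the new one second avoids a window in which both states claim the same resource. In both cases the plan should show zero resources to destroy before it is applied.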

Output Spaghetti#

Modules output values only so they can be threaded through other modules:

# Module A outputs 15 values
# Module B consumes 3 of them and outputs 10 more
# Module C consumes 5 values from A and 3 from B
# The root module wires everything

Detection: A root module where most of the code is module.X.output_Y wiring.

Fix: Use flatter structure with direct references, or use terraform_remote_state to share between independent root modules.
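A minimal sketch of sharing values between independent root modules with `terraform_remote_state` follows. The bucket name, state key, region, and the `private_subnet_ids` output are all hypothetical, and `ami_id` and `instance_type` are assumed inputs.

```hcl
# Read the outputs of the separately managed network root module.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"   # hypothetical bucket
    key    = "network/terraform.tfstate" # hypothetical key
    region = "us-east-1"
  }
}

resource "aws_instance" "app" {
  ami           = var.ami_id
  instance_type = var.instance_type

  # Only root-level outputs of the other configuration are visible here.
  subnet_id = data.terraform_remote_state.network.outputs.private_subnet_ids[0]
}
```

The consumer needs read access to the whole state object, so this works best when the producing state holds nothing more sensitive than the outputs being shared.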

The Premature Abstraction#

Wrapping a single resource in a module:

# modules/s3_bucket/main.tf — contains exactly 1 aws_s3_bucket resource
# modules/s3_bucket/variables.tf — 10 variables mirroring the resource attributes
# modules/s3_bucket/outputs.tf — outputs mirroring the resource attributes

Detection: A module directory with 1 resource that has the same inputs/outputs as the resource itself.

Fix: Use the resource directly. A module adds value when it combines multiple resources into a coherent abstraction, not when it wraps one resource.

Conditional Resource Gymnastics#

resource "aws_instance" "monitoring" {
  count = var.enable_monitoring ? 1 : 0
  # ...
}

resource "aws_security_group_rule" "monitoring_ingress" {
  count             = var.enable_monitoring ? 1 : 0
  security_group_id = var.enable_monitoring ? tolist(aws_instance.monitoring[0].vpc_security_group_ids)[0] : ""
  # ↑ this ternary is necessary because the resource may not exist
}

Detection: Multiple resources using count = var.something ? 1 : 0 with ternary references to potentially-nonexistent resources.

Fix: If a group of resources is conditionally needed, put them in a separate root module or a focused module that is either included or not. Avoid scattering conditional creation across individual resources.
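One way to apply this fix is to put the condition on a single module call, so the resources inside it can reference each other without ternaries. The module path, its inputs, and its `instance_id` output are hypothetical.

```hcl
module "monitoring" {
  source = "./modules/monitoring" # hypothetical module path
  count  = var.enable_monitoring ? 1 : 0

  vpc_id    = var.vpc_id    # assumed input
  subnet_id = var.subnet_id # assumed input
}

output "monitoring_instance_id" {
  description = "ID of the monitoring instance, or null when monitoring is disabled."
  # one() returns the single element of the list, or null if the list is empty.
  value = one(module.monitoring[*].instance_id)
}
```

The conditional now appears once, and everything inside `./modules/monitoring` is written as if it always exists.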

Code Review Heuristic Checklist#

When reviewing Terraform code (your own or existing):

No hardcoded resource IDs (AMIs, subnets, VPCs, ARNs)
Providers pinned with version constraints
Stateful resources have prevent_destroy = true
All taggable resources have consistent tags including ManagedBy = "terraform"
Variables have descriptions (not blank descriptions)
No modules wrapping single resources (premature abstraction)
Module nesting is at most 2 levels deep (not 3+)
State file contains fewer than 50 resources (if more, consider splitting)
No count on resources that should use for_each (index-based addressing is fragile)
No sensitive values in outputs without sensitive = true
Backend configuration specifies encryption (encrypt = true for S3)
.terraform.lock.hcl committed (reproducible builds)
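To illustrate the count versus for_each item in the checklist: with count over a list, removing the first element shifts every index and Terraform plans to replace the remaining resources. Keying by a stable value avoids that. The sketch below assumes a map of availability zone to CIDR block and a `vpc_id` input.

```hcl
variable "vpc_id" {
  type        = string
  description = "VPC in which to create the subnets."
}

variable "private_subnets" {
  type        = map(string)
  description = "Map of availability zone to CIDR block for private subnets."
}

resource "aws_subnet" "private" {
  # Each instance is addressed by its key, for example
  # aws_subnet.private["us-east-1a"], so removing one entry
  # leaves the others untouched.
  for_each = var.private_subnets

  vpc_id            = var.vpc_id
  availability_zone = each.key
  cidr_block        = each.value
}
```

A caller might pass `private_subnets = { "us-east-1a" = "10.0.1.0/24", "us-east-1b" = "10.0.2.0/24" }`; the zone names and CIDR ranges here are illustrative only.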