# Agent-Oriented Terraform
Most Terraform code is written by humans for humans. It favors abstraction, DRY principles, and deep module nesting — patterns that make sense when a human maintains a mental model of the codebase. Agents do not maintain mental models. They read code fresh each time, trace references to resolve dependencies, and reason about the full resource graph in a single context window.
The patterns that make Terraform elegant for humans make it expensive for agents. Deep module nesting multiplies the files an agent must read. Variable threading through three layers of modules hides dependencies behind indirection. Complex for_each over maps of objects creates resources that are invisible until runtime. The agent spends most of its context on navigation, not comprehension.
This article describes Terraform patterns optimized for agent authorship and maintenance — flatter, more explicit, with state decomposition that enables parallel operations.
## The Problem with Human-Oriented Abstraction
Consider a typical human-authored infrastructure setup:
```hcl
# Human-oriented: one line calls 40+ resources across 6 nested modules
module "platform" {
  source      = "./modules/platform"
  environment = var.environment
  config      = local.platform_config[var.environment]
}
```

What an agent sees: one module call with an opaque `config` object. To understand what resources exist, the agent must:
- Read `modules/platform/main.tf` — find 3 child module calls
- Read `modules/platform/modules/networking/main.tf` — find VPC, subnets, NAT, routes
- Read `modules/platform/modules/compute/main.tf` — find EKS, node groups
- Read `modules/platform/modules/database/main.tf` — find RDS, security groups
- Trace `var.config` through all 4 levels to understand what values each resource gets
- Trace outputs back up 3 levels to see what is exposed
That is 8-12 files and 3,000-10,000 tokens of context just to understand what already exists — before making any changes. The human who wrote this can hold the mental model. The agent reconstructs it from scratch every session.
## What Agents Struggle With
| Pattern | Why It Is Hard for Agents | Token Cost |
|---|---|---|
| Deep module nesting (3+ levels) | Each level requires reading additional files, tracing variables and outputs through layers | 2,000-5,000 per level |
| `for_each` over complex maps | Resources are invisible until the map is mentally unrolled; the agent cannot see resource addresses without evaluating the expression | 500-2,000 per `for_each` |
| Variable threading through modules | A value flows root → module A → module B → resource; the agent must chase the chain to find where a value originates or is consumed | 1,000-3,000 per chain |
| `dynamic` blocks with conditionals | Generated blocks are invisible in the code; the agent must evaluate conditions to know which blocks exist | 500-1,500 per dynamic block |
| `local` expressions with nested `merge`/`lookup` | Intermediate values hide the final computed result; the agent cannot see what a resource actually gets without evaluating the full chain | 500-1,000 per complex local |
## What Agents Handle Easily
| Pattern | Why It Works | Token Cost |
|---|---|---|
| Direct resource references (`aws_vpc.main.id`) | One hop — the agent reads the resource and knows everything | 50-100 per reference |
| Explicit resource blocks (no dynamic generation) | Every resource is visible in the code; `terraform state list` matches what the agent reads | 100-300 per resource |
| Flat file organization (all resources in one directory) | No file navigation needed; the full picture is in 2-4 files | 500-2,000 total |
| Simple `for_each` over a list of strings | Resources are predictable; `aws_subnet.private["us-east-1a"]` is obvious from the code | 100-200 per `for_each` |
| Outputs defined next to the resources they expose | No tracing required; the output is next to the source | 50-100 per output |
## Linear Terraform Patterns
Linear Terraform is explicit, flat, and readable top-to-bottom. Every resource is visible. Every dependency is a direct single-hop reference. The agent reads one directory of files and sees the complete infrastructure.
### Pattern 1: Flat Resource Definitions
Instead of nesting modules three levels deep, define resources directly:
```hcl
# networking.tf — all networking resources, explicit and visible
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true

  tags = {
    Name        = "production-vpc"
    Environment = "production"
    ManagedBy   = "terraform"
  }
}

resource "aws_subnet" "private_a" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.1.0/24"
  availability_zone = "us-east-1a"
  tags              = { Name = "production-private-us-east-1a" }
}

resource "aws_subnet" "private_b" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.2.0/24"
  availability_zone = "us-east-1b"
  tags              = { Name = "production-private-us-east-1b" }
}

resource "aws_subnet" "public_a" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.101.0/24"
  availability_zone       = "us-east-1a"
  map_public_ip_on_launch = true
  tags                    = { Name = "production-public-us-east-1a" }
}

resource "aws_subnet" "public_b" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.102.0/24"
  availability_zone       = "us-east-1b"
  map_public_ip_on_launch = true
  tags                    = { Name = "production-public-us-east-1b" }
}
```

A human looks at this and says "not DRY — you repeated the subnet block four times." An agent looks at this and sees: four subnets, each with its explicit CIDR and AZ, each directly referencing `aws_vpc.main.id`. No loops to unroll. No maps to dereference. The dependency graph is visible in the code.
The cost of repetition is low when agents write and maintain the code. Changing a CIDR across all subnets is a simple find-and-replace — trivial for an agent. Adding a subnet is copying a block and changing three values — also trivial.
### Pattern 2: Direct References Over Output Chains
```hcl
# Human pattern: outputs chained through modules
# To find what subnet the EKS cluster uses, trace:
#   module.platform.module.networking.aws_subnet.private → output subnet_ids →
#   module.platform output networking_subnet_ids → module.platform.module.compute input

# Agent pattern: direct reference, one hop
resource "aws_eks_cluster" "main" {
  name     = "production"
  role_arn = aws_iam_role.eks_cluster.arn

  vpc_config {
    subnet_ids = [
      aws_subnet.private_a.id,
      aws_subnet.private_b.id,
    ]
    security_group_ids = [aws_security_group.eks_cluster.id]
  }
}
```

Every reference is `resource_type.name.attribute` — one hop. The agent does not need to read module outputs, follow variable threading, or understand module composition. The EKS cluster uses these specific subnets and this specific security group.
### Pattern 3: Simple `for_each` When Repetition Is Truly Uniform
Linear does not mean never using loops. When resources are genuinely identical except for one key, `for_each` over a simple list is clear:
```hcl
# Good: simple for_each, resources are predictable
variable "availability_zones" {
  type    = list(string)
  default = ["us-east-1a", "us-east-1b"]
}

resource "aws_nat_gateway" "main" {
  for_each      = toset(var.availability_zones)
  allocation_id = aws_eip.nat[each.key].id
  subnet_id     = aws_subnet.public[each.key].id
  tags          = { Name = "nat-${each.key}" }
}
```

The resource addresses are obvious: `aws_nat_gateway.main["us-east-1a"]`, `aws_nat_gateway.main["us-east-1b"]`. The agent knows exactly what resources exist without evaluating complex expressions.
```hcl
# Bad: for_each over a complex map of objects
variable "services" {
  type = map(object({
    port     = number
    protocol = string
    health   = optional(object({ path = string, interval = number }))
    scaling  = optional(object({ min = number, max = number, target_cpu = number }))
  }))
}

resource "aws_ecs_service" "main" {
  for_each = var.services
  # ... 30 lines of each.value.this and each.value.that
  # with optional nested objects requiring try() and coalesce()
}
```

This is a loop generating invisible resources whose attributes depend on a complex nested object. The agent cannot know what services exist, what ports they use, or what their scaling config is without evaluating the variable. Write the services explicitly instead.
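The explicit version trades the map for visible blocks. A sketch with hypothetical service names and illustrative arguments; the cluster and task definition resources are assumed to exist elsewhere in the same directory:

```hcl
# Each service is a visible resource block with its own explicit configuration
resource "aws_ecs_service" "api" {
  name            = "api"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.api.arn
  desired_count   = 2
}

resource "aws_ecs_service" "worker" {
  name            = "worker"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.worker.arn
  desired_count   = 1
}
```

Now `terraform state list` shows `aws_ecs_service.api` and `aws_ecs_service.worker`, the same names the agent reads in the code.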
### Pattern 4: File Organization by Concern, Not Abstraction
```
# Agent-oriented: flat, by concern
infrastructure/
├── networking/
│   ├── providers.tf
│   ├── backend.tf           # key = "networking/terraform.tfstate"
│   ├── vpc.tf               # VPC, subnets, route tables, NAT, IGW
│   ├── security-groups.tf
│   ├── outputs.tf
│   └── variables.tf
├── database/
│   ├── providers.tf
│   ├── backend.tf           # key = "database/terraform.tfstate"
│   ├── rds.tf               # RDS instance, parameter group, subnet group
│   ├── data.tf              # remote_state for networking (read VPC/subnet IDs)
│   ├── outputs.tf
│   └── variables.tf
├── compute/
│   ├── providers.tf
│   ├── backend.tf           # key = "compute/terraform.tfstate"
│   ├── eks.tf               # EKS cluster, node groups, IRSA roles
│   ├── data.tf              # remote_state for networking and database
│   ├── outputs.tf
│   └── variables.tf
└── application/
    ├── providers.tf
    ├── backend.tf           # key = "application/terraform.tfstate"
    ├── helm.tf              # Helm releases
    ├── k8s.tf               # Kubernetes resources
    ├── data.tf              # remote_state for compute
    └── variables.tf
```

Each directory is an independent root module with its own state file. An agent working on the database does not need to read networking code — it reads the `data.tf` file, which declares exactly what values it imports from the networking state.
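Each `backend.tf` pins its own state key. A sketch of the networking one, reusing the `myorg-tfstate` bucket from the cross-state example in this article and assuming a DynamoDB table named `terraform-locks` for state locking:

```hcl
# networking/backend.tf — own state file, own lock
terraform {
  backend "s3" {
    bucket         = "myorg-tfstate"
    key            = "networking/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks" # assumed lock table name
    encrypt        = true
  }
}
```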
## State Decomposition for Parallelism
The single most impactful structural change for agent-managed infrastructure: separate state files per concern.
### The Monolith State Problem
```
infrastructure/
└── main.tf    # 80 resources, 1 state file, 1 lock
```

Problems:

- Single lock: Only one `plan` or `apply` can run at a time. Two agents cannot work in parallel. Two CI pipelines block each other.
- Large blast radius: Every `apply` could modify any of 80 resources. A mistake affects everything.
- Slow plans: Terraform refreshes all 80 resources on every plan, even if you only changed one.
- Conflict-prone: Any code change is potentially a merge conflict because everything is in one directory.
### Decomposed State
```
networking/   → 15 resources, own state, own lock
database/     → 10 resources, own state, own lock
compute/      → 20 resources, own state, own lock
application/  → 35 resources, own state, own lock
```

Benefits:

- Parallel operations: Four agents (or four CI jobs) can plan and apply simultaneously
- Bounded blast radius: An `apply` in `database/` cannot affect networking or compute
- Fast plans: Each plan refreshes only 10-20 resources instead of 80
- Independent changes: Code changes to `database/` do not conflict with changes to `compute/`
### Cross-State References
Decomposed state requires explicit data sharing between root modules. Use `terraform_remote_state` or, better, SSM Parameter Store read through data sources:
```hcl
# database/data.tf — read networking outputs
data "terraform_remote_state" "networking" {
  backend = "s3"
  config = {
    bucket = "myorg-tfstate"
    key    = "networking/terraform.tfstate"
    region = "us-east-1"
  }
}

# database/rds.tf — use networking values
resource "aws_db_subnet_group" "main" {
  name       = "production-db-subnets"
  subnet_ids = data.terraform_remote_state.networking.outputs.private_subnet_ids
}
```

The dependency is explicit: database depends on networking's outputs. The agent can see exactly what crosses the boundary.
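For this to work, the networking root module must expose the value as an output. A sketch of the producing side, which would live in `networking/outputs.tf` next to the subnet resources from Pattern 1:

```hcl
# networking/outputs.tf — the contract other root modules read
output "private_subnet_ids" {
  description = "IDs of the private subnets, consumed by database/ and compute/"
  value = [
    aws_subnet.private_a.id,
    aws_subnet.private_b.id,
  ]
}
```

The output name is the cross-state contract: renaming it breaks every consumer, so treat changes to it like an API change.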
### Apply Ordering
With decomposed state, apply order matters:
```
networking (no dependencies)
        ↓
database (depends on networking)    compute (depends on networking)
        ↓                                   ↓
application (depends on database + compute)
```

Networking first. Database and compute in parallel (both depend only on networking). Application last. An agent orchestrating this applies networking, then fans out to database and compute simultaneously, then applies application.
## When Modules Still Make Sense
Linear does not mean “never use modules.” Modules are valuable when they provide genuine reuse across multiple root modules or when a community module handles edge cases you should not re-derive:
### Use Modules For
- Community modules with significant edge case handling: `terraform-aws-modules/vpc/aws` handles 30+ edge cases (NAT gateway placement, IPv6, VPC endpoints) that you would otherwise miss.
- Cross-project reuse: If your organization deploys the same RDS pattern across 10 projects, a shared module prevents drift between them.
- Encapsulating vendor-specific complexity: A module that wraps the 15 resources needed for an EKS cluster is easier to consume than writing those 15 resources in every project.
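Consuming such a module still follows the linear rules: one level deep, explicit inputs visible at the call site. A sketch using the community VPC module, with inputs taken from its documented interface and illustrative values:

```hcl
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0" # pin a major version; adjust to what you have tested

  name = "production"
  cidr = "10.0.0.0/16"

  azs             = ["us-east-1a", "us-east-1b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24"]

  enable_nat_gateway = true
}
```

An agent reading this sees every input it would otherwise have to trace; nothing is threaded in from a distant locals block.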
### Do Not Use Modules For
- DRY within a single project: If you have 3 similar subnets, write 3 resource blocks. The "savings" of a module is 10 lines of HCL. The cost is an additional directory, variable threading, and output wiring.
- Abstracting away detail: A module called `platform` that creates everything is not an abstraction — it is a hiding place. The agent cannot reason about what it creates without reading inside it.
- One-level indirection "just in case": Wrapping every resource in a module for hypothetical future reuse adds overhead with no current benefit.
### Module Depth Rule
If you use modules, limit to one level of nesting. The root module calls child modules. Child modules do not call their own child modules.
```
# Good: one level
root/
├── main.tf        # calls module "vpc", module "eks", module "rds"
└── modules/
    ├── vpc/       # contains resources, no child modules
    ├── eks/       # contains resources, no child modules
    └── rds/       # contains resources, no child modules

# Bad: three levels
root/
├── main.tf        # calls module "platform"
└── modules/
    └── platform/  # calls module "networking", module "compute"
        ├── networking/   # calls module "vpc", module "subnets"
        │   ├── vpc/
        │   └── subnets/
        └── compute/
```

One level means the agent reads the root module and one directory per component. Three levels means the agent reads the root module, the platform module, the networking module, the vpc module, the subnets module — five directories to understand a VPC.
## The Human Role Shift
When agents write and maintain Terraform, the human role changes from writer to reviewer:
| Task | Human-Authored Era | Agent-Authored Era |
|---|---|---|
| Writing HCL | Human writes, reviews own work | Agent writes, human reviews plan output |
| Deciding structure | Human chooses module boundaries | Human approves decomposition, agent implements |
| Applying changes | Human runs `apply` or approves CI | Human reviews plan, approves apply (unchanged) |
| Debugging drift | Human investigates manually | Agent investigates, presents findings with options |
| Refactoring | Human plans and executes moves | Agent proposes moves, human approves, agent executes |
For the reviewer role, linear Terraform is strictly better. The plan output maps directly to visible resources in the code. No “module.platform will be updated” black boxes — every resource change is traceable to an explicit resource block.
## Migration Path: Existing Nested Code
You do not need to rewrite your Terraform overnight. Migrate incrementally:
1. New root modules: Write new infrastructure in the linear pattern. Do not extend the existing monolith.
2. State decomposition: Extract independent concerns into their own root modules using `terraform state mv` and `moved` blocks.
3. Module flattening: When modifying a deeply nested section, flatten it as part of the change. Replace the module call with explicit resources.
4. Gradual: Each pull request makes one section more linear. Over weeks, the codebase shifts.
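The module-flattening step can often avoid manual state surgery: a `moved` block (Terraform 1.1+) records the rename in code, so the next plan shows a move instead of a destroy-and-recreate. A sketch for flattening the platform module's VPC, assuming the nested addresses from the earlier example:

```hcl
# After replacing the module call with an explicit resource block,
# tell Terraform the existing state object has a new address
moved {
  from = module.platform.module.networking.aws_vpc.main
  to   = aws_vpc.main
}
```

Note that `moved` works within a single state file; moving resources between decomposed state files still requires `terraform state mv`.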
The goal is not purity — it is reducing the context cost for whoever (human or agent) works on the code next.