Terraform Core Concepts and Workflow

Providers, Resources, and Data Sources

Terraform has three core object types. Providers are plugins that talk to APIs (AWS, Azure, GCP, Kubernetes, GitHub). Resources are the things you create and manage. Data sources read existing objects without managing them.

# providers.tf
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
  required_version = ">= 1.5.0"
}

provider "aws" {
  region = var.region
}

# A resource Terraform creates and manages
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
  tags       = { Name = "main-vpc" }
}

# A data source that reads an existing AMI
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"]
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
}

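# A subnet inside the VPC; instances attach to a subnet, not directly to a VPC
resource "aws_subnet" "main" {
  vpc_id     = aws_vpc.main.id
  cidr_block = "10.0.1.0/24"
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = var.instance_type
  subnet_id     = aws_subnet.main.id # must be a subnet ID, not the VPC ID
}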

Resources create, update, and delete. Data sources only read. If you need information about something Terraform does not manage, use a data source.
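
For instance, a VPC created outside this configuration could be looked up by tag and then referenced like any managed resource. This is a minimal sketch: the tag value shared-vpc, the names shared and app, and the CIDR are assumptions for illustration, not part of the configuration above.

```hcl
# Look up a VPC that exists outside this configuration.
# Assumption: exactly one VPC carries the tag Name = "shared-vpc".
data "aws_vpc" "shared" {
  tags = {
    Name = "shared-vpc"
  }
}

# Terraform manages this subnet, but only reads the VPC it lives in.
resource "aws_subnet" "app" {
  vpc_id     = data.aws_vpc.shared.id
  cidr_block = "10.1.1.0/24" # hypothetical; must fall inside the shared VPC's range
}
```

Destroying this configuration would remove the subnet and leave the VPC untouched, because the VPC is only read, never managed.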

File Organization

Terraform loads all .tf files in a directory as a single configuration. Split by purpose, not by resource:

project/
  providers.tf      # provider blocks, required_providers
  versions.tf       # terraform version constraints (can merge with providers.tf)
  variables.tf      # all input variable declarations
  outputs.tf        # all output declarations
  main.tf           # resources and data sources
  terraform.tfvars  # variable values (do not commit if it contains secrets)

For larger projects, split main.tf by logical grouping: networking.tf, compute.tf, database.tf. Terraform does not care about filenames; this is purely for human readability.

Variables: Input, Output, Local

Input variables parameterize your configuration:

# variables.tf
variable "region" {
  type        = string
  default     = "us-east-1"
  description = "AWS region for all resources"
}

variable "instance_type" {
  type    = string
  default = "t3.micro"
}

variable "allowed_cidrs" {
  type    = list(string)
  default = ["10.0.0.0/8"]
}

variable "tags" {
  type = map(string)
  default = {
    Environment = "dev"
    ManagedBy   = "terraform"
  }
}

variable "db_config" {
  type = object({
    engine            = string
    instance_class    = string
    allocated_storage = number
    multi_az          = bool
  })
}

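Set values via terraform.tfvars or *.auto.tfvars files (both loaded automatically), -var and -var-file flags, or TF_VAR_-prefixed environment variables. Later sources override earlier ones: environment variables < terraform.tfvars < *.auto.tfvars (in lexical filename order) < -var and -var-file flags (in the order given on the command line).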
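
As an illustration, a terraform.tfvars file for the variables declared above might look like the following. All values are made up for the example; db_config has no default, so it has to be supplied somewhere.

```hcl
# terraform.tfvars (illustrative values only)
region        = "us-west-2"
instance_type = "t3.small"

allowed_cidrs = ["10.0.0.0/8", "192.168.0.0/16"]

# Required: the variable declaration gives no default for db_config.
db_config = {
  engine            = "postgres"
  instance_class    = "db.t3.micro"
  allocated_storage = 20
  multi_az          = false
}

# Higher-precedence sources still win over this file, for example:
#   terraform plan -var="instance_type=t3.large"
# Lower-precedence sources lose to it, for example:
#   export TF_VAR_region=eu-west-1
```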

Output variables expose values after apply:

output "vpc_id" {
  value       = aws_vpc.main.id
  description = "ID of the main VPC"
}

Locals compute intermediate values:

locals {
  name_prefix = "${var.project}-${var.environment}"
  common_tags = merge(var.tags, {
    Project = var.project
  })
}
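
The locals above reference var.project and var.environment, which the variables.tf listing does not declare. A sketch of what those declarations and one use of the locals could look like, assuming both are plain strings (the security group name app is hypothetical):

```hcl
# Assumed declarations; the original listing does not show them.
variable "project" {
  type        = string
  description = "Short project name used in resource names and tags"
}

variable "environment" {
  type    = string
  default = "dev"
}

# Locals are referenced with the local. prefix (singular).
resource "aws_security_group" "app" {
  name   = "${local.name_prefix}-app"
  vpc_id = aws_vpc.main.id
  tags   = local.common_tags
}
```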

The Workflow: init, plan, apply, destroy

# Download providers and initialize backend
terraform init

# Preview changes without applying
terraform plan -out=tfplan

# Apply the saved plan (no re-prompting)
terraform apply tfplan

# Tear everything down
terraform destroy

Always save the plan with -out and apply that exact file. Running terraform apply without a saved plan computes a fresh plan, which may differ from the one you reviewed if the configuration or the real infrastructure changed in between.

count vs for_each

count creates resources by index. It works, but it has a problem: if you remove an item from the middle of a list, the indices shift, so every resource after it is updated or recreated to match its new index, and the last one is destroyed.

resource "aws_subnet" "public" {
  count             = length(var.public_subnet_cidrs)
  vpc_id            = aws_vpc.main.id
  cidr_block        = var.public_subnet_cidrs[count.index]
  availability_zone = var.azs[count.index]
}
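
To make the index shift concrete, here is a sketch with three hypothetical CIDRs. The comments show which resource address maps to which CIDR before and after the middle entry is removed.

```hcl
# Hypothetical values, before the change:
public_subnet_cidrs = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
#   aws_subnet.public[0] => 10.0.1.0/24
#   aws_subnet.public[1] => 10.0.2.0/24
#   aws_subnet.public[2] => 10.0.3.0/24

# After deleting the middle entry:
# public_subnet_cidrs = ["10.0.1.0/24", "10.0.3.0/24"]
#   aws_subnet.public[0] => 10.0.1.0/24  (unchanged)
#   aws_subnet.public[1] => 10.0.3.0/24  (CIDR changed, so the subnet is replaced)
#   aws_subnet.public[2] => no longer in the list, so it is destroyed
```

Only one subnet was meant to go away, yet two resources are touched, because Terraform tracks each instance by its position in the list.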

for_each uses map keys or set values as identifiers. Adding or removing an item affects only the resource for that key:

variable "subnets" {
  type = map(object({
    cidr = string
    az   = string
  }))
}

resource "aws_subnet" "public" {
  for_each          = var.subnets
  vpc_id            = aws_vpc.main.id
  cidr_block        = each.value.cidr
  availability_zone = each.value.az
  tags              = { Name = each.key }
}
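
A sketch of values for var.subnets and of how the resulting instances are addressed. The keys public-a and public-b are illustrative, and the output name public_subnet_ids is an assumption.

```hcl
# terraform.tfvars
subnets = {
  "public-a" = { cidr = "10.0.1.0/24", az = "us-east-1a" }
  "public-b" = { cidr = "10.0.2.0/24", az = "us-east-1b" }
}

# outputs.tf
# Each instance is addressed by its key, e.g. aws_subnet.public["public-a"].
output "public_subnet_ids" {
  value = { for name, subnet in aws_subnet.public : name => subnet.id }
}
```

Removing the "public-a" entry would destroy only aws_subnet.public["public-a"]; the address of "public-b" does not change.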

Use for_each by default. Use count only for simple “create N identical things” cases or conditional creation (count = var.enabled ? 1 : 0).
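
A minimal sketch of the conditional-creation pattern. The variable create_eip and the resource name nat are hypothetical; the one() function turns the zero-or-one-element list into a single value or null.

```hcl
variable "create_eip" {
  type    = bool
  default = false
}

# Created only when create_eip is true; otherwise zero instances exist.
resource "aws_eip" "nat" {
  count  = var.create_eip ? 1 : 0
  domain = "vpc"
}

# With count, the resource is a list, so refer to it via [0] or a splat.
# one() returns the single element, or null when the list is empty.
output "eip_public_ip" {
  value = one(aws_eip.nat[*].public_ip)
}
```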

Dynamic Blocks

When a resource has a repeatable nested block, use dynamic instead of duplicating:

resource "aws_security_group" "web" {
  name   = "web-sg"
  vpc_id = aws_vpc.main.id

  dynamic "ingress" {
    for_each = var.ingress_rules
    content {
      from_port   = ingress.value.from_port
      to_port     = ingress.value.to_port
      protocol    = ingress.value.protocol
      cidr_blocks = ingress.value.cidr_blocks
    }
  }
}
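
The block above reads from var.ingress_rules, which is not declared in the listing. One plausible declaration, with illustrative default rules, is sketched here; the attribute names match what the content block accesses.

```hcl
# Assumed declaration; the shape mirrors the attributes used in the dynamic block.
variable "ingress_rules" {
  type = list(object({
    from_port   = number
    to_port     = number
    protocol    = string
    cidr_blocks = list(string)
  }))

  # Illustrative defaults: HTTP and HTTPS from anywhere.
  default = [
    { from_port = 80, to_port = 80, protocol = "tcp", cidr_blocks = ["0.0.0.0/0"] },
    { from_port = 443, to_port = 443, protocol = "tcp", cidr_blocks = ["0.0.0.0/0"] },
  ]
}
```

Inside the content block, the iterator takes the name of the dynamic block (ingress here), so ingress.value is the current list element.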

Dynamic blocks reduce repetition but hurt readability. If you have more than two levels of nesting, consider restructuring your variables or breaking the resource into a module.

terraform console

Test expressions interactively before putting them in config:

$ terraform console
> var.subnets
{
  "public-a" = { az = "us-east-1a", cidr = "10.0.1.0/24" }
  "public-b" = { az = "us-east-1b", cidr = "10.0.2.0/24" }
}
> keys(var.subnets)
["public-a", "public-b"]
> [for k, v in var.subnets : "${k}: ${v.cidr}"]
["public-a: 10.0.1.0/24", "public-b: 10.0.2.0/24"]
> cidrsubnet("10.0.0.0/16", 8, 1)
"10.0.1.0/24"

This is invaluable for debugging complex expressions, for loops, and functions like cidrsubnet, merge, lookup, and try.
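
A few expressions worth trying there, written as locals so they can also be pasted into a configuration. The names and values are made up for illustration; the comments give the results.

```hcl
locals {
  # Hypothetical settings object with no "port" attribute.
  settings = { host = "db.internal" }

  # merge: on key collisions, later maps win.
  merged = merge({ a = 1 }, { a = 2, b = 3 }) # { a = 2, b = 3 }

  # lookup: returns the default when the key is absent.
  size = lookup({ small = "t3.micro" }, "large", "t3.small") # "t3.small"

  # try: returns the first argument that evaluates without an error.
  port = try(local.settings.port, 5432) # 5432
}
```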