Integrating Infrastructure as Code with CI/CD#

Running Terraform locally works for one person. It breaks down when multiple people (or agents) modify infrastructure concurrently, when changes need review before applying, and when environments (dev/staging/prod) need synchronized promotion. CI/CD pipelines solve this by making the plan-review-apply cycle automated, auditable, and safe.

This article covers the patterns for integrating Terraform into CI/CD — from the basic plan-on-PR flow to multi-directory monorepos with dependency ordering and environment promotion.

The Core Pattern: Plan on PR, Apply on Merge#

Developer creates PR
        ↓
CI runs terraform plan → posts plan output as PR comment
        ↓
Reviewer reads plan, approves PR
        ↓
PR merges to main
        ↓
CI runs terraform apply with the exact plan reviewed

This is the foundation. Every other pattern builds on it.

Why This Pattern Is Non-Negotiable#

  • Plan visibility: The reviewer sees exactly what will change before it changes
  • Auditability: Every infrastructure change is tied to a PR with discussion, approval, and plan output
  • Safety: apply runs the saved plan, not a re-computed one that might differ
  • Concurrency control: The state lock prevents two applies from running simultaneously
  • Rollback trail: Every change is a git commit that can be reverted
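
As a rough shell sketch of the two halves of this flow, the apply side can take a plan file and refuse to run without one. The helper names plan_step and apply_step are illustrative and not part of Terraform; only the terraform flags are real.

```shell
# Sketch only: plan_step and apply_step are hypothetical helper names.

# Write the plan to a file so the same plan can be applied later.
plan_step() {
  plan_file=$1
  terraform plan -input=false -no-color -out="$plan_file"
}

# Apply a previously saved plan; never fall back to planning again.
apply_step() {
  plan_file=$1
  if [ ! -f "$plan_file" ]; then
    echo "no saved plan at $plan_file; refusing to apply" >&2
    return 1
  fi
  terraform apply -input=false -no-color "$plan_file"
}
```

Terraform rejects a saved plan as stale if the state has changed since the plan was created, which is what makes applying the saved file safer than planning again at apply time.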

The GitHub Actions Implementation#

name: Terraform
on:
  pull_request:
    paths: ["infrastructure/**"]
  push:
    branches: [main]
    paths: ["infrastructure/**"]

permissions:
  id-token: write
  contents: read
  pull-requests: write

env:
  TF_IN_AUTOMATION: true
  TF_INPUT: false

jobs:
  plan:
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.7.0

      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ vars.TERRAFORM_PLAN_ROLE_ARN }}
          aws-region: us-east-1

      - name: Init
        working-directory: infrastructure
        run: terraform init -backend-config=backend.hcl

      - name: Plan
        working-directory: infrastructure
        id: plan
        run: |
          set -o pipefail  # without this, tee's exit code hides a failed plan
          terraform plan -no-color -out=tfplan 2>&1 | tee plan.txt
        continue-on-error: true

      - name: Comment Plan on PR
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const plan = fs.readFileSync('infrastructure/plan.txt', 'utf8');
            const truncated = plan.length > 60000
              ? plan.substring(0, 60000) + '\n\n... truncated ...'
              : plan;
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              body: `### Terraform Plan\n\`\`\`\n${truncated}\n\`\`\``
            });

      - name: Fail on Plan Error
        if: steps.plan.outcome == 'failure'
        run: exit 1

  apply:
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: production  # requires manual approval in GitHub
    steps:
      - uses: actions/checkout@v4

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.7.0

      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ vars.TERRAFORM_APPLY_ROLE_ARN }}
          aws-region: us-east-1

      - name: Init
        working-directory: infrastructure
        run: terraform init -backend-config=backend.hcl

      - name: Plan
        working-directory: infrastructure
        run: terraform plan -no-color -out=tfplan

      - name: Apply
        working-directory: infrastructure
        run: terraform apply -no-color tfplan

Key details:

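  • TF_IN_AUTOMATION=true adjusts Terraform's output for non-interactive runs (it stops suggesting follow-up commands to run); it does not disable prompts
  • TF_INPUT=false disables interactive prompts, so a missing variable fails with an error instead of hanging the job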
  • Separate IAM roles for plan (read-only) and apply (write) — principle of least privilege
  • environment: production in the apply job enables GitHub’s environment protection rules (manual approval)
  • The apply job re-plans on main rather than reusing the plan file from the PR, so the applied plan can differ from the reviewed one if the state changed in between. To apply exactly what was reviewed, carry the tfplan file from the PR run to the apply run as a workflow artifact

Multi-Directory Monorepo#

When infrastructure is decomposed into separate root modules (networking, database, compute), CI/CD must detect which directories changed and run plan/apply only for those.

Directory Detection#

jobs:
  detect-changes:
    runs-on: ubuntu-latest
    outputs:
      directories: ${{ steps.detect.outputs.directories }}
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - id: detect
        run: |
          DIRS=$(git diff --name-only origin/main...HEAD \
            | grep '^infrastructure/' \
            | cut -d'/' -f1-2 \
            | sort -u \
            | jq -R -s -c 'split("\n") | map(select(length > 0))')
          echo "directories=$DIRS" >> "$GITHUB_OUTPUT"

  plan:
    needs: detect-changes
    if: needs.detect-changes.outputs.directories != '[]'
    strategy:
      matrix:
        directory: ${{ fromJson(needs.detect-changes.outputs.directories) }}
      fail-fast: false
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Plan
        working-directory: ${{ matrix.directory }}
        run: |
          terraform init
          terraform plan -no-color -out=tfplan

Dependency Ordering for Apply#

Plan can run in parallel for all changed directories. Apply must respect dependencies:

infrastructure/
├── networking/    # Layer 1: no dependencies
├── database/      # Layer 2: depends on networking
├── compute/       # Layer 2: depends on networking
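└── application/   # Layer 3: depends on database + compute

jobs:
  # detect-changes is the job defined in the previous example
  apply-layer-1:
    needs: detect-changes
    if: contains(needs.detect-changes.outputs.directories, 'infrastructure/networking')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Apply Networking
        working-directory: infrastructure/networking
        # Plan files do not carry over between jobs unless passed as
        # artifacts, so plan again here
        run: terraform init && terraform plan -out=tfplan && terraform apply tfplan

  apply-layer-2:
    needs: [detect-changes, apply-layer-1]
    # Run when layer 1 succeeded or was skipped, but not when it failed
    if: ${{ !failure() && !cancelled() }}
    strategy:
      matrix:
        directory: [infrastructure/database, infrastructure/compute]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Apply
        if: contains(needs.detect-changes.outputs.directories, matrix.directory)
        working-directory: ${{ matrix.directory }}
        run: terraform init && terraform plan -out=tfplan && terraform apply tfplan

  apply-layer-3:
    needs: [detect-changes, apply-layer-2]
    if: ${{ !failure() && !cancelled() && contains(needs.detect-changes.outputs.directories, 'infrastructure/application') }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Apply Application
        working-directory: infrastructure/application
        run: terraform init && terraform plan -out=tfplan && terraform apply tfplan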

Layer 1 (networking) applies first. Layer 2 (database, compute) applies in parallel after Layer 1. Layer 3 (application) applies after Layer 2.
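
The same ordering can be expressed as a small shell sketch, which is handy for checking the layer assignments outside of CI. The layer numbers come from the directory tree above; layer_of and apply_order are hypothetical helper names.

```shell
# Sketch only: layer_of and apply_order are hypothetical helper names.

# Map a root-module directory to its layer number.
layer_of() {
  case "$1" in
    infrastructure/networking) echo 1 ;;
    infrastructure/database|infrastructure/compute) echo 2 ;;
    infrastructure/application) echo 3 ;;
    *) return 1 ;;
  esac
}

# Read changed directories on stdin, print "<layer> <directory>"
# sorted so that lower layers come first. Fails on unknown directories.
apply_order() {
  tagged=$(
    while IFS= read -r dir; do
      [ -n "$dir" ] || continue
      layer=$(layer_of "$dir") || { echo "unknown directory: $dir" >&2; exit 1; }
      printf '%s %s\n' "$layer" "$dir"
    done
  ) || return 1
  [ -n "$tagged" ] || return 0
  printf '%s\n' "$tagged" | sort -k1,1n -k2,2
}
```

Directories that share a layer number have no ordering constraint between them, which is why the workflow can run them as a parallel matrix.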

Drift Detection#

Infrastructure drift — changes made outside of Terraform — should be detected proactively, not discovered during the next apply.

Scheduled Drift Detection#

name: Drift Detection
on:
  schedule:
    - cron: '0 6 * * 1-5'  # weekdays at 6 AM UTC

jobs:
  check-drift:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        directory:
          - infrastructure/networking
          - infrastructure/database
          - infrastructure/compute
          - infrastructure/application
    steps:
      - uses: actions/checkout@v4

      - name: Init
        working-directory: ${{ matrix.directory }}
        run: terraform init

      - name: Detect Drift
        id: drift
        working-directory: ${{ matrix.directory }}
        run: |
          # $? after a pipeline is tee's status; PIPESTATUS[0] is terraform's
          terraform plan -detailed-exitcode -no-color 2>&1 | tee drift.txt
          echo "exit_code=${PIPESTATUS[0]}" >> "$GITHUB_OUTPUT"
        continue-on-error: true

      - name: Alert on Drift
        if: steps.drift.outputs.exit_code == '2'
        run: |
          echo "::warning::Drift detected in ${{ matrix.directory }}"
          # Send Slack notification, create GitHub issue, etc.

terraform plan -detailed-exitcode returns:

  • 0: No changes (no drift)
  • 1: Error
  • 2: Changes detected (drift)
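
A minimal sketch of handling these three exit codes in a plain shell script, without a pipe in the way. The names classify_plan_exit and check_drift are hypothetical; the plan flags are standard Terraform options.

```shell
# Sketch only: classify_plan_exit and check_drift are hypothetical names.

# Translate a `terraform plan -detailed-exitcode` status into a label.
classify_plan_exit() {
  case "$1" in
    0) echo "clean" ;;
    2) echo "drift" ;;
    *) echo "error" ;;
  esac
}

# Run a drift check in one directory and print the label on stdout.
check_drift() {
  dir=$1
  code=0
  # The subshell keeps the cd local; `|| code=$?` records the status
  # without aborting under `set -e`. Plan output goes to stderr so
  # stdout carries only the classification.
  (cd "$dir" && terraform plan -detailed-exitcode -no-color -input=false >&2) || code=$?
  classify_plan_exit "$code"
}
```

Treating every status other than 0 and 2 as an error keeps a failed plan (for example, expired credentials) from being reported as "no drift".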

What to Do When Drift Is Detected#

  1. Investigate: What changed? Check cloud audit logs (CloudTrail, Azure Activity Log, GCP Audit Log)
  2. Classify: Was the change intentional (manual hotfix, auto-scaling) or accidental (console click)?
  3. Decide:
    • If intentional: update Terraform code to match reality (terraform apply -refresh-only then adjust code)
    • If accidental: apply Terraform to revert to the desired state
    • If auto-managed: add ignore_changes for that attribute

Environment Promotion#

Moving infrastructure changes safely from dev → staging → production.

The Promotion Pattern#

                 dev/                    staging/               prod/
                  │                        │                      │
PR with change ──→│                        │                      │
                  ├── plan + apply ──→     │                      │
                  │                  OK?   │                      │
                  │                   │    ├── plan + apply ──→   │
                  │                   │    │                OK?   │
                  │                   │    │                 │    ├── plan + apply
                  │                   │    │                 │    │

Implementation: Staged Applies#

jobs:
  apply-dev:
    runs-on: ubuntu-latest
    environment: dev
    steps:
      - uses: actions/checkout@v4
      - working-directory: infrastructure/envs/dev
        run: terraform init && terraform plan -out=tfplan && terraform apply tfplan

  apply-staging:
    needs: apply-dev
    runs-on: ubuntu-latest
    environment: staging  # may require manual approval
    steps:
      - uses: actions/checkout@v4
      - working-directory: infrastructure/envs/staging
        run: terraform init && terraform plan -out=tfplan && terraform apply tfplan

  apply-prod:
    needs: apply-staging
    runs-on: ubuntu-latest
    environment: production  # always requires manual approval
    steps:
      - uses: actions/checkout@v4
      - working-directory: infrastructure/envs/prod
        run: terraform init && terraform plan -out=tfplan && terraform apply tfplan

Key: Each environment re-plans (not re-using the dev plan file). The code is the same, but the state and variables differ. The plan for production might show different changes than dev if the environments have diverged.
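
The stop-on-first-failure behavior of the promotion chain can be sketched in shell as follows. This assumes the infrastructure/envs/<name> layout used above; apply_env and promote are hypothetical names, and the manual approval gates that the workflow gets from GitHub environments are not modeled here.

```shell
# Sketch only: apply_env and promote are hypothetical helper names.

# Plan and apply one environment from its own directory and state.
apply_env() {
  env_dir="infrastructure/envs/$1"
  (
    cd "$env_dir" &&
    terraform init -input=false &&
    terraform plan -input=false -out=tfplan &&
    terraform apply -input=false tfplan
  )
}

# Apply environments in the order given; stop at the first failure
# so a broken change never reaches the environments after it.
promote() {
  for env in "$@"; do
    echo "applying $env"
    if ! apply_env "$env"; then
      echo "stopped at $env"
      return 1
    fi
  done
  echo "promoted: $*"
}

# Example: promote dev staging prod
```

Each call to apply_env plans inside its own directory, so every environment gets a fresh plan against its own state rather than a reused plan file.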

Emergency Rollback#

When an apply causes problems, you need to revert quickly.

Git Revert Pattern (Safest)#

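# 1. Revert the merge commit (-m 1 keeps the main-branch side as the baseline;
#    for a squash or rebase merge, use plain "git revert HEAD" instead)
git revert -m 1 HEAD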

# 2. Push the revert (triggers CI)
git push

# 3. CI runs plan (showing the revert changes) and apply

This is the safest rollback because it goes through the full plan-review-apply cycle. The plan shows exactly what will be reverted.
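
The -m 1 flag is only valid when the commit being reverted is a merge commit; git rejects it for squash-merged or rebased commits. A small sketch that picks the right form by counting the commit's parents (parent_count and revert_commit are hypothetical names):

```shell
# Sketch only: parent_count and revert_commit are hypothetical names.

# Print how many parents a commit has (2 or more means a merge commit).
parent_count() {
  # rev-list prints "<commit> <parent1> <parent2> ..." on one line
  set -- $(git rev-list --parents -n 1 "$1")
  echo $(( $# - 1 ))
}

# Revert a commit, adding -m 1 only when it is a merge commit.
revert_commit() {
  commit=${1:-HEAD}
  if [ "$(parent_count "$commit")" -gt 1 ]; then
    git revert --no-edit -m 1 "$commit"
  else
    git revert --no-edit "$commit"
  fi
}
```

The function only creates the revert commit locally; pushing it and letting CI plan and apply stay the same as in the steps above.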

Manual Targeted Revert (Faster, Riskier)#

# 1. Check out the previous state of one directory
git checkout HEAD~1 -- infrastructure/compute/

# 2. Plan and apply locally (bypasses CI)
cd infrastructure/compute
terraform plan -out=tfplan
terraform apply tfplan

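# 3. Commit the revert (the git checkout in step 1 already staged the files;
#    avoid "git add ." here, which would also stage the tfplan file)
git commit -m "Revert compute changes"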

This is faster but bypasses CI review. Use only in genuine emergencies.

What Cannot Be Rolled Back#

Some changes are irreversible even with a git revert:

  • Database deletions (data is gone unless there is a backup)
  • Encryption key rotations (old key is disabled)
  • DNS propagation (reverting the record does not clear answers already cached by resolvers; they expire only when their TTL runs out)
  • S3 bucket name changes (old name is released, may be claimed by someone else)

For these, the “rollback” is a forward fix: create a new resource, restore from backup, or wait for propagation.

Platform Comparison#

| Feature | GitHub Actions | Atlantis | Spacelift | Terraform Cloud |
|---|---|---|---|---|
| Hosting | GitHub-hosted or self-hosted | Self-hosted | SaaS | SaaS |
| Plan on PR | Via workflow | Native (atlantis plan) | Native | Native |
| Apply on merge | Via workflow | Via PR comment (atlantis apply) | Native | Native |
| State management | You manage (S3/Azure Blob/GCS) | You manage | Built-in | Built-in |
| Drift detection | Custom scheduled job | Not built-in | Native | Native |
| Cost estimation | Via Infracost integration | Via Infracost integration | Native | Via integration |
| Policy as code | Via OPA/Conftest steps | Via OPA/Conftest | Native (OPA) | Sentinel |
| Multi-directory | Matrix strategy | Native (per-directory) | Native (stacks) | Workspaces |
| Dependency ordering | Manual job dependencies | Custom workflows | Stack dependencies | Run triggers |
| Price | Free for public repos, usage-based for private | Free (self-host cost) | From $0/mo (community) | From $0/mo (free tier) |

For small teams: GitHub Actions with a self-managed S3 backend. Simple, free, sufficient.

For medium teams: Atlantis (if you want self-hosted control) or Spacelift (if you want managed).

For large teams: Spacelift or Terraform Cloud with full policy enforcement, drift detection, and stack dependencies.

The Complete Pipeline Checklist#

A production-ready IaC pipeline includes:

  • Format check (terraform fmt -check) — every commit
  • Validate (terraform validate) — every commit
  • Lint (tflint) — every PR
  • Security scan (checkov) — every PR
  • Plan with output posted to PR — every PR
  • Cost estimate (infracost) — every PR
  • Policy check (conftest) — every PR
  • Manual approval gate — before production apply
  • Apply from saved plan — on merge to main
  • Drift detection — scheduled (daily or weekly)
  • State backup — automated
  • Rollback procedure — documented and tested
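
The first few items on this list can be chained in one script. The sketch below covers only the format, validate, lint, and security checks, and assumes terraform, tflint, and checkov are on the PATH; run_check and static_checks are hypothetical names. It runs every check even after a failure so one PR run reports all problems at once.

```shell
# Sketch only: run_check and static_checks are hypothetical helper names.

# Run one named check and report PASS or FAIL.
run_check() {
  name=$1
  shift
  if "$@"; then
    echo "PASS $name"
  else
    echo "FAIL $name"
    return 1
  fi
}

# Run all static checks in the current root module.
# Returns non-zero if any check failed.
static_checks() {
  status=0
  run_check fmt      terraform fmt -check -recursive             || status=1
  # validate needs providers installed, but not a real backend
  run_check init     terraform init -backend=false -input=false  || status=1
  run_check validate terraform validate                          || status=1
  run_check lint     tflint                                      || status=1
  run_check security checkov -d .                                || status=1
  return "$status"
}
```

The cost estimate and policy steps are left out because they depend on plan output and on project-specific configuration.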