Terraform Cloud Architecture Patterns#

The three-tier architecture — networking, managed Kubernetes, managed database — is the most common pattern for production deployments on any major cloud. The concepts are identical across AWS, Azure, and GCP. The Terraform code is not. Resource names differ, required arguments differ, default behaviors differ, and the gotchas that catch agents and humans are cloud-specific.

This article shows the real Terraform for each layer on each cloud, side by side, so agents can write correct infrastructure code for whichever cloud the user deploys to.

Concept Mapping#

| Concept | AWS | Azure | GCP |
|---|---|---|---|
| Virtual network | VPC | Virtual Network (VNET) | VPC Network |
| Private subnet | Subnet + route table | Subnet (within VNET) | Subnetwork |
| Internet gateway | Internet Gateway | (implicit in VNET) | (implicit in VPC) |
| NAT for private subnets | NAT Gateway + EIP | NAT Gateway | Cloud NAT + Router |
| Managed Kubernetes | EKS | AKS | GKE |
| Node pools | Managed Node Groups | Agent Pools (default + user) | Node Pools |
| Managed database | RDS / Aurora | Azure Database for PostgreSQL Flexible Server | Cloud SQL |
| IAM for workloads | IRSA (IAM Roles for Service Accounts) | Azure AD Workload Identity | GKE Workload Identity |
| DNS | Route 53 | Azure DNS | Cloud DNS |
| Object storage | S3 | Blob Storage | Cloud Storage |
| Load balancer | ALB / NLB | Azure Load Balancer | GCP Load Balancer |

Layer 1: Networking#

AWS VPC#

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true
  tags                 = { Name = "production-vpc" }
}

resource "aws_subnet" "private_a" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.1.0/24"
  availability_zone = "us-east-1a"
  tags              = { Name = "private-us-east-1a", "kubernetes.io/role/internal-elb" = "1" }
}

resource "aws_subnet" "private_b" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.2.0/24"
  availability_zone = "us-east-1b"
  tags              = { Name = "private-us-east-1b", "kubernetes.io/role/internal-elb" = "1" }
}

resource "aws_subnet" "public_a" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.101.0/24"
  availability_zone       = "us-east-1a"
  map_public_ip_on_launch = true
  tags                    = { Name = "public-us-east-1a", "kubernetes.io/role/elb" = "1" }
}

resource "aws_subnet" "public_b" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.102.0/24"
  availability_zone       = "us-east-1b"
  map_public_ip_on_launch = true
  tags                    = { Name = "public-us-east-1b", "kubernetes.io/role/elb" = "1" }
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
}

resource "aws_eip" "nat" {
  domain = "vpc"
}

resource "aws_nat_gateway" "main" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public_a.id
}

resource "aws_route_table" "private" {
  vpc_id = aws_vpc.main.id
  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main.id
  }
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }
}

resource "aws_route_table_association" "private_a" {
  subnet_id      = aws_subnet.private_a.id
  route_table_id = aws_route_table.private.id
}

resource "aws_route_table_association" "private_b" {
  subnet_id      = aws_subnet.private_b.id
  route_table_id = aws_route_table.private.id
}

resource "aws_route_table_association" "public_a" {
  subnet_id      = aws_subnet.public_a.id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "public_b" {
  subnet_id      = aws_subnet.public_b.id
  route_table_id = aws_route_table.public.id
}

AWS gotchas:

  • Subnets need kubernetes.io/role/elb and kubernetes.io/role/internal-elb tags for EKS load balancer discovery
  • NAT Gateway costs ~$32/mo per AZ even with zero traffic — use one for dev, per-AZ for prod
  • VPC has a default security group that allows all outbound — tighten it (sketch below)
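
The last item is easy to automate: declaring the default security group with no rules makes Terraform adopt it and strip every ingress and egress rule. A minimal sketch:

resource "aws_default_security_group" "default" {
  vpc_id = aws_vpc.main.id
  # no ingress/egress blocks — Terraform removes all rules from the VPC's default SG
}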

Azure VNET#

resource "azurerm_resource_group" "main" {
  name     = "production-rg"
  location = "eastus"
}

resource "azurerm_virtual_network" "main" {
  name                = "production-vnet"
  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location
  address_space       = ["10.0.0.0/16"]
}

resource "azurerm_subnet" "aks" {
  name                 = "aks-subnet"
  resource_group_name  = azurerm_resource_group.main.name
  virtual_network_name = azurerm_virtual_network.main.name
  address_prefixes     = ["10.0.1.0/24"]
}

resource "azurerm_subnet" "database" {
  name                 = "database-subnet"
  resource_group_name  = azurerm_resource_group.main.name
  virtual_network_name = azurerm_virtual_network.main.name
  address_prefixes     = ["10.0.2.0/24"]

  delegation {
    name = "postgresql-delegation"
    service_delegation {
      name    = "Microsoft.DBforPostgreSQL/flexibleServers"
      actions = ["Microsoft.Network/virtualNetworks/subnets/join/action"]
    }
  }
}

resource "azurerm_nat_gateway" "main" {
  name                = "production-natgw"
  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location
  sku_name            = "Standard"
}

resource "azurerm_public_ip" "nat" {
  name                = "nat-public-ip"
  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location
  allocation_method   = "Static"
  sku                 = "Standard"
}

resource "azurerm_nat_gateway_public_ip_association" "main" {
  nat_gateway_id       = azurerm_nat_gateway.main.id
  public_ip_address_id = azurerm_public_ip.nat.id
}

resource "azurerm_subnet_nat_gateway_association" "aks" {
  subnet_id      = azurerm_subnet.aks.id
  nat_gateway_id = azurerm_nat_gateway.main.id
}

Azure gotchas:

  • Everything requires a Resource Group — create it first
  • Subnets for Azure Database for PostgreSQL Flexible Server need a delegation block
  • AKS with Azure CNI requires a subnet large enough for pods (each pod gets a VNET IP) — a /24 yields only 251 usable IPs, shared by nodes and pods (sizing sketch below)
  • No explicit Internet Gateway — VNETs have implicit internet access via Azure backbone
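
A quick pre-apply sanity check for Azure CNI sizing — a hedged sketch where the node and pod counts are illustrative (AKS defaults to 30 pods per node with Azure CNI):

locals {
  max_nodes         = 6
  max_pods_per_node = 30
  # each node consumes one IP itself, plus one per pod
  required_ips = local.max_nodes * (local.max_pods_per_node + 1)  # 186 — fits a /24 (251 usable)
}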

GCP VPC#

resource "google_compute_network" "main" {
  name                    = "production-vpc"
  auto_create_subnetworks = false  # custom mode — we define our own subnets
}

resource "google_compute_subnetwork" "gke" {
  name          = "gke-subnet"
  region        = "us-central1"
  network       = google_compute_network.main.id
  ip_cidr_range = "10.0.1.0/24"

  secondary_ip_range {
    range_name    = "pods"
    ip_cidr_range = "10.1.0.0/16"  # GKE pods get secondary range
  }

  secondary_ip_range {
    range_name    = "services"
    ip_cidr_range = "10.2.0.0/20"  # GKE services get secondary range
  }
}

resource "google_compute_subnetwork" "database" {
  name          = "database-subnet"
  region        = "us-central1"
  network       = google_compute_network.main.id
  ip_cidr_range = "10.0.2.0/24"
}

resource "google_compute_router" "main" {
  name    = "production-router"
  region  = "us-central1"
  network = google_compute_network.main.id
}

resource "google_compute_router_nat" "main" {
  name                               = "production-nat"
  router                             = google_compute_router.main.name
  region                             = "us-central1"
  nat_ip_allocate_option             = "AUTO_ONLY"
  source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
}

GCP gotchas:

  • auto_create_subnetworks = false is required for custom networking — the default creates a subnet per region which you probably do not want
  • GKE requires secondary IP ranges on the subnet for pods and services (VPC-native mode)
  • Cloud NAT requires a Cloud Router — two resources instead of one — and logs nothing by default (logging sketch below)
  • No Internet Gateway resource — internet access is implicit for instances with external IPs or via Cloud NAT
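
Cloud NAT drops (port exhaustion, failed allocations) are invisible unless logging is enabled. A small addition inside the google_compute_router_nat resource above — log_config is part of that resource's schema:

log_config {
  enable = true
  filter = "ERRORS_ONLY"  # log only dropped/errored connections; "ALL" and "TRANSLATIONS_ONLY" also exist
}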

Layer 2: Managed Kubernetes#

AWS EKS#

resource "aws_iam_role" "eks_cluster" {
  name = "production-eks-cluster-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "eks.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "eks_cluster" {
  role       = aws_iam_role.eks_cluster.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
}

resource "aws_eks_cluster" "main" {
  name     = "production"
  role_arn = aws_iam_role.eks_cluster.arn
  version  = "1.29"

  vpc_config {
    subnet_ids              = [aws_subnet.private_a.id, aws_subnet.private_b.id]
    endpoint_private_access = true
    endpoint_public_access  = true
  }

  depends_on = [aws_iam_role_policy_attachment.eks_cluster]
}

resource "aws_iam_role" "eks_nodes" {
  name = "production-eks-node-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "eks_worker" {
  role       = aws_iam_role.eks_nodes.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
}

resource "aws_iam_role_policy_attachment" "eks_cni" {
  role       = aws_iam_role.eks_nodes.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
}

resource "aws_iam_role_policy_attachment" "ecr_read" {
  role       = aws_iam_role.eks_nodes.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
}

resource "aws_eks_node_group" "main" {
  cluster_name    = aws_eks_cluster.main.name
  node_group_name = "production-nodes"
  node_role_arn   = aws_iam_role.eks_nodes.arn
  subnet_ids      = [aws_subnet.private_a.id, aws_subnet.private_b.id]
  instance_types  = ["t3.large"]

  scaling_config {
    desired_size = 3
    max_size     = 6
    min_size     = 2
  }

  depends_on = [
    aws_iam_role_policy_attachment.eks_worker,
    aws_iam_role_policy_attachment.eks_cni,
    aws_iam_role_policy_attachment.ecr_read,
  ]
}

EKS gotchas:

  • EKS requires explicit IAM roles for both the cluster and the node group — 2 roles, 4 policy attachments minimum
  • depends_on on the IAM policy attachments is required, or cluster and node group creation can race ahead of the attachments and fail
  • EKS cluster creation takes 10-15 minutes
  • Cluster endpoint defaults to public — set endpoint_public_access = false for private clusters
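
The concept table maps AWS workload IAM to IRSA, which needs one more piece of plumbing: the cluster's OIDC issuer registered as an IAM identity provider. A minimal sketch — the role name and the default/app-sa namespace/service-account pairing are placeholders, not part of the stack above:

data "tls_certificate" "eks" {
  url = aws_eks_cluster.main.identity[0].oidc[0].issuer
}

resource "aws_iam_openid_connect_provider" "eks" {
  url             = aws_eks_cluster.main.identity[0].oidc[0].issuer
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = [data.tls_certificate.eks.certificates[0].sha1_fingerprint]
}

resource "aws_iam_role" "app" {
  name = "app-irsa-role"  # placeholder name
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRoleWithWebIdentity"
      Effect    = "Allow"
      Principal = { Federated = aws_iam_openid_connect_provider.eks.arn }
      Condition = {
        StringEquals = {
          # trust only the app-sa ServiceAccount in the default namespace (placeholders)
          "${replace(aws_eks_cluster.main.identity[0].oidc[0].issuer, "https://", "")}:sub" = "system:serviceaccount:default:app-sa"
        }
      }
    }]
  })
}

Pods then pick up the role via an eks.amazonaws.com/role-arn annotation on their ServiceAccount — no node-level credentials involved.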

Azure AKS#

resource "azurerm_kubernetes_cluster" "main" {
  name                = "production-aks"
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name
  dns_prefix          = "production"
  kubernetes_version  = "1.29"

  default_node_pool {
    name                = "default"
    node_count          = 3
    vm_size             = "Standard_D2s_v5"
    vnet_subnet_id      = azurerm_subnet.aks.id
    min_count           = 2
    max_count           = 6
    enable_auto_scaling = true
  }

  identity {
    type = "SystemAssigned"
  }

  network_profile {
    network_plugin    = "azure"    # Azure CNI — pods get VNET IPs
    service_cidr      = "10.3.0.0/16"
    dns_service_ip    = "10.3.0.10"
  }

  oidc_issuer_enabled       = true   # for Workload Identity
  workload_identity_enabled = true
}

AKS gotchas:

  • AKS is significantly simpler than EKS — no separate IAM roles to create; identity { type = "SystemAssigned" } handles it
  • Azure CNI gives pods real VNET IPs, which means the subnet must be large enough — plan for (max_pods_per_node × max_nodes) IPs
  • service_cidr must not overlap with the VNET address space
  • AKS creates a second resource group for node resources (MC_*) — do not delete it
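
With oidc_issuer_enabled and workload_identity_enabled set above, workload identity is a user-assigned identity plus a federated credential against the cluster's OIDC issuer. A minimal sketch — the identity name and the default/app-sa subject are placeholders:

resource "azurerm_user_assigned_identity" "app" {
  name                = "app-identity"  # placeholder
  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location
}

resource "azurerm_federated_identity_credential" "app" {
  name                = "app-federated-credential"  # placeholder
  resource_group_name = azurerm_resource_group.main.name
  parent_id           = azurerm_user_assigned_identity.app.id
  audience            = ["api://AzureADTokenExchange"]
  issuer              = azurerm_kubernetes_cluster.main.oidc_issuer_url
  subject             = "system:serviceaccount:default:app-sa"  # namespace:serviceaccount
}

The Kubernetes ServiceAccount then carries an azure.workload.identity/client-id annotation with the identity's client_id.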

GCP GKE#

resource "google_service_account" "gke_nodes" {
  account_id   = "gke-node-sa"
  display_name = "GKE Node Service Account"
}

resource "google_project_iam_member" "gke_log_writer" {
  project = var.project_id
  role    = "roles/logging.logWriter"
  member  = "serviceAccount:${google_service_account.gke_nodes.email}"
}

resource "google_project_iam_member" "gke_metric_writer" {
  project = var.project_id
  role    = "roles/monitoring.metricWriter"
  member  = "serviceAccount:${google_service_account.gke_nodes.email}"
}

resource "google_container_cluster" "main" {
  name     = "production"
  location = "us-central1"

  network    = google_compute_network.main.id
  subnetwork = google_compute_subnetwork.gke.id

  ip_allocation_policy {
    cluster_secondary_range_name  = "pods"
    services_secondary_range_name = "services"
  }

  min_master_version = "1.29"

  # We manage node pools separately
  remove_default_node_pool = true
  initial_node_count       = 1

  workload_identity_config {
    workload_pool = "${var.project_id}.svc.id.goog"
  }
}

resource "google_container_node_pool" "main" {
  name               = "production-nodes"
  location           = "us-central1"
  cluster            = google_container_cluster.main.name
  initial_node_count = 3  # with autoscaling below, node_count would fight the autoscaler

  autoscaling {
    min_node_count = 2
    max_node_count = 6
  }

  node_config {
    machine_type    = "e2-standard-2"
    service_account = google_service_account.gke_nodes.email
    oauth_scopes    = ["https://www.googleapis.com/auth/cloud-platform"]

    workload_metadata_config {
      mode = "GKE_METADATA"  # enables Workload Identity
    }
  }
}

GKE gotchas:

  • remove_default_node_pool = true + separate node pool is the recommended pattern — the default pool cannot be customized
  • initial_node_count = 1 is required even with remove_default_node_pool — Terraform creates the default pool with one node, then deletes it immediately after cluster creation
  • Secondary IP ranges must be pre-configured on the subnet (done in networking layer)
  • Workload Identity requires both cluster config (workload_pool) and node config (GKE_METADATA) — binding sketch below
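
With workload_pool and GKE_METADATA configured above, binding a Kubernetes ServiceAccount to a Google service account is one IAM member on the GSA. A minimal sketch — the default/app-sa pairing is a placeholder:

resource "google_service_account" "app" {
  account_id   = "app-sa"  # placeholder
  display_name = "App workload identity"
}

resource "google_service_account_iam_member" "app_workload_identity" {
  service_account_id = google_service_account.app.name
  role               = "roles/iam.workloadIdentityUser"
  # member format: serviceAccount:PROJECT.svc.id.goog[NAMESPACE/KSA_NAME]
  member             = "serviceAccount:${var.project_id}.svc.id.goog[default/app-sa]"
}

The Kubernetes ServiceAccount is annotated with iam.gke.io/gcp-service-account pointing at the GSA's email.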

Layer 3: Managed Database#

AWS RDS PostgreSQL#

resource "aws_db_subnet_group" "main" {
  name       = "production-db-subnets"
  subnet_ids = [aws_subnet.private_a.id, aws_subnet.private_b.id]
}

resource "aws_security_group" "database" {
  name   = "production-database-sg"
  vpc_id = aws_vpc.main.id

  ingress {
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_eks_cluster.main.vpc_config[0].cluster_security_group_id]
  }
}

resource "aws_db_instance" "main" {
  identifier           = "production-postgres"
  engine               = "postgres"
  engine_version       = "15.4"
  instance_class       = "db.t3.medium"
  allocated_storage    = 50
  max_allocated_storage = 200
  storage_encrypted    = true

  db_name  = "appdb"
  username = "dbadmin"
  password = var.db_password  # use Secrets Manager in production (sketch below)

  db_subnet_group_name   = aws_db_subnet_group.main.name
  vpc_security_group_ids = [aws_security_group.database.id]
  multi_az               = true
  publicly_accessible    = false

  backup_retention_period = 7
  skip_final_snapshot     = false
  final_snapshot_identifier = "production-postgres-final"

  lifecycle {
    prevent_destroy = true
  }
}
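
As the inline comment says, var.db_password should not carry a plaintext secret in production. A hedged sketch pulling it from Secrets Manager instead — the secret name production/db-password is a placeholder:

data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "production/db-password"  # placeholder secret name
}

# in aws_db_instance.main, replace the password argument with:
#   password = data.aws_secretsmanager_secret_version.db_password.secret_string

Newer AWS provider versions can instead set manage_master_user_password = true on the instance and let RDS generate and store the secret itself.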

Azure Database for PostgreSQL Flexible Server#

resource "azurerm_private_dns_zone" "postgres" {
  name                = "production.postgres.database.azure.com"
  resource_group_name = azurerm_resource_group.main.name
}

resource "azurerm_private_dns_zone_virtual_network_link" "postgres" {
  name                  = "postgres-vnet-link"
  resource_group_name   = azurerm_resource_group.main.name
  private_dns_zone_name = azurerm_private_dns_zone.postgres.name
  virtual_network_id    = azurerm_virtual_network.main.id
}

resource "azurerm_postgresql_flexible_server" "main" {
  name                = "production-postgres"
  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location

  sku_name   = "GP_Standard_D2s_v3"
  version    = "15"
  storage_mb = 65536

  delegated_subnet_id = azurerm_subnet.database.id
  private_dns_zone_id = azurerm_private_dns_zone.postgres.id

  administrator_login    = "dbadmin"
  administrator_password = var.db_password

  backup_retention_days = 7
  geo_redundant_backup_enabled = false

  lifecycle {
    prevent_destroy = true
  }

  depends_on = [azurerm_private_dns_zone_virtual_network_link.postgres]
}

resource "azurerm_postgresql_flexible_server_database" "main" {
  name      = "appdb"
  server_id = azurerm_postgresql_flexible_server.main.id
  charset   = "UTF8"
  collation = "en_US.utf8"
}
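
Clients inside the VNET resolve the server through the private DNS zone; the fqdn attribute is the hostname to hand to applications. A small output sketch:

output "postgres_fqdn" {
  value = azurerm_postgresql_flexible_server.main.fqdn
}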

GCP Cloud SQL PostgreSQL#

resource "google_compute_global_address" "private_ip" {
  name          = "private-ip-address"
  purpose       = "VPC_PEERING"
  address_type  = "INTERNAL"
  prefix_length = 16
  network       = google_compute_network.main.id
}

resource "google_service_networking_connection" "private_vpc" {
  network                 = google_compute_network.main.id
  service                 = "servicenetworking.googleapis.com"
  reserved_peering_ranges = [google_compute_global_address.private_ip.name]
}

resource "google_sql_database_instance" "main" {
  name             = "production-postgres"
  database_version = "POSTGRES_15"
  region           = "us-central1"

  settings {
    tier              = "db-custom-2-8192"  # 2 vCPU, 8 GB RAM
    disk_size         = 50
    disk_autoresize   = true
    availability_type = "REGIONAL"  # HA with automatic failover

    ip_configuration {
      ipv4_enabled    = false
      private_network = google_compute_network.main.id
    }

    backup_configuration {
      enabled                        = true
      point_in_time_recovery_enabled = true
      start_time                     = "03:00"
    }
  }

  deletion_protection = true

  depends_on = [google_service_networking_connection.private_vpc]
}

resource "google_sql_database" "main" {
  name     = "appdb"
  instance = google_sql_database_instance.main.name
}

resource "google_sql_user" "admin" {
  name     = "dbadmin"
  instance = google_sql_database_instance.main.name
  password = var.db_password
}
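
Clients need one of two attributes: the private IP for direct in-VPC connections, or connection_name if they go through the Cloud SQL Auth Proxy. A small output sketch:

output "postgres_private_ip" {
  value = google_sql_database_instance.main.private_ip_address
}

output "postgres_connection_name" {
  value = google_sql_database_instance.main.connection_name  # used by the Cloud SQL Auth Proxy
}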

Cloud-Specific Complexity Comparison#

| Aspect | AWS | Azure | GCP |
|---|---|---|---|
| Networking setup | Explicit (IGW, NAT, routes, associations) | Moderate (NAT + delegation) | Moderate (Router + NAT, secondary ranges) |
| Kubernetes IAM | Complex (2 roles, 4+ policies, depends_on) | Simple (SystemAssigned identity) | Moderate (service account, 2 role bindings) |
| Kubernetes creation time | 10-15 min | 5-10 min | 5-10 min |
| Database private access | Security group + subnet group | Private DNS zone + delegation + link | VPC peering + private network |
| Total resources for 3-tier | ~25-30 | ~15-20 | ~18-22 |
| Total lines of Terraform | ~250-300 | ~150-200 | ~200-250 |

AWS requires the most explicit configuration (especially IAM). Azure is the most opinionated (less to configure but fewer options). GCP falls in the middle (secondary IP ranges and service networking add complexity but the rest is streamlined).

Agents: Choosing the Right Pattern#

When a user says “set up infrastructure on [cloud],” use these as starting templates. Adjust sizing, naming, and CIDR ranges to match the user’s requirements. The patterns above represent production-ready baselines — they include encryption, private networking, multi-AZ, backups, and lifecycle protection.

For dev/test environments, scale down: single-AZ, smaller instances, skip_final_snapshot = true, and prevent_destroy removed. For production, keep all safety measures and add monitoring, alerting, and cost tags. A sketch of the environment switch follows.
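
One hedged way to encode that split is a single environment variable driving the risky knobs — sketched here against the RDS instance from Layer 3 (the variable and instance sizes are illustrative; the same conditionals apply to the Azure and GCP resources):

variable "environment" {
  type    = string
  default = "dev"
  validation {
    condition     = contains(["dev", "prod"], var.environment)
    error_message = "environment must be \"dev\" or \"prod\"."
  }
}

locals {
  is_prod = var.environment == "prod"
}

# applied inside aws_db_instance.main:
#   instance_class      = local.is_prod ? "db.t3.medium" : "db.t3.micro"
#   multi_az            = local.is_prod
#   skip_final_snapshot = !local.is_prod

One caveat: lifecycle { prevent_destroy } only accepts a literal, not an expression, so that flag still has to be toggled per environment in code or kept in separate configurations.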