Terraform Networking Patterns#
Networking is the first thing you build and the last thing you want to change. CIDR ranges, subnet allocation, and connectivity topology are difficult to modify after resources depend on them. Getting the network right in Terraform saves months of migration work later.
This article covers networking patterns across AWS, Azure, and GCP — from basic VPC design to multi-region hub-spoke topologies.
CIDR Planning#
Plan CIDR ranges before writing any Terraform. Once a VPC is created with a CIDR block, changing it requires recreating the VPC and everything in it.
Allocation Strategy#
10.0.0.0/8 — total private space (16 million IPs)
Divide by region:
10.0.0.0/12 — us-east-1 (1M IPs)
10.16.0.0/12 — us-west-2 (1M IPs)
10.32.0.0/12 — eu-west-1 (1M IPs)
Divide by environment within region:
10.0.0.0/16 — us-east-1 production (65K IPs)
10.1.0.0/16 — us-east-1 staging (65K IPs)
10.2.0.0/16 — us-east-1 dev (65K IPs)
Divide by subnet within VPC:
10.0.0.0/24 — public-a (256 IPs)
10.0.1.0/24 — public-b (256 IPs)
10.0.2.0/24 — public-c (256 IPs)
10.0.10.0/24 — private-a (256 IPs)
10.0.11.0/24 — private-b (256 IPs)
10.0.12.0/24 — private-c (256 IPs)
10.0.20.0/24 — database-a (256 IPs)
10.0.21.0/24 — database-b (256 IPs)
10.0.22.0/24 — database-c (256 IPs)
Key rules:
- No CIDR overlap between any two VPCs that might ever be peered
- Leave gaps between allocations for future expansion
- Document the allocation table and commit it to the repo (see the sketch below)
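The allocation table can also live next to the code. A minimal sketch that derives the same regional and environment blocks with cidrsubnet (the region and environment names are illustrative):
locals {
  # Regional /12 blocks carved out of 10.0.0.0/8 (newbits = 4)
  region_cidrs = {
    us-east-1 = cidrsubnet("10.0.0.0/8", 4, 0) # 10.0.0.0/12
    us-west-2 = cidrsubnet("10.0.0.0/8", 4, 1) # 10.16.0.0/12
    eu-west-1 = cidrsubnet("10.0.0.0/8", 4, 2) # 10.32.0.0/12
  }

  # Environment /16 blocks within us-east-1 (another 4 newbits: /12 to /16)
  env_cidrs = {
    production = cidrsubnet(local.region_cidrs["us-east-1"], 4, 0) # 10.0.0.0/16
    staging    = cidrsubnet(local.region_cidrs["us-east-1"], 4, 1) # 10.1.0.0/16
    dev        = cidrsubnet(local.region_cidrs["us-east-1"], 4, 2) # 10.2.0.0/16
  }
}

output "cidr_allocation" {
  value = local.env_cidrs # surfaces the allocation table in plan output for review
}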
CIDR Sizing Guide#
| Subnet Purpose | Recommended Size | IPs Available | Notes |
|---|---|---|---|
| Public (load balancers, NAT) | /24 | 251 | Small — few resources need public IPs |
| Private (application workloads) | /20-/22 | 4091-1019 | Size for max pod/instance count |
| Database | /24 | 251 | Small — few database instances |
| GKE/EKS pod range | /16 | 65531 | Large — every pod gets an IP |
| GKE/EKS service range | /20 | 4091 | Medium — one IP per K8s service |
AWS VPC Pattern#
data "aws_availability_zones" "available" {
state = "available"
}
locals {
azs = slice(data.aws_availability_zones.available.names, 0, 3)
}
resource "aws_vpc" "main" {
cidr_block = var.vpc_cidr # e.g., "10.0.0.0/16"
enable_dns_support = true
enable_dns_hostnames = true
tags = { Name = "${var.environment}-vpc" }
}
# Public subnets — one per AZ
resource "aws_subnet" "public" {
count = length(local.azs)
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(var.vpc_cidr, 8, count.index) # /24 per AZ
availability_zone = local.azs[count.index]
map_public_ip_on_launch = true
tags = {
Name = "${var.environment}-public-${local.azs[count.index]}"
"kubernetes.io/role/elb" = "1" # required for EKS ALB
}
}
# Private subnets — one per AZ
resource "aws_subnet" "private" {
count = length(local.azs)
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(var.vpc_cidr, 8, count.index + 10) # /24 starting at .10
availability_zone = local.azs[count.index]
tags = {
Name = "${var.environment}-private-${local.azs[count.index]}"
"kubernetes.io/role/internal-elb" = "1" # required for EKS internal ALB
}
}
# Internet Gateway
resource "aws_internet_gateway" "main" {
vpc_id = aws_vpc.main.id
tags = { Name = "${var.environment}-igw" }
}
# Public route table
resource "aws_route_table" "public" {
vpc_id = aws_vpc.main.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.main.id
}
tags = { Name = "${var.environment}-public-rt" }
}
resource "aws_route_table_association" "public" {
count = length(local.azs)
subnet_id = aws_subnet.public[count.index].id
route_table_id = aws_route_table.public.id
}
# NAT Gateway (one for dev, per-AZ for prod)
resource "aws_eip" "nat" {
count = var.nat_gateway_count # 1 for dev, 3 for prod
domain = "vpc"
tags = { Name = "${var.environment}-nat-eip-${count.index}" }
}
resource "aws_nat_gateway" "main" {
count = var.nat_gateway_count
allocation_id = aws_eip.nat[count.index].id
subnet_id = aws_subnet.public[count.index].id
tags = { Name = "${var.environment}-nat-${count.index}" }
depends_on = [aws_internet_gateway.main]
}
# Private route tables — one per AZ pointing to the appropriate NAT GW
resource "aws_route_table" "private" {
count = length(local.azs)
vpc_id = aws_vpc.main.id
route {
cidr_block = "0.0.0.0/0"
nat_gateway_id = aws_nat_gateway.main[count.index % var.nat_gateway_count].id
}
tags = { Name = "${var.environment}-private-rt-${count.index}" }
}
resource "aws_route_table_association" "private" {
count = length(local.azs)
subnet_id = aws_subnet.private[count.index].id
route_table_id = aws_route_table.private[count.index].id
}
Gotcha: cidrsubnet(var.vpc_cidr, 8, count.index) computes subnets automatically. For a /16 VPC, cidrsubnet("10.0.0.0/16", 8, 0) = 10.0.0.0/24, cidrsubnet("10.0.0.0/16", 8, 10) = 10.0.10.0/24.
Gotcha: The EKS-specific subnet tags (kubernetes.io/role/elb and kubernetes.io/role/internal-elb) are required for the AWS Load Balancer Controller to discover subnets. Missing these tags means ALBs fail silently.
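The example references var.environment, var.vpc_cidr, and var.nat_gateway_count without declaring them. A sketch of the declarations it assumes, with a guard on the NAT gateway count (the defaults are illustrative):
variable "environment" {
  type = string
}

variable "vpc_cidr" {
  type    = string
  default = "10.0.0.0/16"
}

variable "nat_gateway_count" {
  type    = number
  default = 1 # 1 for dev, 3 (one per AZ) for prod

  validation {
    condition     = var.nat_gateway_count >= 1 && var.nat_gateway_count <= 3
    error_message = "nat_gateway_count must be between 1 and 3 (one per AZ)."
  }
}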
Azure VNET Pattern#
resource "azurerm_virtual_network" "main" {
name = "${var.environment}-vnet"
resource_group_name = azurerm_resource_group.networking.name
location = azurerm_resource_group.networking.location
address_space = [var.vnet_cidr] # e.g., ["10.0.0.0/16"]
}
resource "azurerm_subnet" "app" {
name = "app-subnet"
resource_group_name = azurerm_resource_group.networking.name
virtual_network_name = azurerm_virtual_network.main.name
address_prefixes = [cidrsubnet(var.vnet_cidr, 8, 0)] # 10.0.0.0/24
}
resource "azurerm_subnet" "aks" {
name = "aks-subnet"
resource_group_name = azurerm_resource_group.networking.name
virtual_network_name = azurerm_virtual_network.main.name
address_prefixes = [cidrsubnet(var.vnet_cidr, 4, 1)] # 10.0.16.0/20 — large for AKS
}
resource "azurerm_subnet" "database" {
name = "database-subnet"
resource_group_name = azurerm_resource_group.networking.name
virtual_network_name = azurerm_virtual_network.main.name
address_prefixes = [cidrsubnet(var.vnet_cidr, 8, 32)] # 10.0.32.0/24, outside the AKS /20 (10.0.16.0-10.0.31.255)
delegation {
name = "postgresql"
service_delegation {
name = "Microsoft.DBforPostgreSQL/flexibleServers"
actions = ["Microsoft.Network/virtualNetworks/subnets/join/action"]
}
}
}
# NAT Gateway
resource "azurerm_public_ip" "nat" {
name = "${var.environment}-nat-pip"
resource_group_name = azurerm_resource_group.networking.name
location = azurerm_resource_group.networking.location
allocation_method = "Static"
sku = "Standard"
}
resource "azurerm_nat_gateway" "main" {
name = "${var.environment}-nat"
resource_group_name = azurerm_resource_group.networking.name
location = azurerm_resource_group.networking.location
sku_name = "Standard"
}
resource "azurerm_nat_gateway_public_ip_association" "main" {
nat_gateway_id = azurerm_nat_gateway.main.id
public_ip_address_id = azurerm_public_ip.nat.id
}
resource "azurerm_subnet_nat_gateway_association" "app" {
subnet_id = azurerm_subnet.app.id
nat_gateway_id = azurerm_nat_gateway.main.id
}
Gotcha: AKS with Azure CNI requires a large subnet because each pod gets a VNET IP. Calculate max_nodes × (max_pods_per_node + 1): each node needs its own IP plus one per pod it can run. A /20 gives 4091 IPs — enough for ~36 nodes at 110 pods each.
Gotcha: Delegated subnets (for PostgreSQL, App Service, etc.) cannot host any other resource type. Plan your subnet allocation to account for delegations.
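The sizing rule in the first gotcha can be sanity-checked in Terraform itself. A sketch assuming illustrative node and pod limits; the check block requires Terraform 1.5 or newer:
locals {
  max_nodes         = 36  # illustrative
  max_pods_per_node = 110 # illustrative

  # Each node consumes one IP plus one per potential pod under Azure CNI
  required_ips = local.max_nodes * (local.max_pods_per_node + 1)

  # Usable IPs in the AKS subnet (Azure reserves 5 addresses per subnet)
  aks_subnet_ips = pow(2, 32 - tonumber(split("/", azurerm_subnet.aks.address_prefixes[0])[1])) - 5
}

check "aks_subnet_capacity" {
  assert {
    condition     = local.aks_subnet_ips >= local.required_ips
    error_message = "AKS subnet is too small for max_nodes x (max_pods_per_node + 1)."
  }
}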
GCP VPC Pattern#
resource "google_compute_network" "main" {
name = "${var.environment}-vpc"
project = var.project_id
auto_create_subnetworks = false
}
resource "google_compute_subnetwork" "app" {
name = "${var.environment}-app"
project = var.project_id
region = var.region
network = google_compute_network.main.id
ip_cidr_range = cidrsubnet(var.vpc_cidr, 8, 0) # 10.0.0.0/24
private_ip_google_access = true
# GKE requires secondary ranges for pods and services
secondary_ip_range {
range_name = "pods"
ip_cidr_range = "10.1.0.0/16" # 65K pod IPs
}
secondary_ip_range {
range_name = "services"
ip_cidr_range = "10.2.0.0/20" # 4K service IPs
}
}
# Cloud NAT
resource "google_compute_router" "main" {
name = "${var.environment}-router"
project = var.project_id
region = var.region
network = google_compute_network.main.id
}
resource "google_compute_router_nat" "main" {
name = "${var.environment}-nat"
project = var.project_id
region = var.region
router = google_compute_router.main.name
nat_ip_allocate_option = "AUTO_ONLY"
source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
}
Gotcha: auto_create_subnetworks = false is essential. The default creates a subnet in every region.
Gotcha: GKE secondary ranges must not overlap with any other CIDR in the VPC or any peered VPC.
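The secondary ranges are consumed by name when the GKE cluster is created. A minimal sketch showing only the networking-relevant arguments of google_container_cluster:
resource "google_container_cluster" "main" {
  name     = "${var.environment}-gke"
  project  = var.project_id
  location = var.region

  network    = google_compute_network.main.id
  subnetwork = google_compute_subnetwork.app.id

  # VPC-native (alias IP) networking: pods and services draw their IPs
  # from the secondary ranges defined on the subnetwork above
  ip_allocation_policy {
    cluster_secondary_range_name  = "pods"
    services_secondary_range_name = "services"
  }

  remove_default_node_pool = true
  initial_node_count       = 1
}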
VPC Peering#
AWS VPC Peering#
resource "aws_vpc_peering_connection" "prod_to_shared" {
vpc_id = aws_vpc.production.id
peer_vpc_id = aws_vpc.shared_services.id
auto_accept = true # only works same account/region
tags = { Name = "prod-to-shared" }
}
# Route from production to shared services
resource "aws_route" "prod_to_shared" {
route_table_id = aws_route_table.prod_private.id
destination_cidr_block = aws_vpc.shared_services.cidr_block
vpc_peering_connection_id = aws_vpc_peering_connection.prod_to_shared.id
}
# Route from shared services to production
resource "aws_route" "shared_to_prod" {
route_table_id = aws_route_table.shared_private.id
destination_cidr_block = aws_vpc.production.cidr_block
vpc_peering_connection_id = aws_vpc_peering_connection.prod_to_shared.id
}
Gotcha: VPC peering is not transitive. If A peers with B and B peers with C, A cannot reach C through B. For transitive connectivity, use Transit Gateway.
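The auto_accept shortcut above only works within a single account and region. For cross-account or cross-region peering, the accepting side needs its own resource. A sketch assuming a provider alias (aws.peer) and illustrative peer_vpc_id / peer_account_id variables:
# Requester side: auto_accept must be omitted for cross-account peering
resource "aws_vpc_peering_connection" "cross_account" {
  vpc_id        = aws_vpc.production.id
  peer_vpc_id   = var.peer_vpc_id     # illustrative variable
  peer_owner_id = var.peer_account_id # illustrative variable
  peer_region   = "us-west-2"
}

# Accepter side, using a provider alias configured for the peer account
resource "aws_vpc_peering_connection_accepter" "cross_account" {
  provider                  = aws.peer
  vpc_peering_connection_id = aws_vpc_peering_connection.cross_account.id
  auto_accept               = true
}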
Azure VNET Peering#
# Peering must be created from both sides
resource "azurerm_virtual_network_peering" "hub_to_spoke" {
name = "hub-to-spoke"
resource_group_name = azurerm_resource_group.hub.name
virtual_network_name = azurerm_virtual_network.hub.name
remote_virtual_network_id = azurerm_virtual_network.spoke.id
allow_forwarded_traffic = true
allow_gateway_transit = true
}
resource "azurerm_virtual_network_peering" "spoke_to_hub" {
name = "spoke-to-hub"
resource_group_name = azurerm_resource_group.spoke.name
virtual_network_name = azurerm_virtual_network.spoke.name
remote_virtual_network_id = azurerm_virtual_network.hub.id
allow_forwarded_traffic = true
use_remote_gateways = true
}
Transit Gateway / Hub-Spoke#
AWS Transit Gateway#
For connecting many VPCs with transitive routing:
resource "aws_ec2_transit_gateway" "main" {
description = "Central transit gateway"
default_route_table_association = "enable"
default_route_table_propagation = "enable"
dns_support = "enable"
tags = { Name = "central-tgw" }
}
# Attach each VPC
resource "aws_ec2_transit_gateway_vpc_attachment" "production" {
transit_gateway_id = aws_ec2_transit_gateway.main.id
vpc_id = aws_vpc.production.id
subnet_ids = aws_subnet.prod_private[*].id
tags = { Name = "production-attachment" }
}
resource "aws_ec2_transit_gateway_vpc_attachment" "shared" {
transit_gateway_id = aws_ec2_transit_gateway.main.id
vpc_id = aws_vpc.shared_services.id
subnet_ids = aws_subnet.shared_private[*].id
tags = { Name = "shared-attachment" }
}
# Routes in each VPC pointing to TGW for cross-VPC traffic
resource "aws_route" "prod_to_tgw" {
route_table_id = aws_route_table.prod_private.id
destination_cidr_block = "10.0.0.0/8" # all private traffic via TGW
transit_gateway_id = aws_ec2_transit_gateway.main.id
}
When to use Transit Gateway vs Peering:
- 1-3 VPCs: peering is simpler and cheaper
- 4+ VPCs: Transit Gateway scales better (N attachments vs N*(N-1)/2 peering connections)
- Need transitive routing: Transit Gateway is required
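The comparison assumes routes exist on both sides of the TGW; the example above only shows the production VPC. A mirrored route for the shared-services VPC, reusing the route table from the peering example:
resource "aws_route" "shared_to_tgw" {
  route_table_id         = aws_route_table.shared_private.id
  destination_cidr_block = "10.0.0.0/8" # all private traffic via TGW
  transit_gateway_id     = aws_ec2_transit_gateway.main.id
}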
DNS Patterns#
AWS Route53 Private Hosted Zone#
resource "aws_route53_zone" "internal" {
name = "internal.example.com"
vpc {
vpc_id = aws_vpc.main.id
}
}
# Service discovery via DNS
resource "aws_route53_record" "database" {
zone_id = aws_route53_zone.internal.zone_id
name = "db.internal.example.com"
type = "CNAME"
ttl = 60
records = [aws_db_instance.main.address]
}
Azure Private DNS Zone#
resource "azurerm_private_dns_zone" "internal" {
name = "internal.example.com"
resource_group_name = azurerm_resource_group.networking.name
}
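Records work much like Route53 records. A sketch of an A record in this zone; the address is a placeholder for the database's private IP:
resource "azurerm_private_dns_a_record" "database" {
  name                = "db"
  zone_name           = azurerm_private_dns_zone.internal.name
  resource_group_name = azurerm_resource_group.networking.name
  ttl                 = 60
  records             = ["10.0.32.10"] # placeholder, typically the database's private IP
}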
resource "azurerm_private_dns_zone_virtual_network_link" "main" {
name = "internal-link"
resource_group_name = azurerm_resource_group.networking.name
private_dns_zone_name = azurerm_private_dns_zone.internal.name
virtual_network_id = azurerm_virtual_network.main.id
}
Networking Gotchas Summary#
| Gotcha | Cloud | Impact | Fix |
|---|---|---|---|
| Overlapping CIDRs | All | Cannot peer VPCs | Plan CIDR allocation before creating any VPC |
| Missing NAT Gateway | All | Private subnets cannot reach internet | Add NAT for each VPC with private workloads |
| No DNS hostnames on VPC | AWS | Instances do not get DNS names | enable_dns_hostnames = true |
| Default security group open | AWS | All VPC members can communicate | Import and tighten, or explicitly manage (sketch below) |
| Missing subnet tags for EKS | AWS | ALB controller cannot find subnets | Add kubernetes.io/role/elb tags |
| AKS subnet too small | Azure | Pods fail to schedule (no IPs) | Size for max_nodes × (max_pods + 1) |
| auto_create_subnetworks | GCP | Unwanted subnets in every region | auto_create_subnetworks = false |
| Peering not bidirectional | Azure | Peering shows as Initiated | Create peering from both sides |
| Peering not transitive | AWS | VPC A cannot reach VPC C via B | Use Transit Gateway for transitive routing |
| Secondary ranges overlap | GCP | GKE fails to create | Ensure pod/service ranges do not overlap |
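For the default security group row, one option is to manage the group explicitly: declaring aws_default_security_group with no ingress or egress blocks strips every rule from the VPC's default group. A minimal sketch against the VPC from the AWS pattern above:
# Adopting the VPC's default security group and removing all of its rules
resource "aws_default_security_group" "main" {
  vpc_id = aws_vpc.main.id

  # No ingress or egress blocks: Terraform removes every rule, so nothing
  # can accidentally rely on the default group for traffic
  tags = { Name = "${var.environment}-default-locked" }
}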