Multi-Account Cloud Architecture with Terraform#

Single-account cloud deployments work for learning and prototypes. Production systems need multiple accounts (AWS), subscriptions (Azure), or projects (GCP) for isolation — security boundaries, blast radius control, billing separation, and compliance requirements.

Terraform manages multi-account architectures well, but the patterns differ significantly from single-account work. Provider configuration, state isolation, cross-account references, and IAM trust relationships all need explicit design.

Why Multiple Accounts#

| Reason | Single Account Problem | Multi-Account Solution |
| --- | --- | --- |
| Blast radius | Misconfigured IAM affects everything | Damage limited to one account |
| Billing | Cannot attribute costs to teams | Per-account billing and budgets |
| Compliance | PCI data mixed with dev workloads | Separate accounts for regulated workloads |
| Service limits | VPC limit of 5 per region shared | Each account has its own limits |
| Access control | Complex IAM policies to isolate teams | Account boundary is the strongest isolation |
| Testing | Dev resources can affect production | Impossible for dev to touch prod resources |

AWS Organizations#

Organization Structure#

Organization Root
├── Core OU
│   ├── Management Account (billing, org management)
│   ├── Security Account (GuardDuty, SecurityHub, audit logs)
│   └── Networking Account (Transit Gateway, shared VPCs)
├── Workload OU
│   ├── Production OU
│   │   ├── App-A Production Account
│   │   └── App-B Production Account
│   └── Non-Production OU
│       ├── App-A Development Account
│       └── App-A Staging Account
└── Sandbox OU
    └── Developer Sandbox Accounts

Terraform for AWS Organizations#

resource "aws_organizations_organization" "main" {
  feature_set = "ALL"

  enabled_policy_types = [
    "SERVICE_CONTROL_POLICY",
    "TAG_POLICY",
  ]
}

resource "aws_organizations_organizational_unit" "core" {
  name      = "Core"
  parent_id = aws_organizations_organization.main.roots[0].id
}

resource "aws_organizations_organizational_unit" "workloads" {
  name      = "Workloads"
  parent_id = aws_organizations_organization.main.roots[0].id
}

resource "aws_organizations_organizational_unit" "production" {
  name      = "Production"
  parent_id = aws_organizations_organizational_unit.workloads.id
}

# Create a workload account
resource "aws_organizations_account" "app_production" {
  name      = "app-a-production"
  email     = "aws+app-a-prod@example.com"
  parent_id = aws_organizations_organizational_unit.production.id
  role_name = "OrganizationAccountAccessRole"  # cross-account admin role

  lifecycle {
    prevent_destroy = true  # accounts cannot be easily recreated
  }
}

Service Control Policies (SCPs)#

SCPs set permission boundaries for entire OUs:

resource "aws_organizations_policy" "deny_root_actions" {
  name    = "deny-root-user-actions"
  content = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Sid       = "DenyRootUser"
      Effect    = "Deny"
      Action    = "*"
      Resource  = "*"
      Condition = {
        StringLike = {
          "aws:PrincipalArn" = "arn:aws:iam::*:root"
        }
      }
    }]
  })
}

resource "aws_organizations_policy" "deny_region" {
  name    = "restrict-regions"
  content = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Sid       = "DenyNonApprovedRegions"
      Effect    = "Deny"
      NotAction = [
        "iam:*", "sts:*", "organizations:*",
        "support:*", "budgets:*",
      ]
      Resource = "*"
      Condition = {
        StringNotEquals = {
          "aws:RequestedRegion" = ["us-east-1", "us-west-2", "eu-west-1"]
        }
      }
    }]
  })
}

resource "aws_organizations_policy_attachment" "deny_region_workloads" {
  policy_id = aws_organizations_policy.deny_region.id
  target_id = aws_organizations_organizational_unit.workloads.id
}
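
A policy has no effect until it is attached to a root, OU, or account. The deny-root policy defined above is never attached in these examples; a minimal sketch of attaching it to the same Workloads OU might look like this (the resource name `deny_root_workloads` is an assumption, not part of the original configuration):

```hcl
# Hypothetical attachment: applies the deny-root SCP to every account under Workloads
resource "aws_organizations_policy_attachment" "deny_root_workloads" {
  policy_id = aws_organizations_policy.deny_root_actions.id
  target_id = aws_organizations_organizational_unit.workloads.id
}
```

SCPs do not restrict principals in the management account, so that account stays uncovered regardless of where the policy is attached.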

Cross-Account Provider Aliasing#

The key pattern for multi-account Terraform: use assume_role in provider blocks to operate in different accounts from a single Terraform configuration.

# Default provider — management account
provider "aws" {
  region = "us-east-1"
}

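# Provider for the networking account.
# Assumes an aws_organizations_account.networking resource exists, defined like
# app_production above. Provider arguments must be known at plan time, so the
# account has to be created before resources using this provider can be planned.
provider "aws" {
  alias  = "networking"
  region = "us-east-1"
  assume_role {
    role_arn = "arn:aws:iam::${aws_organizations_account.networking.id}:role/OrganizationAccountAccessRole"
  }
}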

# Provider for the production account
provider "aws" {
  alias  = "production"
  region = "us-east-1"
  assume_role {
    role_arn = "arn:aws:iam::${aws_organizations_account.app_production.id}:role/OrganizationAccountAccessRole"
  }
}

# Create a VPC in the networking account
resource "aws_vpc" "shared" {
  provider   = aws.networking
  cidr_block = "10.0.0.0/16"
  tags       = { Name = "shared-vpc" }
}

# Create resources in the production account
resource "aws_iam_role" "app_role" {
  provider = aws.production
  name     = "app-execution-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "ecs-tasks.amazonaws.com" }
    }]
  })
}

Gotcha: The role assumed must exist in the target account. OrganizationAccountAccessRole is created automatically when you create an account through AWS Organizations, but it gives full admin access. Create least-privilege roles for Terraform.
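
One way to follow that advice is to define a narrower role in the target account and trust only the identity that runs Terraform. The sketch below is illustrative: the variable `management_account_id`, the caller role name `terraform-ci`, the role name `terraform-network-deployer`, and the choice of the AWS managed policy `AmazonVPCFullAccess` are all assumptions, and the `terraform-ci` role must already exist or the trust policy will be rejected.

```hcl
# Hypothetical input: ID of the account where Terraform runs
variable "management_account_id" {
  type = string
}

# Trust policy: only the (assumed) terraform-ci role may assume this role
data "aws_iam_policy_document" "terraform_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRole"]

    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${var.management_account_id}:role/terraform-ci"]
    }
  }
}

# Role created in the production account, scoped to networking work only
resource "aws_iam_role" "terraform_network" {
  provider           = aws.production
  name               = "terraform-network-deployer"
  assume_role_policy = data.aws_iam_policy_document.terraform_trust.json
}

resource "aws_iam_role_policy_attachment" "terraform_network_vpc" {
  provider   = aws.production
  role       = aws_iam_role.terraform_network.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonVPCFullAccess"
}
```

Creating this role still requires an existing privileged role in the target account, so OrganizationAccountAccessRole is typically used once for bootstrapping and the provider's role_arn is switched to the narrower role afterwards.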

Gotcha: Provider aliasing means one Terraform state file references multiple accounts. If that state is compromised, all accounts are exposed. Consider separate state files per account.

Azure Management Groups#

Hierarchy Structure#

Tenant Root Group
├── Platform
│   ├── Identity (Azure AD, DNS)
│   ├── Management (monitoring, automation)
│   └── Connectivity (hub VNETs, ExpressRoute, Firewall)
├── Landing Zones
│   ├── Production
│   │   ├── App-A-Prod Subscription
│   │   └── App-B-Prod Subscription
│   └── Non-Production
│       ├── App-A-Dev Subscription
│       └── App-A-Staging Subscription
└── Sandbox
    └── Developer Sandboxes

Terraform for Management Groups#

resource "azurerm_management_group" "platform" {
  display_name = "Platform"
}

resource "azurerm_management_group" "landing_zones" {
  display_name = "Landing Zones"
}

resource "azurerm_management_group" "production" {
  display_name               = "Production"
  parent_management_group_id = azurerm_management_group.landing_zones.id
}

# Policy assignment at the management group level
resource "azurerm_management_group_policy_assignment" "require_tags" {
  name                 = "require-cost-center-tag"
  management_group_id  = azurerm_management_group.landing_zones.id
  # Built-in policy "Require a tag on resources" (takes only tagName)
  policy_definition_id = "/providers/Microsoft.Authorization/policyDefinitions/871b6d14-10aa-478d-b590-94f262ecfa99"

  parameters = jsonencode({
    tagName = { value = "CostCenter" }
  })
}
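
The hierarchy above places subscriptions under management groups, but the configuration so far only creates the groups. A minimal sketch of moving an existing subscription under the Production group, assuming the subscription ID is supplied through the variable `app_production_subscription_id`:

```hcl
# Hypothetical association: places the App-A-Prod subscription under Production
resource "azurerm_management_group_subscription_association" "app_prod" {
  management_group_id = azurerm_management_group.production.id
  subscription_id     = "/subscriptions/${var.app_production_subscription_id}"
}
```

Once associated, the subscription inherits policy assignments made on Production and on its parent, Landing Zones.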

Multi-Subscription Provider Configuration#

# Default provider — platform subscription
provider "azurerm" {
  features {}
  subscription_id = var.platform_subscription_id
}

# Provider for each workload subscription
provider "azurerm" {
  alias           = "app_production"
  features {}
  subscription_id = var.app_production_subscription_id
}

provider "azurerm" {
  alias           = "app_dev"
  features {}
  subscription_id = var.app_dev_subscription_id
}

# Hub VNET in platform subscription
resource "azurerm_virtual_network" "hub" {
  provider            = azurerm
  name                = "hub-vnet"
  resource_group_name = azurerm_resource_group.connectivity.name
  location            = "eastus"
  address_space       = ["10.0.0.0/16"]
}

# Spoke VNET in production subscription
resource "azurerm_virtual_network" "spoke_prod" {
  provider            = azurerm.app_production
  name                = "app-prod-vnet"
  resource_group_name = azurerm_resource_group.prod_networking.name
  location            = "eastus"
  address_space       = ["10.1.0.0/16"]
}

# VNET peering: hub to spoke
resource "azurerm_virtual_network_peering" "hub_to_prod" {
  provider                  = azurerm
  name                      = "hub-to-app-prod"
  resource_group_name       = azurerm_resource_group.connectivity.name
  virtual_network_name      = azurerm_virtual_network.hub.name
  remote_virtual_network_id = azurerm_virtual_network.spoke_prod.id
  allow_forwarded_traffic   = true
}

Gotcha: Azure VNET peering must be created from both sides. You need an azurerm_virtual_network_peering resource in both the hub and spoke subscriptions.
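
The example above only defines the hub-to-spoke direction. A sketch of the matching spoke-to-hub peering, reusing the resources and provider alias already defined (the resource name `prod_to_hub` and the peering name are assumptions):

```hcl
# VNET peering: spoke to hub, created in the production subscription
resource "azurerm_virtual_network_peering" "prod_to_hub" {
  provider                  = azurerm.app_production
  name                      = "app-prod-to-hub"
  resource_group_name       = azurerm_resource_group.prod_networking.name
  virtual_network_name      = azurerm_virtual_network.spoke_prod.name
  remote_virtual_network_id = azurerm_virtual_network.hub.id
  allow_forwarded_traffic   = true
}
```

The peering only reaches the Connected state once both resources exist.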

GCP Organizations#

Hierarchy Structure#

Organization (example.com)
└── Folders
    ├── Platform
    │   ├── networking-prod (Shared VPC host)
    │   ├── security-prod (audit logs, SCC)
    │   └── monitoring-prod (Cloud Monitoring workspace)
    ├── Production
    │   ├── app-a-prod
    │   └── app-b-prod
    ├── Non-Production
    │   ├── app-a-dev
    │   └── app-a-staging
    └── Sandbox
        └── developer sandboxes

Terraform for GCP Organization#

resource "google_folder" "platform" {
  display_name = "Platform"
  parent       = "organizations/${var.org_id}"
}

resource "google_folder" "production" {
  display_name = "Production"
  parent       = "organizations/${var.org_id}"
}

# Create a project in the production folder
resource "google_project" "app_prod" {
  name            = "App A Production"
  project_id      = "myorg-app-a-prod"
  folder_id       = google_folder.production.name
  billing_account = var.billing_account_id

  labels = {
    environment = "production"
    team        = "app-a"
  }
}

# Enable required APIs in the new project
resource "google_project_service" "app_prod_apis" {
  for_each = toset([
    "compute.googleapis.com",
    "container.googleapis.com",
    "sqladmin.googleapis.com",
  ])

  project            = google_project.app_prod.project_id
  service            = each.value
  disable_on_destroy = false
}

Organization Policies#

# Restrict VM external IPs at the organization level
resource "google_organization_policy" "deny_external_ip" {
  org_id     = var.org_id
  constraint = "compute.vmExternalIpAccess"

  list_policy {
    deny {
      all = true
    }
  }
}

# Allow specific regions only
resource "google_organization_policy" "allowed_locations" {
  org_id     = var.org_id
  constraint = "gcp.resourceLocations"

  list_policy {
    allow {
      values = ["in:us-locations", "in:eu-locations"]
    }
  }
}
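
Both policies above apply to the whole organization. Constraints can also be set on a single folder, which is useful when production needs stricter rules than sandboxes. A minimal sketch, assuming the Production folder defined earlier and choosing the `compute.disableSerialPortAccess` constraint purely as an illustration:

```hcl
# Hypothetical folder-level policy: disable serial port access in Production only
resource "google_folder_organization_policy" "prod_no_serial_port" {
  folder     = google_folder.production.name
  constraint = "compute.disableSerialPortAccess"

  boolean_policy {
    enforced = true
  }
}
```

Projects inherit policies from their folder and from the organization, so a blocked resource may be caused by a constraint set at either level.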

Shared VPC Pattern#

GCP’s Shared VPC lets a host project own the network and service projects use it:

# Host project owns the VPC
resource "google_compute_shared_vpc_host_project" "host" {
  project = google_project.networking.project_id
}

# Service project uses the shared VPC
resource "google_compute_shared_vpc_service_project" "app_prod" {
  host_project    = google_project.networking.project_id
  service_project = google_project.app_prod.project_id

  depends_on = [google_compute_shared_vpc_host_project.host]
}

State Isolation Strategy#

One State File Per Account#

The safest pattern: each account/subscription/project has its own Terraform root module and state file.

terraform/
├── organization/          # org structure, SCPs, policies
│   ├── main.tf
│   └── backend.tf         # state: s3://tf-state/organization/
├── platform/
│   ├── networking/        # shared VPCs, Transit Gateway
│   │   └── backend.tf     # state: s3://tf-state/platform/networking/
│   └── security/          # GuardDuty, SecurityHub
│       └── backend.tf     # state: s3://tf-state/platform/security/
└── app-a/
    ├── production/        # app-a prod account resources
    │   └── backend.tf     # state: s3://tf-state/app-a/production/
    └── development/
        └── backend.tf     # state: s3://tf-state/app-a/development/

Advantages:

  • Compromising one state file does not expose other accounts
  • State lock contention is per-account (no blocking between teams)
  • Each team can apply independently
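
As an illustration, the backend.tf for the networking root module in the layout above might look like the sketch below. The bucket name and key follow the paths shown in the tree; the `terraform.tfstate` file name and the `encrypt` setting are assumptions, and lock configuration is omitted because the relevant argument depends on the Terraform version in use.

```hcl
# platform/networking/backend.tf (sketch)
terraform {
  backend "s3" {
    bucket  = "tf-state"
    key     = "platform/networking/terraform.tfstate"
    region  = "us-east-1"
    encrypt = true # assumption: server-side encryption for the state object
  }
}
```

Each root module uses a different key, so bucket policies can grant a team access to its own prefix only.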

Cross-account references use terraform_remote_state:

# In app-a/production/main.tf — read networking outputs
data "terraform_remote_state" "networking" {
  backend = "s3"
  config = {
    bucket = "tf-state"
    key    = "platform/networking/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_instance" "app" {
  subnet_id = data.terraform_remote_state.networking.outputs.private_subnet_ids[0]
  # ...
}
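
terraform_remote_state only exposes root-module outputs, so the networking configuration has to publish the value being read. A sketch of the matching output, assuming the networking module defines its private subnets as a counted resource named `aws_subnet.private` (a hypothetical name):

```hcl
# In platform/networking/outputs.tf: publish subnet IDs for other root modules
output "private_subnet_ids" {
  description = "IDs of the private subnets in the shared VPC"
  value       = aws_subnet.private[*].id
}
```

The reader needs access to the whole networking state object to use this data source, so only non-sensitive values should be kept in that state if many teams read it.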

Single State with Provider Aliases (Small Scale)#

For small organizations (2-3 accounts), a single Terraform config with provider aliases is simpler:

# All accounts in one config: simpler but less isolated
provider "aws" { region = "us-east-1" }

provider "aws" {
  alias = "prod"
  assume_role { role_arn = var.prod_role }
}

provider "aws" {
  alias = "dev"
  assume_role { role_arn = var.dev_role }
}

When to use: 3 or fewer accounts, one person managing infrastructure, no compliance requirements for state isolation.

When to stop using: The moment a second team needs to apply independently, or when compliance requires separate state access controls.

Landing Zone Patterns#

A landing zone is the baseline configuration applied to every new account/subscription/project. It includes networking, IAM, logging, and security baselines.

Landing Zone Checklist#

Every new account should have:

| Component | AWS | Azure | GCP |
| --- | --- | --- | --- |
| Networking | VPC with private subnets | VNET peered to hub | Shared VPC service project |
| IAM baseline | Break-glass role, CI/CD role | Managed identity for automation | Service account for Terraform |
| Logging | CloudTrail → central S3 | Activity Log → central Log Analytics | Audit Log → central BigQuery |
| Security | GuardDuty enabled, SecurityHub | Defender for Cloud | Security Command Center |
| Cost controls | Budget alarm, cost allocation tags | Budget alert, resource tags | Budget alert, labels |
| DNS | Route53 subdomain delegation | Private DNS zone linked to hub | Cloud DNS zone |
| Encryption | Default EBS encryption, KMS key | Customer-managed key | CMEK for sensitive services |

Terraform Module for Landing Zone#

module "account_baseline" {
  source = "./modules/account-baseline"

  account_id   = aws_organizations_account.new_account.id
  account_name = "app-b-production"
  environment  = "production"
  vpc_cidr     = "10.2.0.0/16"

  providers = {
    aws = aws.new_account
  }
}

The module creates the VPC, IAM roles, CloudTrail, GuardDuty enablement, budget alerts, and default encryption — everything needed before the first workload deploys.
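
The module's contents are not shown in this article. The following is only a sketch of what a small part of such a module could contain, with every resource name chosen for illustration; CloudTrail, IAM roles, and budgets are left out for brevity.

```hcl
# modules/account-baseline/main.tf (hypothetical, partial)
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

variable "account_id" { type = string }
variable "account_name" { type = string }
variable "environment" { type = string }
variable "vpc_cidr" { type = string }

# Baseline VPC for the account
resource "aws_vpc" "baseline" {
  cidr_block = var.vpc_cidr

  tags = {
    Name        = "${var.account_name}-vpc"
    Environment = var.environment
    AccountId   = var.account_id
  }
}

# Threat detection in the region of the passed-in provider
resource "aws_guardduty_detector" "this" {
  enable = true
}

# New EBS volumes are encrypted by default
resource "aws_ebs_encryption_by_default" "this" {
  enabled = true
}
```

Because the caller passes `aws = aws.new_account` in the providers map, every resource in the module is created in the new account without any provider arguments inside the module itself.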

Common Gotchas#

| Gotcha | Symptom | Fix |
| --- | --- | --- |
| Account email reuse | Cannot create account: email already used | Each AWS account needs a unique email (use + aliases) |
| SCP blocks Terraform | AccessDenied on resources that should work | Check SCPs: an SCP deny overrides any IAM allow |
| Cross-account assume role fails | AccessDenied: User is not authorized to perform sts:AssumeRole | Trust policy on target role must allow source account/role |
| Provider alias forgotten | Resources created in wrong account | Always specify provider = aws.alias for cross-account resources |
| State bucket in wrong account | State accessible to the wrong teams | Put state bucket in the management/security account |
| VNET peering one-sided | Peering shows Initiated not Connected | Create peering from both sides |
| GCP API not enabled in new project | API not enabled on first resource | Add google_project_service for all needed APIs |
| Organization policy blocks resource | Cryptic error about constraint violation | Check org policies at folder and org level |