Projects and Organization#

GCP organizes resources into Projects, which sit under Folders and an Organization. A project is the fundamental unit of resource organization, billing, and API enablement. Every GCP resource belongs to exactly one project.

# Set the active project
gcloud config set project my-prod-project

# List all projects
gcloud projects list

# Create a new project
gcloud projects create staging-project-2026 \
  --name="Staging" \
  --organization=ORG_ID

# Enable required APIs (must be done per-project)
gcloud services enable compute.googleapis.com
gcloud services enable container.googleapis.com
gcloud services enable sqladmin.googleapis.com

Check which project is currently active:

gcloud config get-value project

Labels are key-value pairs attached to resources for cost allocation and filtering. They serve the same purpose as tags in AWS and Azure:

gcloud compute instances update web-01 \
  --zone=us-east1-b \
  --update-labels env=prod,team=platform,cost-center=engineering

IAM: Identity and Access Management#

GCP IAM uses a model of members, roles, and policies. A member is an identity (user, group, service account, or domain). A role is a collection of permissions. A policy binds members to roles at a specific scope (organization, folder, project, or resource).
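As a sketch of the binding model, granting a predefined role to a group at project scope looks like this (the group address and project ID here are placeholders):

```shell
# Hypothetical example: bind roles/viewer to a Google group on a project.
# Replace the group and project with real values.
gcloud projects add-iam-policy-binding my-prod-project \
  --member="group:platform-team@example.com" \
  --role="roles/viewer"
```

Binding to a group rather than individual users keeps the policy stable as people join and leave the team.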

Service accounts are the GCP equivalent of machine identities. They authenticate workloads that run without a human user. Prefer service accounts with the minimum permissions needed.

# Create a service account
gcloud iam service-accounts create deploy-agent \
  --display-name="Deployment Agent"

# Grant a role to the service account on the project
gcloud projects add-iam-policy-binding my-prod-project \
  --member="serviceAccount:deploy-agent@my-prod-project.iam.gserviceaccount.com" \
  --role="roles/container.developer"

# Generate a key (avoid if possible -- use workload identity instead)
gcloud iam service-accounts keys create key.json \
  --iam-account=deploy-agent@my-prod-project.iam.gserviceaccount.com

Workload Identity lets GKE pods authenticate as service accounts without key files. Always prefer this over exporting keys:

# Bind a Kubernetes service account to a GCP service account
gcloud iam service-accounts add-iam-policy-binding \
  deploy-agent@my-prod-project.iam.gserviceaccount.com \
  --role="roles/iam.workloadIdentityUser" \
  --member="serviceAccount:my-prod-project.svc.id.goog[default/k8s-sa-name]"
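The IAM binding is only half of the Workload Identity setup: the Kubernetes service account also has to be annotated so that pods using it are mapped to the GCP service account. A sketch, reusing the names from the binding:

```shell
# Annotate the Kubernetes service account (namespace "default",
# name "k8s-sa-name") to impersonate the GCP service account.
kubectl annotate serviceaccount k8s-sa-name \
  --namespace default \
  iam.gke.io/gcp-service-account=deploy-agent@my-prod-project.iam.gserviceaccount.com
```

Pods running under that Kubernetes service account then obtain GCP credentials automatically, with no key file mounted.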

View current IAM bindings on a project:

gcloud projects get-iam-policy my-prod-project \
  --flatten="bindings[].members" \
  --format="table(bindings.role, bindings.members)"

List predefined roles matching a pattern:

gcloud iam roles list --filter="name:roles/storage" --format="table(name,title)"

VPC Networking#

GCP VPCs are global – a single VPC spans all regions automatically. Subnets are regional. This is different from AWS and Azure where VPCs/VNets are confined to a single region.

# Create a custom-mode VPC (auto mode would instead create a subnet in every region)
gcloud compute networks create prod-vpc \
  --subnet-mode=custom

# Create regional subnets
gcloud compute networks subnets create us-east-subnet \
  --network=prod-vpc \
  --region=us-east1 \
  --range=10.0.1.0/24

gcloud compute networks subnets create europe-west-subnet \
  --network=prod-vpc \
  --region=europe-west1 \
  --range=10.0.2.0/24

Firewall rules in GCP apply to the entire VPC (not per-subnet). A rule selects which instances it applies to via target tags or target service accounts.

# Allow HTTPS from the internet to instances tagged "web"
gcloud compute firewall-rules create allow-https \
  --network=prod-vpc \
  --allow=tcp:443 \
  --source-ranges=0.0.0.0/0 \
  --target-tags=web

# Allow internal communication within the VPC
gcloud compute firewall-rules create allow-internal \
  --network=prod-vpc \
  --allow=tcp,udp,icmp \
  --source-ranges=10.0.0.0/16
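The rules above target instances by tag; targeting by service account is the other option. A sketch, reusing the deploy-agent service account from earlier (the rule name is illustrative, and the source ranges are Google's documented health-check ranges):

```shell
# Allow load balancer health checks to reach instances running
# as the deploy-agent service account (instead of matching by tag)
gcloud compute firewall-rules create allow-health-checks \
  --network=prod-vpc \
  --allow=tcp:443 \
  --source-ranges=130.211.0.0/22,35.191.0.0/16 \
  --target-service-accounts=deploy-agent@my-prod-project.iam.gserviceaccount.com
```

Service-account targeting is harder to spoof than tags, since attaching a service account to an instance requires IAM permissions while tags can be edited by anyone with instance write access.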

# List firewall rules
gcloud compute firewall-rules list --filter="network=prod-vpc" --format="table(name,direction,allowed,sourceRanges)"

Cloud NAT provides outbound internet access for instances without public IPs:

# Create a Cloud Router (required for Cloud NAT)
gcloud compute routers create prod-router \
  --network=prod-vpc \
  --region=us-east1

# Create Cloud NAT
gcloud compute routers nats create prod-nat \
  --router=prod-router \
  --region=us-east1 \
  --auto-allocate-nat-external-ips \
  --nat-all-subnet-ip-ranges

Compute Engine#

Compute Engine provides VMs. Machine type names follow the pattern family-type-vCPUs: e2-standard-4 means family e2, type standard, 4 vCPUs. Key families: e2 for cost-efficient general purpose, n2/n2d for balanced workloads, c2/c3 for compute-optimized, m2/m3 for memory-optimized, and a2/g2 for GPU workloads. The Tau family comes in two variants: t2a (Arm-based) and t2d (x86 AMD).
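Because the naming is mechanical, a machine type name can be decomposed with plain bash parameter expansion, no gcloud call needed:

```shell
# Split a machine type name into family, type, and vCPU count.
mt="e2-standard-4"
family="${mt%%-*}"        # everything before the first dash -> "e2"
vcpus="${mt##*-}"         # everything after the last dash   -> "4"
rest="${mt#*-}"           # strip the family prefix          -> "standard-4"
type="${rest%-*}"         # drop the vCPU suffix             -> "standard"
echo "family=$family type=$type vcpus=$vcpus"
```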

# Create an instance
gcloud compute instances create web-01 \
  --zone=us-east1-b \
  --machine-type=e2-standard-2 \
  --subnet=us-east-subnet \
  --image-project=ubuntu-os-cloud \
  --image-family=ubuntu-2204-lts \
  --boot-disk-size=50GB \
  --boot-disk-type=pd-ssd \
  --tags=web \
  --service-account=deploy-agent@my-prod-project.iam.gserviceaccount.com \
  --scopes=cloud-platform \
  --labels=env=prod,team=platform

List instances and their statuses:

gcloud compute instances list --format="table(name,zone,status,machineType,networkInterfaces[0].networkIP)"

SSH into an instance:

gcloud compute ssh web-01 --zone=us-east1-b
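For instances without an external IP, SSH can be tunneled through Identity-Aware Proxy instead:

```shell
# SSH via IAP tunneling (requires a firewall rule allowing tcp:22
# from Google's IAP range, 35.235.240.0/20)
gcloud compute ssh web-01 --zone=us-east1-b --tunnel-through-iap
```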

Find available machine types in a zone:

gcloud compute machine-types list --zones=us-east1-b \
  --filter="name:e2-standard" --format="table(name,guestCpus,memoryMb)"

GKE: Google Kubernetes Engine#

GKE is Google’s managed Kubernetes service. It offers Standard mode (you manage node pools) and Autopilot mode (Google manages nodes entirely).

# Create an Autopilot cluster (recommended for most workloads)
gcloud container clusters create-auto prod-cluster \
  --region=us-east1 \
  --network=prod-vpc \
  --subnetwork=us-east-subnet

# Or create a Standard cluster with explicit node pools
gcloud container clusters create prod-cluster \
  --region=us-east1 \
  --network=prod-vpc \
  --subnetwork=us-east-subnet \
  --num-nodes=2 \
  --machine-type=e2-standard-4 \
  --enable-autoscaling --min-nodes=2 --max-nodes=10 \
  --workload-pool=my-prod-project.svc.id.goog

# Get credentials for kubectl
gcloud container clusters get-credentials prod-cluster --region=us-east1

Add a node pool for specialized workloads:

gcloud container node-pools create gpu-pool \
  --cluster=prod-cluster \
  --region=us-east1 \
  --machine-type=n1-standard-4 \
  --accelerator=type=nvidia-tesla-t4,count=1 \
  --num-nodes=1 \
  --enable-autoscaling --min-nodes=0 --max-nodes=3

Check cluster status:

gcloud container clusters list --format="table(name,location,status,currentMasterVersion,currentNodeCount)"

Upgrade a cluster:

# Check available versions
gcloud container get-server-config --region=us-east1 --format="table(validMasterVersions)"

# Upgrade the control plane
gcloud container clusters upgrade prod-cluster --region=us-east1 --master --cluster-version=1.29.1-gke.1589020

Cloud SQL#

Cloud SQL is GCP’s managed relational database service supporting PostgreSQL, MySQL, and SQL Server.

# Create a PostgreSQL instance
gcloud sql instances create prod-postgres \
  --database-version=POSTGRES_16 \
  --tier=db-custom-4-16384 \
  --region=us-east1 \
  --availability-type=REGIONAL \
  --storage-size=100GB \
  --storage-type=SSD \
  --storage-auto-increase \
  --backup-start-time=02:00 \
  --enable-point-in-time-recovery

# Set the root password
gcloud sql users set-password postgres \
  --instance=prod-postgres \
  --password="$(openssl rand -base64 24)"

# Create a database
gcloud sql databases create app-db --instance=prod-postgres

# Quick connect from a workstation (temporarily allowlists your client IP;
# use the Cloud SQL Auth Proxy for production access)
gcloud sql connect prod-postgres --user=postgres --database=app-db

For applications running on GKE, use the Cloud SQL Auth Proxy as a sidecar container or the Cloud SQL connector libraries. Never expose Cloud SQL instances with public IPs in production.
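A local invocation of the proxy (the v2 `cloud-sql-proxy` binary) looks roughly like this; the connection name follows the PROJECT:REGION:INSTANCE pattern, built here from the names used above:

```shell
# Run the Cloud SQL Auth Proxy locally; the application then connects
# to 127.0.0.1:5432 as if the database were local, with IAM handling auth.
cloud-sql-proxy my-prod-project:us-east1:prod-postgres --port 5432
```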

# Move the instance to private IP via private services access
# and drop its public address
gcloud sql instances patch prod-postgres \
  --network=prod-vpc \
  --no-assign-ip

Create an on-demand backup:

gcloud sql backups create --instance=prod-postgres --description="pre-migration"
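Restoring works from a backup ID (BACKUP_ID below is a placeholder taken from the list output; note that a restore overwrites the target instance):

```shell
# Find a backup ID, then restore it onto the instance
gcloud sql backups list --instance=prod-postgres
gcloud sql backups restore BACKUP_ID --restore-instance=prod-postgres
```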

Cloud Storage#

Cloud Storage is GCP’s object storage. Bucket names are globally unique, and objects live in a flat namespace (though / in object names creates a folder-like interface).

# Create a bucket
gcloud storage buckets create gs://myorg-data-2026 \
  --location=us-east1 \
  --uniform-bucket-level-access

# Upload a file
gcloud storage cp backup.sql.gz gs://myorg-data-2026/backups/

# Sync a directory
gcloud storage rsync ./build/ gs://myorg-data-2026/static/ --delete-unmatched-destination-objects

# List objects
gcloud storage ls gs://myorg-data-2026/backups/ --long

# Generate a signed URL for temporary access
# (signing requires service account credentials, e.g. a key file or impersonation)
gcloud storage sign-url gs://myorg-data-2026/backups/backup.sql.gz --duration=1h

Storage classes control cost and availability: Standard (frequent access), Nearline (monthly access), Coldline (quarterly access), and Archive (annual access). Set lifecycle rules to transition objects:

gcloud storage buckets update gs://myorg-data-2026 \
  --lifecycle-file=lifecycle.json

Where lifecycle.json contains:

{
  "rule": [
    {
      "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
      "condition": {"age": 30, "matchesPrefix": ["backups/"]}
    },
    {
      "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
      "condition": {"age": 90, "matchesPrefix": ["backups/"]}
    }
  ]
}
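To verify the configuration took effect, inspect the bucket metadata (the lifecycle rules appear in the describe output):

```shell
# Show bucket metadata, including any applied lifecycle rules
gcloud storage buckets describe gs://myorg-data-2026
```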

Cloud Monitoring#

Cloud Monitoring (formerly Stackdriver) collects metrics, logs, and traces from GCP resources. It integrates with Cloud Logging for log-based metrics and alerting.

# List available metric types for Compute Engine
gcloud monitoring metrics-descriptors list \
  --filter='metric.type = starts_with("compute.googleapis.com/instance/cpu")'

# Create an alerting policy from a JSON file
gcloud monitoring policies create --policy-from-file=alert-policy.json

# List active alerting policies
gcloud monitoring policies list --format="table(name,displayName,enabled)"

Query logs using Cloud Logging:

# Recent error logs
gcloud logging read 'severity>=ERROR AND resource.type="gce_instance"' \
  --limit=50 \
  --format="table(timestamp,severity,textPayload)"

# Logs from a specific GKE container
gcloud logging read 'resource.type="k8s_container" AND resource.labels.container_name="web-api" AND severity>=WARNING' \
  --limit=20

# Create a log-based metric
gcloud logging metrics create error-count \
  --description="Count of error log entries" \
  --log-filter='severity>=ERROR'

Quick Reference: Service Selection#

Need                   Service                  When to use
Compute (VMs)          Compute Engine           Full OS control, custom software
Containers (managed)   GKE Autopilot            K8s workloads, minimal node management
Containers (control)   GKE Standard             Custom node pools, GPUs, specific configs
Relational DB          Cloud SQL                Managed PostgreSQL, MySQL, SQL Server
Object storage         Cloud Storage            Files, backups, static assets
Identity               IAM + Service Accounts   All access control, Workload Identity
Networking             VPC (global)             Cross-region networking, firewall rules
Monitoring             Cloud Monitoring         Metrics, alerts, uptime checks
Logging                Cloud Logging            Centralized logs, log-based metrics