Temporal High Availability

A single-replica Temporal deployment works for development, but losing any one pod takes part of the workflow engine offline. This guide configures a multi-replica cluster with explicit resource requests and limits, Elasticsearch visibility, and health monitoring.

For the single-replica setup this builds on, see Running Temporal Server on Minikube.

Why HA Matters

| Component | What Breaks When It Goes Down |
|-----------|-------------------------------|
| Frontend | No client can start, signal, query, or cancel workflows. Workers cannot poll. |
| History | Running workflows stall. No state transitions. Timers do not fire. |
| Matching | Tasks queue up but never dispatch. Workflows appear frozen. |
| Worker | Internal system workflows stop (archival, replication). Application workflows unaffected. |

With multiple replicas, losing a pod triggers a brief rebalance (seconds), not an outage.

HA Architecture

Each service runs as its own Deployment. Frontend, history, and matching run 3 replicas each. Frontend is stateless and load-balances trivially. History partitions workflow state into shards (default 512); when a pod dies, its shards rebalance to the survivors. Matching partitions task queue dispatch across its pods in a similar way. Worker runs Temporal's internal system workflows and needs only 2 replicas.
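To make the rebalancing behaviour concrete, here is a minimal sketch of shard ownership using rendezvous hashing. This is an illustration only: Temporal uses its own membership ring, and the pod names (`history-0` and so on), the FNV hash, and the helper functions are assumptions made for this example. The property it shows is the one described above: when one pod disappears, only that pod's shards change owner.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// score returns a deterministic weight for a (shard, host) pair.
func score(shard int, host string) uint64 {
	h := fnv.New64a()
	fmt.Fprintf(h, "%d/%s", shard, host)
	return h.Sum64()
}

// owner picks the host with the highest score for the shard.
func owner(shard int, hosts []string) string {
	best, bestScore := "", uint64(0)
	for _, host := range hosts {
		if s := score(shard, host); best == "" || s > bestScore {
			best, bestScore = host, s
		}
	}
	return best
}

// assign maps every shard ID in [1, numShards] to an owning host.
func assign(numShards int, hosts []string) map[int]string {
	out := make(map[int]string, numShards)
	for shard := 1; shard <= numShards; shard++ {
		out[shard] = owner(shard, hosts)
	}
	return out
}

func main() {
	before := assign(512, []string{"history-0", "history-1", "history-2"})
	// Hypothetical failure: history-1 is gone.
	after := assign(512, []string{"history-0", "history-2"})

	moved := 0
	for shard, host := range before {
		if after[shard] != host {
			moved++
		}
	}
	fmt.Printf("shards that changed owner after losing one pod: %d of 512\n", moved)
}
```

Shards owned by the surviving pods stay where they are, which is why the disruption is limited to the roughly one third of shards that lived on the lost pod.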

HA Helm Values

# values-temporal-ha.yaml
server:
  config:
    persistence:
      default:
        driver: sql
        sql:
          driver: postgres12
          host: temporal-ha-postgresql
          port: 5432
          database: temporal
          user: postgres
          password: temporal
          maxConns: 40
      visibility:
        driver: sql
        sql:
          driver: postgres12
          host: temporal-ha-postgresql
          port: 5432
          database: temporal_visibility
          user: postgres
          password: temporal
          maxConns: 20
    numHistoryShards: 512

  frontend:
    replicaCount: 3
    resources:
      requests: { cpu: 500m, memory: 512Mi }
      limits: { cpu: "1", memory: 1Gi }
  history:
    replicaCount: 3
    resources:
      requests: { cpu: 500m, memory: 1Gi }
      limits: { cpu: "2", memory: 2Gi }
  matching:
    replicaCount: 3
    resources:
      requests: { cpu: 250m, memory: 256Mi }
      limits: { cpu: "1", memory: 512Mi }
  worker:
    replicaCount: 2
    resources:
      requests: { cpu: 250m, memory: 256Mi }
      limits: { cpu: 500m, memory: 512Mi }

cassandra: { enabled: false }
mysql: { enabled: false }
postgresql: { enabled: false }
elasticsearch: { enabled: false }
schema: { setup: { enabled: true }, update: { enabled: true } }
web:
  replicaCount: 2
  service: { type: ClusterIP, port: 8080 }

Install or upgrade the release with these values:

helm upgrade --install temporal temporal/temporal \
  --namespace temporal -f values-temporal-ha.yaml --timeout 600s

PostgreSQL for HA

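The 11 server replicas (3 frontend, 3 history, 3 matching, 2 worker) each have a default-store pool capped at maxConns: 40, so Temporal can open up to 440 connections for the default store alone. While visibility also runs on PostgreSQL, its pool (maxConns: 20) can add up to 220 more, for 660 in the worst case. PostgreSQL defaults to max_connections = 100. The setting below leaves headroom over the 440 default-store connections once visibility moves to Elasticsearch; raise it if visibility stays on SQL: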

primary:
  extendedConfiguration: |
    max_connections = 600
    shared_buffers = 512MB
    effective_cache_size = 1536MB
  resources:
    requests: { cpu: "1", memory: 2Gi }
    limits: { cpu: "2", memory: 4Gi }
  persistence:
    size: 20Gi

For high-throughput clusters, deploy PgBouncer between Temporal and PostgreSQL to pool connections. At minimum, configure automated pg_dump backups: Temporal’s PostgreSQL database is the system of record for all running workflows.
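The connection budget for this section can be checked with a few lines of arithmetic. The sketch below assumes every server pod opens each of its SQL pools up to the configured maxConns, which is an upper bound rather than typical usage; the `pool` type and `maxConnections` function are names invented for this example.

```go
package main

import "fmt"

// pool describes one SQL connection pool configured on every Temporal server pod.
type pool struct {
	name     string
	maxConns int
}

// maxConnections returns the worst-case number of PostgreSQL connections
// when each of the replicas fills every pool.
func maxConnections(replicas int, pools []pool) int {
	total := 0
	for _, p := range pools {
		total += replicas * p.maxConns
	}
	return total
}

func main() {
	const replicas = 3 + 3 + 3 + 2 // frontend + history + matching + worker

	defaultOnly := []pool{{"default", 40}}
	withSQLVisibility := []pool{{"default", 40}, {"visibility", 20}}

	fmt.Println("default store only:", maxConnections(replicas, defaultOnly))
	fmt.Println("with SQL visibility:", maxConnections(replicas, withSQLVisibility))
}
```

Comparing both results against max_connections shows whether the PostgreSQL setting still has headroom for the visibility configuration in use.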

Elasticsearch Visibility

SQL-based visibility works for small deployments but struggles with complex queries. Elasticsearch provides indexed custom search attributes and fast filtering.

Enable it by updating the Temporal values:

server:
  config:
    persistence:
      visibility:
        driver: elasticsearch
        elasticsearch:
          version: v7
          url: { scheme: http, host: "temporal-elasticsearch:9200" }
          indices: { visibility: temporal_visibility_v1 }

Register custom search attributes to make workflows queryable by business fields:

temporal operator search-attribute create \
  --namespace default --name CustomerId --type Keyword

temporal operator search-attribute create \
  --namespace default --name OrderAmount --type Double

Set them from workflow code:

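import "go.temporal.io/sdk/workflow"

// OrderWorkflow tags the execution with searchable business fields.
func OrderWorkflow(ctx workflow.Context, order Order) error {
    // Return indexing failures instead of silently discarding them.
    if err := workflow.UpsertSearchAttributes(ctx, map[string]interface{}{
        "CustomerId":  order.CustomerID,
        "OrderAmount": order.Amount,
    }); err != nil {
        return err
    }
    // ... workflow logic
    return nil
}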

Query with the CLI:

temporal workflow list \
  --query 'CustomerId = "cust-123" AND OrderAmount > 100.0'
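When the same filter is built in application code, the customer ID usually comes from user input. The sketch below assembles the query string shown above and refuses IDs that contain quote or backslash characters rather than trying to escape them. The function name and the rejection policy are assumptions for this example, not part of any Temporal API.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// buildOrderQuery returns a visibility query matching one customer's
// workflows whose OrderAmount exceeds minAmount.
func buildOrderQuery(customerID string, minAmount float64) (string, error) {
	// Reject characters that could terminate the string literal early.
	if strings.ContainsAny(customerID, `"'\`) {
		return "", errors.New("customer ID must not contain quotes or backslashes")
	}
	amount := strconv.FormatFloat(minAmount, 'f', -1, 64)
	return fmt.Sprintf(`CustomerId = "%s" AND OrderAmount > %s`, customerID, amount), nil
}

func main() {
	q, err := buildOrderQuery("cust-123", 100.5)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(q)
}
```

The resulting string can be passed to `temporal workflow list --query` or to an SDK list call.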

Health Monitoring

Temporal exposes Prometheus metrics on port 9090. The critical ones:

| Metric | Meaning |
|--------|---------|
| temporal_persistence_latency | Database response time. Spikes indicate PostgreSQL issues. |
| schedule_to_start_latency | Time from task creation to worker pickup. High means workers cannot keep up. |
| persistence_errors | Database errors. Any sustained increase needs investigation. |
| history_size | Workflow event count. Histories above 50K events impact performance. |

Alert on these conditions:

groups:
- name: temporal
  rules:
  - alert: TemporalPersistenceLatencyHigh
    expr: histogram_quantile(0.99, rate(temporal_persistence_latency_bucket[5m])) > 1
    for: 5m
    annotations:
      summary: "Temporal persistence p99 above 1 second"
  - alert: TemporalScheduleToStartHigh
    expr: histogram_quantile(0.99, rate(schedule_to_start_latency_bucket[5m])) > 30
    for: 5m
    annotations:
      summary: "Tasks waiting 30s+ for worker pickup"
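Both rules rely on histogram_quantile, which estimates a percentile from cumulative bucket counts by interpolating linearly inside the bucket that contains the target rank. The sketch below reproduces that idea in simplified form; the bucket boundaries and counts in `sampleBuckets` are made-up data, and the function omits edge cases that Prometheus handles.

```go
package main

import (
	"fmt"
	"math"
)

// bucket is one cumulative histogram bucket: count observations <= upperBound.
type bucket struct {
	upperBound float64
	count      float64
}

// quantile estimates the q-quantile from cumulative buckets sorted by
// upperBound. The last bucket must be the +Inf bucket.
func quantile(q float64, buckets []bucket) float64 {
	if len(buckets) < 2 {
		return math.NaN()
	}
	total := buckets[len(buckets)-1].count
	if total == 0 {
		return math.NaN()
	}
	rank := q * total
	for i, b := range buckets {
		if b.count < rank {
			continue
		}
		if i == len(buckets)-1 {
			// Rank falls in the +Inf bucket: report the highest finite bound.
			return buckets[i-1].upperBound
		}
		lower, below := 0.0, 0.0
		if i > 0 {
			lower, below = buckets[i-1].upperBound, buckets[i-1].count
		}
		if b.count == below {
			return b.upperBound
		}
		// Interpolate linearly between the bucket's lower and upper bound.
		return lower + (b.upperBound-lower)*(rank-below)/(b.count-below)
	}
	return math.NaN()
}

// sampleBuckets returns hypothetical persistence latency buckets in seconds.
func sampleBuckets() []bucket {
	return []bucket{
		{0.1, 50}, {0.5, 90}, {1, 98}, {5, 100}, {math.Inf(1), 100},
	}
}

func main() {
	p99 := quantile(0.99, sampleBuckets())
	fmt.Printf("estimated p99: %.2fs, alert threshold exceeded: %v\n", p99, p99 > 1)
}
```

Because the estimate is interpolated, its accuracy depends on how closely the bucket boundaries bracket the alert threshold.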

Scaling Guidelines

Scale frontend when gRPC latency rises (stateless, simple to add). Scale history when workflow task latency grows or shard rebalancing is slow. Scale matching when schedule_to_start_latency is high but workers are idle.

numHistoryShards is fixed at cluster creation and cannot be changed afterwards without a data migration. Choose carefully: 512 for most production workloads, 1024 for high-throughput clusters (>10K concurrent workflows per namespace), 128 for development.

Comparison: Standard vs HA

| Dimension | Standard (Dev) | HA (Production) |
|-----------|----------------|-----------------|
| Service replicas | 1 each | 2-3 each |
| CPU total | ~1.5 cores | ~6 cores |
| Memory total | ~2 GB | ~10 GB |
| Visibility | SQL-based | Elasticsearch |
| Pod disruption tolerance | None | Can lose 1 pod per service |
| Recovery time | Minutes (pod restart) | Seconds (shard rebalance) |

Next Steps