# Docker Compose Validation Stacks
Docker Compose validates multi-service architectures without Kubernetes overhead. It answers the question: do these services actually work together? Containers start, connect, and communicate – or they fail, giving you fast feedback before you push to a cluster.
This article provides complete Compose stacks for four common validation scenarios. Each includes the full docker-compose.yml, health check scripts, and teardown procedures. The pattern for using them is always the same: clone the template, customize for your services, bring it up, validate, capture results, bring it down.
## Stack 1: Web Application + PostgreSQL + Redis
The most common stack. A web application that uses PostgreSQL for persistence and Redis for caching or session storage. Validates database connectivity, cache behavior, and the application’s startup sequence.
```yaml
# docker-compose-web-stack.yml
version: "3.8"
services:
  app:
    image: ${APP_IMAGE:-my-app:latest}
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "8080:8080"
    environment:
      DATABASE_URL: "postgresql://appuser:${POSTGRES_PASSWORD:-apppass}@postgres:5432/appdb"
      REDIS_URL: "redis://redis:6379/0"
      APP_ENV: "test"
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    read_only: true
    tmpfs:
      - /tmp
    cap_drop:
      - ALL
    security_opt:
      - no-new-privileges:true
    deploy:
      resources:
        limits:
          memory: 512M
          cpus: "1.0"
    healthcheck:
      test: ["CMD", "curl", "-sf", "http://localhost:8080/health"]
      interval: 5s
      timeout: 3s
      retries: 10
      start_period: 10s
    networks:
      - frontend
      - backend
  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: appuser
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-apppass}
      POSTGRES_DB: appdb
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql:ro
    read_only: true
    tmpfs:
      - /tmp
      - /var/run/postgresql
    cap_drop:
      - ALL
    cap_add:
      - CHOWN
      - DAC_OVERRIDE
      - FOWNER
      - SETGID
      - SETUID
    security_opt:
      - no-new-privileges:true
    deploy:
      resources:
        limits:
          memory: 512M
          cpus: "1.0"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U appuser -d appdb"]
      interval: 5s
      timeout: 3s
      retries: 10
    networks:
      - backend
  redis:
    image: redis:7-alpine
    read_only: true
    tmpfs:
      - /tmp
      - /data
    cap_drop:
      - ALL
    security_opt:
      - no-new-privileges:true
    deploy:
      resources:
        limits:
          memory: 256M
          cpus: "0.5"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 10
    networks:
      - backend
networks:
  frontend:
  backend:
    internal: true
volumes:
  postgres_data:
```

Security patterns applied: read-only root filesystems with tmpfs for writable directories, all capabilities dropped (PostgreSQL gets back only the minimum it needs), no-new-privileges, resource limits on every container, an internal-only backend network (PostgreSQL and Redis have no internet access), init scripts mounted read-only, and passwords sourced from environment variables. See Securing Docker Validation Templates for the full security reference.
Health check script:
```bash
#!/bin/bash
# validate-web-stack.sh
set -euo pipefail

COMPOSE_FILE="docker-compose-web-stack.yml"
PROJECT_NAME="web-validation-$$"

cleanup() {
  echo "--- Tearing down ---"
  docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" down -v --remove-orphans 2>/dev/null || true
}
trap cleanup EXIT

echo "=== Starting stack ==="
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" up -d --build --wait

echo "=== Checking service health ==="

# Verify PostgreSQL
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" exec -T postgres \
  psql -U appuser -d appdb -c "SELECT 1 AS health_check;"
echo "PostgreSQL: healthy"

# Verify Redis
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" exec -T redis \
  redis-cli ping
echo "Redis: healthy"

# Verify application
for i in $(seq 1 30); do
  if curl -sf http://localhost:8080/health > /dev/null 2>&1; then
    echo "Application: healthy"
    break
  fi
  if [ "$i" -eq 30 ]; then
    echo "ERROR: Application health check failed after 30 attempts"
    docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" logs app
    exit 1
  fi
  sleep 2
done

# Test database connectivity through the app. Note: no -f here, because
# -f makes curl exit nonzero on HTTP errors, which would abort the script
# (set -e) before we can report the status code.
HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:8080/health)
if [ "${HTTP_STATUS}" -ne 200 ]; then
  echo "ERROR: App returned HTTP ${HTTP_STATUS}"
  exit 1
fi

echo "=== All checks passed ==="
```

What this validates: the application starts, connects to PostgreSQL, connects to Redis, and responds to health checks. The `depends_on` with `condition: service_healthy` ensures the startup order is correct.
What this misses: Production-like connection pooling, SSL/TLS between services, persistent volume behavior across restarts, and performance under load.
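One of those gaps, volume behavior across restarts, is cheap to probe. A sketch, assuming the stack above is already up and that creating a throwaway marker table (`persistence_probe`, a name invented here) is acceptable in the validation database:

```shell
#!/bin/bash
# Sketch: check that PostgreSQL data survives container recreation.
# "persistence_probe" is a hypothetical marker table, not part of the
# stack above; run this only against throwaway validation data.
set -euo pipefail

COMPOSE_FILE="${COMPOSE_FILE:-docker-compose-web-stack.yml}"
PROJECT_NAME="${PROJECT_NAME:-web-validation-$$}"

check_volume_persistence() {
  local dc=(docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}")
  # Write a marker row...
  "${dc[@]}" exec -T postgres psql -U appuser -d appdb -c \
    "CREATE TABLE IF NOT EXISTS persistence_probe (id int); INSERT INTO persistence_probe VALUES (1);"
  # ...recreate the container (the named volume outlives it)...
  "${dc[@]}" up -d --force-recreate --wait postgres
  # ...and confirm the row is still there.
  "${dc[@]}" exec -T postgres psql -U appuser -d appdb \
    -tAc "SELECT count(*) FROM persistence_probe;" | grep -q '^[1-9]'
}

# Usage (after the stack is up):
#   check_volume_persistence && echo "volume persistence: OK"
```

Note that `docker compose restart` alone would prove nothing here: the container filesystem survives a restart, so the row would come back even without a volume. Forcing a recreation is what actually exercises the named volume.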
## Stack 2: Microservices (3 Services + Message Queue + Database)
A realistic microservices topology. Three services communicate through a message queue (RabbitMQ), with a shared database for the services that need persistence. Validates inter-service communication, queue connectivity, and service independence.
```yaml
# docker-compose-microservices.yml
version: "3.8"
services:
  api-gateway:
    image: ${API_GATEWAY_IMAGE:-api-gateway:latest}
    build:
      context: ./services/api-gateway
    ports:
      - "8080:8080"
    environment:
      ORDER_SERVICE_URL: "http://order-service:8081"
      USER_SERVICE_URL: "http://user-service:8082"
    depends_on:
      order-service:
        condition: service_healthy
      user-service:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-sf", "http://localhost:8080/health"]
      interval: 5s
      timeout: 3s
      retries: 10
      start_period: 10s
  order-service:
    image: ${ORDER_SERVICE_IMAGE:-order-service:latest}
    build:
      context: ./services/order-service
    ports:
      - "8081:8081"
    environment:
      DATABASE_URL: "postgresql://orders:orderspass@postgres:5432/ordersdb"
      RABBITMQ_URL: "amqp://guest:guest@rabbitmq:5672/"
      QUEUE_NAME: "order-events"
    depends_on:
      postgres:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-sf", "http://localhost:8081/health"]
      interval: 5s
      timeout: 3s
      retries: 10
      start_period: 10s
  user-service:
    image: ${USER_SERVICE_IMAGE:-user-service:latest}
    build:
      context: ./services/user-service
    ports:
      - "8082:8082"
    environment:
      DATABASE_URL: "postgresql://users:userspass@postgres:5432/usersdb"
      RABBITMQ_URL: "amqp://guest:guest@rabbitmq:5672/"
      QUEUE_NAME: "user-events"
    depends_on:
      postgres:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-sf", "http://localhost:8082/health"]
      interval: 5s
      timeout: 3s
      retries: 10
      start_period: 10s
  notification-worker:
    image: ${NOTIFICATION_IMAGE:-notification-worker:latest}
    build:
      context: ./services/notification-worker
    environment:
      RABBITMQ_URL: "amqp://guest:guest@rabbitmq:5672/"
      LISTEN_QUEUES: "order-events,user-events"
    depends_on:
      rabbitmq:
        condition: service_healthy
    # Workers may not have HTTP health checks -- use a different strategy
    healthcheck:
      test: ["CMD-SHELL", "test -f /tmp/worker-healthy || exit 1"]
      interval: 10s
      timeout: 3s
      retries: 5
      start_period: 15s
  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: admin
      POSTGRES_PASSWORD: adminpass
    volumes:
      - ./init-databases.sql:/docker-entrypoint-initdb.d/01-init.sql
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U admin"]
      interval: 5s
      timeout: 3s
      retries: 10
  rabbitmq:
    image: rabbitmq:3.13-management-alpine
    ports:
      - "5672:5672"
      - "15672:15672"
    environment:
      RABBITMQ_DEFAULT_USER: guest
      RABBITMQ_DEFAULT_PASS: guest
    healthcheck:
      test: ["CMD", "rabbitmq-diagnostics", "-q", "check_running"]
      interval: 10s
      timeout: 5s
      retries: 10
      start_period: 20s
```

The database initialization script creates separate databases for each service:
```sql
-- init-databases.sql
CREATE USER orders WITH PASSWORD 'orderspass';
CREATE DATABASE ordersdb OWNER orders;
CREATE USER users WITH PASSWORD 'userspass';
CREATE DATABASE usersdb OWNER users;
```

Health check script:
```bash
#!/bin/bash
# validate-microservices.sh
set -euo pipefail

COMPOSE_FILE="docker-compose-microservices.yml"
PROJECT_NAME="micro-validation-$$"

cleanup() {
  docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" down -v --remove-orphans 2>/dev/null || true
}
trap cleanup EXIT

echo "=== Starting microservices stack ==="
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" up -d --build --wait --wait-timeout 120

echo "=== Verifying infrastructure services ==="
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" exec -T postgres \
  psql -U admin -c "SELECT datname FROM pg_database WHERE datname IN ('ordersdb', 'usersdb');"

# Verify RabbitMQ management API
curl -sf -u guest:guest http://localhost:15672/api/overview > /dev/null
echo "RabbitMQ: healthy"

echo "=== Verifying application services ==="
for svc in "api-gateway:8080" "order-service:8081" "user-service:8082"; do
  NAME="${svc%%:*}"
  PORT="${svc##*:}"
  if curl -sf "http://localhost:${PORT}/health" > /dev/null; then
    echo "${NAME}: healthy"
  else
    echo "ERROR: ${NAME} not responding"
    docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" logs "${NAME}"
    exit 1
  fi
done

echo "=== Testing inter-service communication ==="
# Create an order through the gateway, which should call order-service
RESPONSE=$(curl -sf -X POST http://localhost:8080/api/orders \
  -H "Content-Type: application/json" \
  -d '{"user_id": "test-user", "items": [{"sku": "TEST-001", "qty": 1}]}')
echo "Order creation response: ${RESPONSE}"

echo "=== Checking message queue ==="
# Verify that the order-events queue has been created and has activity.
# Pipe only the raw JSON to python3; a "Queue status:" prefix would
# break json.load.
QUEUE_INFO=$(curl -sf -u guest:guest http://localhost:15672/api/queues/%2F/order-events)
echo "${QUEUE_INFO}" | python3 -c "import sys,json; q=json.load(sys.stdin); print(f'Messages: {q.get(\"messages\",0)}, Consumers: {q.get(\"consumers\",0)}')" 2>/dev/null \
  || echo "Queue status: ${QUEUE_INFO}"

echo "=== ALL MICROSERVICE CHECKS PASSED ==="
```

What this validates: service startup order, database-per-service isolation, message queue connectivity, inter-service HTTP communication through the gateway, and worker consumer connectivity.
What this misses: Service mesh behavior, circuit breakers under failure conditions, service discovery beyond Docker DNS, and behavior under partial failures (one service down while others continue).
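The partial-failure gap can at least be probed by hand: stop one service and observe what the gateway does. A sketch, assuming the stack above is running; the `/api/users` route is an assumption (the stack only demonstrates `/api/orders`), so adjust the paths to your actual API:

```shell
#!/bin/bash
# Sketch: probe gateway behavior while one backend service is down.
# Route paths are illustrative assumptions, not part of the stack above.
set -euo pipefail

COMPOSE_FILE="${COMPOSE_FILE:-docker-compose-microservices.yml}"
PROJECT_NAME="${PROJECT_NAME:-micro-validation-$$}"

probe_partial_failure() {
  local dc=(docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}")
  "${dc[@]}" stop user-service

  # The gateway should still serve order routes while user routes fail
  # fast; record both status codes (|| true tolerates connection errors).
  local order_status user_status
  order_status=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:8080/api/orders || true)
  user_status=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:8080/api/users || true)
  echo "orders: HTTP ${order_status}, users: HTTP ${user_status}"

  "${dc[@]}" start user-service
}

# Usage (after the stack is up):
#   probe_partial_failure
```

This does not replace proper chaos testing, but it catches the worst case cheaply: a gateway that returns 500 for everything when a single dependency is down.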
## Stack 3: Full Observability (App + Prometheus + Grafana + Loki)
Validates that an application exposes metrics correctly, that Prometheus scrapes them, that Grafana can query Prometheus, and that logs flow through Loki. Use this when validating observability instrumentation changes.
```yaml
# docker-compose-observability.yml
version: "3.8"
services:
  app:
    image: ${APP_IMAGE:-my-app:latest}
    build:
      context: .
    ports:
      - "8080:8080"
    environment:
      METRICS_ENABLED: "true"
      LOG_FORMAT: "json"
    logging:
      driver: loki
      options:
        loki-url: "http://localhost:3100/loki/api/v1/push"
        loki-batch-size: "100"
        loki-retries: "3"
        loki-timeout: "5s"
    healthcheck:
      test: ["CMD", "curl", "-sf", "http://localhost:8080/health"]
      interval: 5s
      timeout: 3s
      retries: 10
      start_period: 10s
  prometheus:
    image: prom/prometheus:v2.51.0
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    command:
      - "--config.file=/etc/prometheus/prometheus.yml"
      - "--storage.tsdb.retention.time=1h"
      - "--web.enable-lifecycle"
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:9090/-/healthy"]
      interval: 5s
      timeout: 3s
      retries: 10
  grafana:
    image: grafana/grafana:10.4.0
    ports:
      - "3000:3000"
    environment:
      GF_SECURITY_ADMIN_PASSWORD: admin
      GF_AUTH_ANONYMOUS_ENABLED: "true"
      GF_AUTH_ANONYMOUS_ORG_ROLE: Admin
    volumes:
      - ./grafana/provisioning:/etc/grafana/provisioning
    depends_on:
      prometheus:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-sf", "http://localhost:3000/api/health"]
      interval: 5s
      timeout: 3s
      retries: 10
  loki:
    image: grafana/loki:2.9.5
    ports:
      - "3100:3100"
    command: -config.file=/etc/loki/local-config.yaml
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:3100/ready"]
      interval: 10s
      timeout: 5s
      retries: 10
      start_period: 15s
```

Prometheus configuration:
```yaml
# prometheus.yml
global:
  scrape_interval: 5s
  evaluation_interval: 5s
scrape_configs:
  - job_name: "app"
    static_configs:
      - targets: ["app:8080"]
    metrics_path: "/metrics"
    scrape_interval: 5s
```

Grafana datasource provisioning:
```yaml
# grafana/provisioning/datasources/datasources.yml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100
```

Health check script:
```bash
#!/bin/bash
# validate-observability.sh
set -euo pipefail

COMPOSE_FILE="docker-compose-observability.yml"
PROJECT_NAME="obs-validation-$$"

cleanup() {
  docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" down -v --remove-orphans 2>/dev/null || true
}
trap cleanup EXIT

# The Loki Docker logging driver must be installed on the host
docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions 2>/dev/null || true

echo "=== Starting observability stack ==="
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" up -d --build --wait --wait-timeout 120

echo "=== Generating traffic for metrics ==="
for i in $(seq 1 20); do
  curl -sf http://localhost:8080/health > /dev/null 2>&1 || true
  curl -sf http://localhost:8080/api/test > /dev/null 2>&1 || true
done
echo "Traffic generated. Waiting for scrape interval..."
sleep 10

echo "=== Checking Prometheus targets ==="
TARGETS=$(curl -sf http://localhost:9090/api/v1/targets)
UP_COUNT=$(echo "${TARGETS}" | python3 -c "import sys,json; t=json.load(sys.stdin); print(sum(1 for a in t['data']['activeTargets'] if a['health']=='up'))" 2>/dev/null || echo "0")
echo "Active healthy targets: ${UP_COUNT}"
if [ "${UP_COUNT}" -lt 1 ]; then
  echo "ERROR: No healthy Prometheus targets"
  echo "${TARGETS}" | python3 -m json.tool 2>/dev/null || echo "${TARGETS}"
  exit 1
fi

echo "=== Querying application metrics ==="
# -G sends the urlencoded PromQL as a GET parameter; putting braces
# directly in the URL would trip curl's glob parser
METRICS=$(curl -sf -G "http://localhost:9090/api/v1/query" --data-urlencode 'query=up{job="app"}')
echo "Metric 'up' for app: ${METRICS}"

# Check for application-specific metrics
APP_METRICS=$(curl -sf "http://localhost:9090/api/v1/label/__name__/values" | python3 -c "import sys,json; names=json.load(sys.stdin)['data']; [print(n) for n in names if not n.startswith('go_') and not n.startswith('process_') and not n.startswith('promhttp_')]" 2>/dev/null || true)
if [ -n "${APP_METRICS}" ]; then
  echo "Application-specific metrics found:"
  echo "${APP_METRICS}"
else
  echo "WARNING: No application-specific metrics found (only Go runtime and process metrics)"
fi

echo "=== Checking Grafana datasources ==="
DATASOURCES=$(curl -sf http://localhost:3000/api/datasources)
echo "Configured datasources: ${DATASOURCES}"

echo "=== Checking Loki ==="
LOKI_READY=$(curl -sf http://localhost:3100/ready)
echo "Loki status: ${LOKI_READY}"

# Query Loki for logs from the app (BSD date syntax first, GNU fallback)
LOKI_LOGS=$(curl -sf -G "http://localhost:3100/loki/api/v1/query_range" \
  --data-urlencode "query={compose_service=\"app\"}" \
  --data-urlencode "start=$(date -u -v-5M +%s 2>/dev/null || date -u -d '5 minutes ago' +%s)000000000" \
  --data-urlencode "end=$(date -u +%s)000000000" \
  --data-urlencode "limit=5" 2>/dev/null || echo "{}")
echo "Loki log query result: received"

echo "=== ALL OBSERVABILITY CHECKS PASSED ==="
echo "Prometheus: scraping app metrics"
echo "Grafana: datasources configured (Prometheus + Loki)"
echo "Loki: receiving logs from app container"
```

What this validates: metrics endpoint exposure, Prometheus scrape configuration, Grafana datasource provisioning, Loki log ingestion, and the complete observability pipeline from application to dashboard.
What this misses: Alert routing (Alertmanager), dashboard correctness (just checks that datasources exist), long-term storage retention policies, and high-cardinality metric behavior.
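The dashboard-correctness gap can be narrowed slightly by running the PromQL behind a panel and asserting it returns at least one series. A sketch; the `rate(http_requests_total[1m])` in the usage line is a placeholder for whatever your panels actually query:

```shell
#!/bin/bash
# Sketch: assert that a dashboard's underlying PromQL returns data.
# The metric name in the usage example is a placeholder assumption.
set -euo pipefail

check_panel_query() {
  local promql="${1:?usage: check_panel_query <promql>}"
  local count
  # POST the urlencoded query to the Prometheus instant-query API and
  # count the returned series.
  count=$(curl -sf "http://localhost:9090/api/v1/query" \
    --data-urlencode "query=${promql}" \
    | python3 -c 'import sys, json; print(len(json.load(sys.stdin)["data"]["result"]))')
  if [ "${count}" -lt 1 ]; then
    echo "ERROR: query returned no series: ${promql}"
    return 1
  fi
  echo "query OK (${count} series): ${promql}"
}

# Usage (with the stack up and traffic generated):
#   check_panel_query 'rate(http_requests_total[1m])'
```

This catches the common failure where a datasource exists and targets are up, yet a renamed metric has silently emptied every panel.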
## Stack 4: Database Migration Testing
Validates database migrations across multiple PostgreSQL versions. Runs the migration against each version to ensure compatibility. Critical for upgrades and for catching version-specific SQL syntax issues.
```yaml
# docker-compose-migration-test.yml
version: "3.8"
services:
  postgres14:
    image: postgres:14-alpine
    environment:
      POSTGRES_USER: migtest
      POSTGRES_PASSWORD: migtest
      POSTGRES_DB: migtest
    ports:
      - "5414:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U migtest"]
      interval: 5s
      timeout: 3s
      retries: 10
  postgres15:
    image: postgres:15-alpine
    environment:
      POSTGRES_USER: migtest
      POSTGRES_PASSWORD: migtest
      POSTGRES_DB: migtest
    ports:
      - "5415:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U migtest"]
      interval: 5s
      timeout: 3s
      retries: 10
  postgres16:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: migtest
      POSTGRES_PASSWORD: migtest
      POSTGRES_DB: migtest
    ports:
      - "5416:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U migtest"]
      interval: 5s
      timeout: 3s
      retries: 10
  postgres17:
    image: postgres:17-alpine
    environment:
      POSTGRES_USER: migtest
      POSTGRES_PASSWORD: migtest
      POSTGRES_DB: migtest
    ports:
      - "5417:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U migtest"]
      interval: 5s
      timeout: 3s
      retries: 10
```

Migration validation script:
```bash
#!/bin/bash
# validate-migrations.sh
# Usage: ./validate-migrations.sh <migrations-dir>
set -euo pipefail

MIGRATIONS_DIR="${1:?Usage: $0 <migrations-dir>}"
COMPOSE_FILE="docker-compose-migration-test.yml"
PROJECT_NAME="mig-validation-$$"
DB_USER="migtest"
DB_PASS="migtest"
DB_NAME="migtest"
VERSIONS=("14:5414" "15:5415" "16:5416" "17:5417")
FAILED=()

cleanup() {
  docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" down -v --remove-orphans 2>/dev/null || true
}
trap cleanup EXIT

echo "=== Starting all PostgreSQL versions ==="
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" up -d --wait --wait-timeout 60

echo "=== Running migrations against each version ==="
for version_port in "${VERSIONS[@]}"; do
  VERSION="${version_port%%:*}"
  PORT="${version_port##*:}"
  echo ""
  echo "--- PostgreSQL ${VERSION} (port ${PORT}) ---"
  # The glob expands in sorted order, so migrations apply in sequence
  # (no need to parse ls output)
  for migration in "${MIGRATIONS_DIR}"/*.sql; do
    [ -e "${migration}" ] || continue
    BASENAME=$(basename "${migration}")
    echo "  Applying: ${BASENAME}"
    if PGPASSWORD="${DB_PASS}" psql -h localhost -p "${PORT}" -U "${DB_USER}" -d "${DB_NAME}" \
      -f "${migration}" -v ON_ERROR_STOP=1 2>&1; then
      echo "  OK: ${BASENAME}"
    else
      echo "  FAILED: ${BASENAME} on PostgreSQL ${VERSION}"
      FAILED+=("pg${VERSION}:${BASENAME}")
    fi
  done
  # Verify the final schema
  echo "  Verifying schema..."
  TABLES=$(PGPASSWORD="${DB_PASS}" psql -h localhost -p "${PORT}" -U "${DB_USER}" -d "${DB_NAME}" \
    -t -c "SELECT tablename FROM pg_tables WHERE schemaname = 'public' ORDER BY tablename;")
  echo "  Tables: $(echo "${TABLES}" | tr '\n' ',')"
done

echo ""
echo "=== MIGRATION RESULTS ==="
if [ ${#FAILED[@]} -eq 0 ]; then
  echo "All migrations passed on all PostgreSQL versions."
else
  echo "FAILURES:"
  for f in "${FAILED[@]}"; do
    echo "  - ${f}"
  done
  exit 1
fi
```

What this validates: migration SQL compatibility across PostgreSQL major versions, schema creation order, constraint compatibility, and function/trigger syntax differences between versions.
What this misses: Performance characteristics of migrations on large datasets, locking behavior under concurrent load, and logical replication compatibility.
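Locking behavior can at least be smoke-tested by applying a migration with `lock_timeout` set, so a statement that would block live traffic for long fails loudly instead of hanging. A sketch against the stack above; the 2s threshold and the migration path in the usage line are illustrative assumptions:

```shell
#!/bin/bash
# Sketch: apply a migration with a lock_timeout as a crude proxy for
# "would this block live traffic?". Threshold is an illustrative choice.
set -euo pipefail

apply_with_lock_timeout() {
  local port="${1:?port}" migration="${2:?migration file}"
  # SET persists for the session, so it applies to the whole -f file;
  # any statement waiting >2s on a lock aborts the migration.
  PGPASSWORD=migtest psql -h localhost -p "${port}" -U migtest -d migtest \
    -v ON_ERROR_STOP=1 \
    -c "SET lock_timeout = '2s';" \
    -f "${migration}"
}

# Usage (port 5416 is the PostgreSQL 16 instance from the stack above;
# the migration filename here is hypothetical):
#   apply_with_lock_timeout 5416 migrations/0001_create_tables.sql
```

This only reveals lock waits that occur against concurrent activity you generate yourself; on an otherwise idle validation database most migrations will acquire their locks instantly.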
## The Agent Workflow
When an agent uses these stacks, it should follow this sequence:
1. **Select the template.** Based on what is being validated, choose the closest stack. If validating a web app change, use Stack 1. If validating a migration, use Stack 4.
2. **Customize the template.** Replace image names, environment variables, port mappings, and volume mounts to match the actual services. Do not modify the health check patterns unless the service has a different health endpoint.
3. **Bring it up.** Run `docker compose up -d --build --wait`. The `--wait` flag blocks until all health checks pass or the timeout expires.
4. **Run validation checks.** Execute the health check script or run custom verification commands. Capture all output.
5. **Capture results.** Before tearing down, save logs if any checks failed: `docker compose logs > validation-results.log`.
6. **Tear down.** Always tear down, even on success. Use `docker compose down -v --remove-orphans`. The `-v` flag removes volumes so the next run starts clean. The `--remove-orphans` flag catches containers from modified compose files.
7. **Report.** State what was validated, what passed, and what the stack cannot test. Reference the fidelity limitations documented with each stack above.
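The sequence can be sketched as a single wrapper script. The filename and argument names are illustrative; steps 1 and 2 (selecting and customizing a template) happen before it runs:

```shell
#!/bin/bash
# run-validation.sh -- sketch of the workflow as one wrapper script.
# Usage: ./run-validation.sh <compose-file> <validate-script>
# Both arguments are whichever template and check script were selected
# and customized; nothing here is specific to one stack.
set -euo pipefail

cleanup() {
  # Step 6: always tear down, even when validation fails.
  docker compose -f "${STACK_FILE}" -p "${PROJECT}" down -v --remove-orphans || true
}

main() {
  STACK_FILE="$1"
  CHECK_SCRIPT="$2"
  PROJECT="validation-$$"
  trap cleanup EXIT

  # Step 3: bring it up and block until all health checks pass.
  docker compose -f "${STACK_FILE}" -p "${PROJECT}" up -d --build --wait

  # Steps 4-5: run checks; capture logs before teardown if they fail.
  if ! bash "${CHECK_SCRIPT}"; then
    docker compose -f "${STACK_FILE}" -p "${PROJECT}" logs > validation-results.log
    echo "FAILED: logs saved to validation-results.log"
    exit 1
  fi

  # Step 7: report what was validated.
  echo "PASSED: $(basename "${STACK_FILE}") validated"
}

# Steps 1-2 (select and customize the template) happen before this runs.
if [ "$#" -eq 2 ]; then
  main "$@"
fi
```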
The trap-based cleanup in every script ensures teardown happens even when validation fails. This is not optional. Orphaned containers consume resources and can cause port conflicts on subsequent runs.