Docker Compose Validation Stacks
Docker Compose validates multi-service architectures without Kubernetes overhead. It answers the question: do these services actually work together? Containers start, connect, and communicate – or they fail, giving you fast feedback before you push to a cluster.
This article provides complete Compose stacks for four common validation scenarios. Each includes the full docker-compose.yml, health check scripts, and teardown procedures. The pattern for using them is always the same: clone the template, customize for your services, bring it up, validate, capture results, bring it down.
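In shell terms, the shared lifecycle looks roughly like this (a sketch; file and script names are placeholders for whichever stack you pick):
#!/bin/bash
# validation lifecycle (sketch)
cp templates/docker-compose-web-stack.yml docker-compose.yml   # clone the template
vi docker-compose.yml                                          # customize images, env, ports
docker compose up -d --build --wait                            # bring it up
./validate-web-stack.sh                                        # validate
docker compose logs > validation-results.log                   # capture results
docker compose down -v --remove-orphans                        # bring it down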
Stack 1: Web Application + PostgreSQL + Redis
The most common stack. A web application that uses PostgreSQL for persistence and Redis for caching or session storage. Validates database connectivity, cache behavior, and the application’s startup sequence.
# docker-compose-web-stack.yml
version: "3.8"
services:
app:
image: ${APP_IMAGE:-my-app:latest}
build:
context: .
dockerfile: Dockerfile
ports:
- "8080:8080"
environment:
DATABASE_URL: "postgresql://appuser:${POSTGRES_PASSWORD:-apppass}@postgres:5432/appdb"
REDIS_URL: "redis://redis:6379/0"
APP_ENV: "test"
depends_on:
postgres:
condition: service_healthy
redis:
condition: service_healthy
read_only: true
tmpfs:
- /tmp
cap_drop:
- ALL
security_opt:
- no-new-privileges:true
deploy:
resources:
limits:
memory: 512M
cpus: "1.0"
healthcheck:
test: ["CMD", "curl", "-sf", "http://localhost:8080/health"]
interval: 5s
timeout: 3s
retries: 10
start_period: 10s
networks:
- frontend
- backend
postgres:
image: postgres:16-alpine
environment:
POSTGRES_USER: appuser
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-apppass}
POSTGRES_DB: appdb
volumes:
- postgres_data:/var/lib/postgresql/data
- ./init.sql:/docker-entrypoint-initdb.d/init.sql:ro
read_only: true
tmpfs:
- /tmp
- /var/run/postgresql
cap_drop:
- ALL
cap_add:
- CHOWN
- DAC_OVERRIDE
- FOWNER
- SETGID
- SETUID
security_opt:
- no-new-privileges:true
deploy:
resources:
limits:
memory: 512M
cpus: "1.0"
healthcheck:
test: ["CMD-SHELL", "pg_isready -U appuser -d appdb"]
interval: 5s
timeout: 3s
retries: 10
networks:
- backend
redis:
image: redis:7-alpine
read_only: true
tmpfs:
- /tmp
- /data
cap_drop:
- ALL
security_opt:
- no-new-privileges:true
deploy:
resources:
limits:
memory: 256M
cpus: "0.5"
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 5s
timeout: 3s
retries: 10
networks:
- backend
networks:
frontend:
backend:
internal: true
volumes:
  postgres_data:
Security patterns applied: Read-only root filesystems with tmpfs for writable directories, all capabilities dropped (PostgreSQL gets back the minimum it needs), no-new-privileges, resource limits on every container, a backend network that is internal-only (PostgreSQL and Redis have no internet access), init scripts mounted read-only, and passwords sourced from environment variables. See Securing Docker Validation Templates for the full security reference.
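These settings are easy to spot-check once the stack is up. A sketch (assumes the service names above, a shell plus busybox wget inside the images, and the default compose project; add -f/-p flags if you use a custom project name):
#!/bin/bash
# check-hardening.sh (sketch) -- spot-check two of the security settings
set -euo pipefail
# Read-only root filesystem: writing outside a tmpfs mount should fail
if docker compose exec -T app sh -c 'touch /should-fail' 2>/dev/null; then
  echo "ERROR: app root filesystem is writable"; exit 1
fi
echo "app: root filesystem is read-only"
# Internal backend network: postgres should have no route to the internet
if docker compose exec -T postgres sh -c 'wget -q -T 3 -O /dev/null http://example.com' 2>/dev/null; then
  echo "ERROR: postgres can reach the internet"; exit 1
fi
echo "postgres: backend network is internal-only"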
Health check script:
#!/bin/bash
# validate-web-stack.sh
set -euo pipefail
COMPOSE_FILE="docker-compose-web-stack.yml"
PROJECT_NAME="web-validation-$$"
cleanup() {
echo "--- Tearing down ---"
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" down -v --remove-orphans 2>/dev/null || true
}
trap cleanup EXIT
echo "=== Starting stack ==="
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" up -d --build --wait
echo "=== Checking service health ==="
# Verify PostgreSQL
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" exec -T postgres \
psql -U appuser -d appdb -c "SELECT 1 AS health_check;"
echo "PostgreSQL: healthy"
# Verify Redis
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" exec -T redis \
redis-cli ping
echo "Redis: healthy"
# Verify application
for i in $(seq 1 30); do
if curl -sf http://localhost:8080/health > /dev/null 2>&1; then
echo "Application: healthy"
break
fi
if [ "$i" -eq 30 ]; then
echo "ERROR: Application health check failed after 30 attempts"
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" logs app
exit 1
fi
sleep 2
done
# Test database connectivity through the app's health endpoint.
# No -f here: under set -e a curl failure would abort the script
# before the status check below could report the code.
HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:8080/health)
if [ "${HTTP_STATUS}" -ne 200 ]; then
echo "ERROR: App returned HTTP ${HTTP_STATUS}"
exit 1
fi
echo "=== All checks passed ==="What this validates: Application starts, connects to PostgreSQL, connects to Redis, and responds to health checks. The depends_on with condition: service_healthy ensures the startup order is correct.
What this misses: Production-like connection pooling, SSL/TLS between services, persistent volume behavior across restarts, and performance under load.
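The volume gap is cheap to narrow if it matters for your change. A sketch that extends the script above (assumes the same COMPOSE_FILE and PROJECT_NAME variables):
# Extension (sketch): verify data survives a postgres restart
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" exec -T postgres \
  psql -U appuser -d appdb -c "CREATE TABLE IF NOT EXISTS persist_check (id int); INSERT INTO persist_check VALUES (1);"
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" restart postgres
sleep 10  # crude; polling the health check again would be more robust
COUNT=$(docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" exec -T postgres \
  psql -U appuser -d appdb -tAc "SELECT count(*) FROM persist_check;")
[ "${COUNT}" = "1" ] && echo "postgres: data survived restart"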
Stack 2: Microservices (3 Services + Message Queue + Database)
A realistic microservices topology. Three services communicate through a message queue (RabbitMQ), with a shared database for the services that need persistence. Validates inter-service communication, queue connectivity, and service independence.
# docker-compose-microservices.yml
version: "3.8"
services:
api-gateway:
image: ${API_GATEWAY_IMAGE:-api-gateway:latest}
build:
context: ./services/api-gateway
ports:
- "8080:8080"
environment:
ORDER_SERVICE_URL: "http://order-service:8081"
USER_SERVICE_URL: "http://user-service:8082"
depends_on:
order-service:
condition: service_healthy
user-service:
condition: service_healthy
healthcheck:
test: ["CMD", "curl", "-sf", "http://localhost:8080/health"]
interval: 5s
timeout: 3s
retries: 10
start_period: 10s
order-service:
image: ${ORDER_SERVICE_IMAGE:-order-service:latest}
build:
context: ./services/order-service
ports:
- "8081:8081"
environment:
DATABASE_URL: "postgresql://orders:orderspass@postgres:5432/ordersdb"
RABBITMQ_URL: "amqp://guest:guest@rabbitmq:5672/"
QUEUE_NAME: "order-events"
depends_on:
postgres:
condition: service_healthy
rabbitmq:
condition: service_healthy
healthcheck:
test: ["CMD", "curl", "-sf", "http://localhost:8081/health"]
interval: 5s
timeout: 3s
retries: 10
start_period: 10s
user-service:
image: ${USER_SERVICE_IMAGE:-user-service:latest}
build:
context: ./services/user-service
ports:
- "8082:8082"
environment:
DATABASE_URL: "postgresql://users:userspass@postgres:5432/usersdb"
RABBITMQ_URL: "amqp://guest:guest@rabbitmq:5672/"
QUEUE_NAME: "user-events"
depends_on:
postgres:
condition: service_healthy
rabbitmq:
condition: service_healthy
healthcheck:
test: ["CMD", "curl", "-sf", "http://localhost:8082/health"]
interval: 5s
timeout: 3s
retries: 10
start_period: 10s
notification-worker:
image: ${NOTIFICATION_IMAGE:-notification-worker:latest}
build:
context: ./services/notification-worker
environment:
RABBITMQ_URL: "amqp://guest:guest@rabbitmq:5672/"
LISTEN_QUEUES: "order-events,user-events"
depends_on:
rabbitmq:
condition: service_healthy
    # Workers may not have HTTP health checks -- use a file-based heartbeat
    # instead: the worker entrypoint is expected to create /tmp/worker-healthy
    # once it has connected to its queues.
healthcheck:
test: ["CMD-SHELL", "test -f /tmp/worker-healthy || exit 1"]
interval: 10s
timeout: 3s
retries: 5
start_period: 15s
postgres:
image: postgres:16-alpine
environment:
POSTGRES_USER: admin
POSTGRES_PASSWORD: adminpass
volumes:
- ./init-databases.sql:/docker-entrypoint-initdb.d/01-init.sql
healthcheck:
test: ["CMD-SHELL", "pg_isready -U admin"]
interval: 5s
timeout: 3s
retries: 10
rabbitmq:
image: rabbitmq:3.13-management-alpine
ports:
- "5672:5672"
- "15672:15672"
environment:
RABBITMQ_DEFAULT_USER: guest
RABBITMQ_DEFAULT_PASS: guest
healthcheck:
test: ["CMD", "rabbitmq-diagnostics", "-q", "check_running"]
interval: 10s
timeout: 5s
retries: 10
      start_period: 20s
The database initialization script creates separate databases for each service (scripts in docker-entrypoint-initdb.d run only against an empty data directory, which is one more reason teardown removes volumes):
-- init-databases.sql
CREATE USER orders WITH PASSWORD 'orderspass';
CREATE DATABASE ordersdb OWNER orders;
CREATE USER users WITH PASSWORD 'userspass';
CREATE DATABASE usersdb OWNER users;
Health check script:
#!/bin/bash
# validate-microservices.sh
set -euo pipefail
COMPOSE_FILE="docker-compose-microservices.yml"
PROJECT_NAME="micro-validation-$$"
cleanup() {
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" down -v --remove-orphans 2>/dev/null || true
}
trap cleanup EXIT
echo "=== Starting microservices stack ==="
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" up -d --build --wait --wait-timeout 120
echo "=== Verifying infrastructure services ==="
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" exec -T postgres \
psql -U admin -c "SELECT datname FROM pg_database WHERE datname IN ('ordersdb', 'usersdb');"
# Verify RabbitMQ management API
curl -sf -u guest:guest http://localhost:15672/api/overview > /dev/null
echo "RabbitMQ: healthy"
echo "=== Verifying application services ==="
for svc in "api-gateway:8080" "order-service:8081" "user-service:8082"; do
NAME="${svc%%:*}"
PORT="${svc##*:}"
if curl -sf "http://localhost:${PORT}/health" > /dev/null; then
echo "${NAME}: healthy"
else
echo "ERROR: ${NAME} not responding"
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" logs "${NAME}"
exit 1
fi
done
echo "=== Testing inter-service communication ==="
# Create an order through the gateway, which should call order-service
RESPONSE=$(curl -sf -X POST http://localhost:8080/api/orders \
-H "Content-Type: application/json" \
-d '{"user_id": "test-user", "items": [{"sku": "TEST-001", "qty": 1}]}')
echo "Order creation response: ${RESPONSE}"
echo "=== Checking message queue ==="
# Verify that order-events queue has been created and has activity
QUEUE_INFO=$(curl -sf -u guest:guest http://localhost:15672/api/queues/%2F/order-events)
echo "Queue status: ${QUEUE_INFO}" | python3 -c "import sys,json; q=json.load(sys.stdin); print(f'Messages: {q.get(\"messages\",0)}, Consumers: {q.get(\"consumers\",0)}')" 2>/dev/null || echo "Queue info parsed"
echo "=== ALL MICROSERVICE CHECKS PASSED ==="What this validates: Service startup order, database per service isolation, message queue connectivity, inter-service HTTP communication through the gateway, and worker consumer connectivity.
What this misses: Service mesh behavior, circuit breakers under failure conditions, service discovery beyond Docker DNS, and behavior under partial failures (one service down while others continue).
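That last gap can at least be probed crudely: stop one service and confirm the rest keep answering. A sketch reusing the variables from the script above:
# Partial-failure probe (sketch): the gateway should degrade, not die
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" stop user-service
sleep 2
if curl -sf http://localhost:8080/health > /dev/null; then
  echo "api-gateway: still healthy with user-service down"
else
  echo "WARNING: api-gateway health fails when user-service is down"
fi
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" start user-service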
Stack 3: Full Observability (App + Prometheus + Grafana + Loki)
Validates that an application exposes metrics correctly, that Prometheus scrapes them, that Grafana can query Prometheus, and that logs flow through Loki. Use this when validating observability instrumentation changes.
# docker-compose-observability.yml
version: "3.8"
services:
app:
image: ${APP_IMAGE:-my-app:latest}
build:
context: .
ports:
- "8080:8080"
environment:
METRICS_ENABLED: "true"
LOG_FORMAT: "json"
logging:
driver: loki
options:
loki-url: "http://localhost:3100/loki/api/v1/push"
loki-batch-size: "100"
loki-retries: "3"
loki-timeout: "5s"
healthcheck:
test: ["CMD", "curl", "-sf", "http://localhost:8080/health"]
interval: 5s
timeout: 3s
retries: 10
start_period: 10s
prometheus:
image: prom/prometheus:v2.51.0
ports:
- "9090:9090"
volumes:
- ./prometheus.yml:/etc/prometheus/prometheus.yml
command:
- "--config.file=/etc/prometheus/prometheus.yml"
- "--storage.tsdb.retention.time=1h"
- "--web.enable-lifecycle"
healthcheck:
test: ["CMD", "wget", "-qO-", "http://localhost:9090/-/healthy"]
interval: 5s
timeout: 3s
retries: 10
grafana:
image: grafana/grafana:10.4.0
ports:
- "3000:3000"
environment:
GF_SECURITY_ADMIN_PASSWORD: admin
GF_AUTH_ANONYMOUS_ENABLED: "true"
GF_AUTH_ANONYMOUS_ORG_ROLE: Admin
volumes:
- ./grafana/provisioning:/etc/grafana/provisioning
depends_on:
prometheus:
condition: service_healthy
healthcheck:
test: ["CMD", "curl", "-sf", "http://localhost:3000/api/health"]
interval: 5s
timeout: 3s
retries: 10
loki:
image: grafana/loki:2.9.5
ports:
- "3100:3100"
command: -config.file=/etc/loki/local-config.yaml
healthcheck:
test: ["CMD", "wget", "-qO-", "http://localhost:3100/ready"]
interval: 10s
timeout: 5s
retries: 10
      start_period: 15s
Prometheus configuration:
# prometheus.yml
global:
scrape_interval: 5s
evaluation_interval: 5s
scrape_configs:
- job_name: "app"
static_configs:
- targets: ["app:8080"]
metrics_path: "/metrics"
    scrape_interval: 5s
Grafana datasource provisioning:
# grafana/provisioning/datasources/datasources.yml
apiVersion: 1
datasources:
- name: Prometheus
type: prometheus
access: proxy
url: http://prometheus:9090
isDefault: true
- name: Loki
type: loki
access: proxy
    url: http://loki:3100
Health check script:
#!/bin/bash
# validate-observability.sh
set -euo pipefail
COMPOSE_FILE="docker-compose-observability.yml"
PROJECT_NAME="obs-validation-$$"
cleanup() {
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" down -v --remove-orphans 2>/dev/null || true
}
trap cleanup EXIT
# Loki Docker logging driver must be installed on the host
docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions 2>/dev/null || true
echo "=== Starting observability stack ==="
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" up -d --build --wait --wait-timeout 120
echo "=== Generating traffic for metrics ==="
for i in $(seq 1 20); do
curl -sf http://localhost:8080/health > /dev/null 2>&1 || true
curl -sf http://localhost:8080/api/test > /dev/null 2>&1 || true
done
echo "Traffic generated. Waiting for scrape interval..."
sleep 10
echo "=== Checking Prometheus targets ==="
TARGETS=$(curl -sf http://localhost:9090/api/v1/targets)
UP_COUNT=$(echo "${TARGETS}" | python3 -c "import sys,json; t=json.load(sys.stdin); print(sum(1 for a in t['data']['activeTargets'] if a['health']=='up'))" 2>/dev/null || echo "0")
echo "Active healthy targets: ${UP_COUNT}"
if [ "${UP_COUNT}" -lt 1 ]; then
echo "ERROR: No healthy Prometheus targets"
echo "${TARGETS}" | python3 -m json.tool 2>/dev/null || echo "${TARGETS}"
exit 1
fi
echo "=== Querying application metrics ==="
# -G + --data-urlencode keeps the braces out of the URL (curl would otherwise try to glob them)
METRICS=$(curl -sfG http://localhost:9090/api/v1/query --data-urlencode "query=up{job='app'}")
echo "Metric 'up' for app: ${METRICS}"
# Check for application-specific metrics
APP_METRICS=$(curl -sf "http://localhost:9090/api/v1/label/__name__/values" | python3 -c "import sys,json; names=json.load(sys.stdin)['data']; [print(n) for n in names if not n.startswith('go_') and not n.startswith('process_') and not n.startswith('promhttp_')]" 2>/dev/null || true)
if [ -n "${APP_METRICS}" ]; then
echo "Application-specific metrics found:"
echo "${APP_METRICS}"
else
echo "WARNING: No application-specific metrics found (only Go runtime and process metrics)"
fi
echo "=== Checking Grafana datasources ==="
DATASOURCES=$(curl -sf http://localhost:3000/api/datasources)
echo "Configured datasources: ${DATASOURCES}"
echo "=== Checking Loki ==="
LOKI_READY=$(curl -sf http://localhost:3100/ready)
echo "Loki status: ${LOKI_READY}"
# Query Loki for logs from the app
LOKI_LOGS=$(curl -sfG "http://localhost:3100/loki/api/v1/query_range" \
--data-urlencode "query={compose_service=\"app\"}" \
--data-urlencode "start=$(date -u -v-5M +%s 2>/dev/null || date -u -d '5 minutes ago' +%s)000000000" \
--data-urlencode "end=$(date -u +%s)000000000" \
--data-urlencode "limit=5" 2>/dev/null || echo "{}")
echo "Loki log query result: received"
echo "=== ALL OBSERVABILITY CHECKS PASSED ==="
echo "Prometheus: scraping app metrics"
echo "Grafana: datasources configured (Prometheus + Loki)"
echo "Loki: receiving logs from app container"What this validates: Metrics endpoint exposure, Prometheus scrape configuration, Grafana datasource provisioning, Loki log ingestion, and the complete observability pipeline from application to dashboard.
What this misses: Alert routing (Alertmanager), dashboard correctness (just checks that datasources exist), long-term storage retention policies, and high-cardinality metric behavior.
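The datasource check can be upgraded from "exists" to "reachable" via Grafana's datasource health API. A sketch, assuming the anonymous-admin setup above and Grafana 9+, where GET /api/datasources/uid/:uid/health is available:
# Datasource health probe (sketch): the API equivalent of "Save & test"
DS_UID=$(curl -sf http://localhost:3000/api/datasources/name/Prometheus \
  | python3 -c "import sys,json; print(json.load(sys.stdin)['uid'])")
curl -sf "http://localhost:3000/api/datasources/uid/${DS_UID}/health" > /dev/null \
  && echo "Grafana can reach Prometheus"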
Stack 4: Database Migration Testing
Validates database migrations across multiple PostgreSQL versions. Runs the migration against each version to ensure compatibility. Critical for upgrades and for catching version-specific SQL syntax issues.
# docker-compose-migration-test.yml
version: "3.8"
services:
postgres14:
image: postgres:14-alpine
environment:
POSTGRES_USER: migtest
POSTGRES_PASSWORD: migtest
POSTGRES_DB: migtest
ports:
- "5414:5432"
healthcheck:
test: ["CMD-SHELL", "pg_isready -U migtest"]
interval: 5s
timeout: 3s
retries: 10
postgres15:
image: postgres:15-alpine
environment:
POSTGRES_USER: migtest
POSTGRES_PASSWORD: migtest
POSTGRES_DB: migtest
ports:
- "5415:5432"
healthcheck:
test: ["CMD-SHELL", "pg_isready -U migtest"]
interval: 5s
timeout: 3s
retries: 10
postgres16:
image: postgres:16-alpine
environment:
POSTGRES_USER: migtest
POSTGRES_PASSWORD: migtest
POSTGRES_DB: migtest
ports:
- "5416:5432"
healthcheck:
test: ["CMD-SHELL", "pg_isready -U migtest"]
interval: 5s
timeout: 3s
retries: 10
postgres17:
image: postgres:17-alpine
environment:
POSTGRES_USER: migtest
POSTGRES_PASSWORD: migtest
POSTGRES_DB: migtest
ports:
- "5417:5432"
healthcheck:
test: ["CMD-SHELL", "pg_isready -U migtest"]
interval: 5s
timeout: 3s
      retries: 10
Migration validation script (it runs psql from the host, so the PostgreSQL client tools must be installed there):
#!/bin/bash
# validate-migrations.sh
# Usage: ./validate-migrations.sh <migrations-dir>
set -euo pipefail
MIGRATIONS_DIR="${1:?Usage: $0 <migrations-dir>}"
COMPOSE_FILE="docker-compose-migration-test.yml"
PROJECT_NAME="mig-validation-$$"
DB_USER="migtest"
DB_PASS="migtest"
DB_NAME="migtest"
VERSIONS=("14:5414" "15:5415" "16:5416" "17:5417")
FAILED=()
cleanup() {
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" down -v --remove-orphans 2>/dev/null || true
}
trap cleanup EXIT
echo "=== Starting all PostgreSQL versions ==="
docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" up -d --wait --wait-timeout 60
echo "=== Running migrations against each version ==="
for version_port in "${VERSIONS[@]}"; do
VERSION="${version_port%%:*}"
PORT="${version_port##*:}"
echo ""
echo "--- PostgreSQL ${VERSION} (port ${PORT}) ---"
  # Apply migration files in order (glob expansion is already sorted,
  # and avoids the word-splitting pitfalls of parsing ls output)
  for migration in "${MIGRATIONS_DIR}"/*.sql; do
    [ -e "${migration}" ] || continue
BASENAME=$(basename "${migration}")
echo " Applying: ${BASENAME}"
if PGPASSWORD="${DB_PASS}" psql -h localhost -p "${PORT}" -U "${DB_USER}" -d "${DB_NAME}" \
-f "${migration}" -v ON_ERROR_STOP=1 2>&1; then
echo " OK: ${BASENAME}"
else
echo " FAILED: ${BASENAME} on PostgreSQL ${VERSION}"
FAILED+=("pg${VERSION}:${BASENAME}")
fi
done
# Verify the final schema
echo " Verifying schema..."
TABLES=$(PGPASSWORD="${DB_PASS}" psql -h localhost -p "${PORT}" -U "${DB_USER}" -d "${DB_NAME}" \
-t -c "SELECT tablename FROM pg_tables WHERE schemaname = 'public' ORDER BY tablename;")
echo " Tables: $(echo ${TABLES} | tr '\n' ', ')"
done
echo ""
echo "=== MIGRATION RESULTS ==="
if [ ${#FAILED[@]} -eq 0 ]; then
echo "All migrations passed on all PostgreSQL versions."
else
echo "FAILURES:"
for f in "${FAILED[@]}"; do
echo " - ${f}"
done
exit 1
fi
What this validates: Migration SQL compatibility across PostgreSQL major versions, schema creation order, constraint compatibility, and function/trigger syntax differences between versions.
What this misses: Performance characteristics of migrations on large datasets, locking behavior under concurrent load, and logical replication compatibility.
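Locking behavior can be partially surfaced without a load harness: run the migration with a server-side lock_timeout so any statement that needs a long-held lock fails loudly instead of waiting. A sketch using libpq's PGOPTIONS variable (only meaningful if something else is touching the tables, even a looping SELECT in a second session):
# Lock-wait probe (sketch): fail any statement that waits more than 2s for a lock
PGOPTIONS="-c lock_timeout=2000" PGPASSWORD="${DB_PASS}" \
  psql -h localhost -p "${PORT}" -U "${DB_USER}" -d "${DB_NAME}" \
  -f "${migration}" -v ON_ERROR_STOP=1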
The Agent Workflow
When an agent uses these stacks, it should follow this sequence:
1. Select the template. Based on what is being validated, choose the closest stack. If validating a web app change, use Stack 1. If validating a migration, use Stack 4.
2. Customize the template. Replace image names, environment variables, port mappings, and volume mounts to match the actual services. Do not modify the health check patterns unless the service has a different health endpoint.
3. Bring it up. Run docker compose up -d --build --wait. The --wait flag blocks until all health checks pass or the timeout expires.
4. Run validation checks. Execute the health check script or run custom verification commands. Capture all output.
5. Capture results. Before tearing down, save logs if any checks failed: docker compose logs > validation-results.log.
6. Tear down. Always tear down, even on success. Use docker compose down -v --remove-orphans. The -v flag removes volumes so the next run starts clean. The --remove-orphans flag catches containers from modified compose files.
7. Report. State what was validated, what passed, and what the stack cannot test. Reference the fidelity limitations documented with each stack above.
The trap-based cleanup in every script ensures teardown happens even when validation fails. This is not optional. Orphaned containers consume resources and can cause port conflicts on subsequent runs.
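A minimal harness combining steps 5 and 6 (a sketch; the compose file and validation script names are placeholders):
#!/bin/bash
# run-validation.sh (sketch) -- capture logs on failure, always tear down
set -euo pipefail
COMPOSE_FILE="docker-compose-web-stack.yml"   # placeholder
PROJECT_NAME="validation-$$"
STATUS=1
cleanup() {
  if [ "${STATUS}" -ne 0 ]; then
    docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" logs > validation-results.log 2>&1 || true
    echo "Validation failed; logs saved to validation-results.log"
  fi
  docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" down -v --remove-orphans 2>/dev/null || true
}
trap cleanup EXIT
if docker compose -f "${COMPOSE_FILE}" -p "${PROJECT_NAME}" up -d --build --wait \
   && ./validate-web-stack.sh; then
  STATUS=0
fi
exit "${STATUS}"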