# Database Testing Strategies
Database tests are the tests most teams get wrong. Teams either skip them entirely (testing with mocks, then discovering schema mismatches in production) or build a fragile suite around a single shared database where tests interfere with each other. The right approach depends on what you are testing and which tradeoffs you can accept.
## Fixtures vs Factories
### Fixtures
Fixtures are static SQL files loaded before a test suite runs:
```sql
-- fixtures/users.sql
INSERT INTO users (id, email, role) VALUES
    (1, 'admin@test.com', 'admin'),
    (2, 'user@test.com', 'member');

-- fixtures/orders.sql
INSERT INTO orders (id, user_id, total, status) VALUES
    (100, 2, 49.99, 'completed'),
    (101, 2, 129.00, 'pending');
```

Use fixtures when: the domain model is stable, tests need complex interconnected data, and you want fast loading. Fixtures break when: schema changes require updating every fixture file. A new required column means editing every INSERT across all files.
### Factories
Factories generate test data on demand with sensible defaults:
```python
import factory

from myapp.models import User, Order


class UserFactory(factory.Factory):
    class Meta:
        model = User

    email = factory.Sequence(lambda n: f"user{n}@test.com")
    role = "member"


class OrderFactory(factory.Factory):
    class Meta:
        model = Order

    user = factory.SubFactory(UserFactory)
    total = factory.Faker("pydecimal", left_digits=3, right_digits=2, positive=True)
    status = "pending"


# Each test builds exactly what it needs
def test_order_completion():
    order = OrderFactory(status="pending", total=49.99)
    complete_order(order.id)  # application code under test
    assert Order.get(order.id).status == "completed"
```

Use factories when: the schema changes frequently and each test should be self-documenting. Factories break when: you need complex graphs of related data. Building an order that requires a user, product, inventory, warehouse, and shipping zone becomes a chain of factory calls.
The pragmatic approach: use both. Factories for test-specific data, fixtures for reference data that rarely changes (countries, currencies, permission definitions).
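A sketch of how the two coexist in pytest, assuming SQLAlchemy, a project-provided `db_engine` fixture, and hypothetical fixture file names:

```python
import pathlib

import pytest
from sqlalchemy import text

# Rarely-changing reference data, loaded once per test session
REFERENCE_FIXTURES = ["countries.sql", "currencies.sql", "permissions.sql"]


@pytest.fixture(scope="session", autouse=True)
def reference_data(db_engine):
    with db_engine.begin() as conn:
        for name in REFERENCE_FIXTURES:
            conn.execute(text(pathlib.Path("fixtures", name).read_text()))
```

Factories then build the test-specific rows for each test on top of that shared baseline.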
## Database Containers with Testcontainers
Testcontainers spins up real database instances in Docker for each test run. No shared test databases, no version mismatches between local and CI.
```java
@Testcontainers
class OrderRepositoryTest {

    @Container
    static PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>("postgres:16")
            .withDatabaseName("testdb")
            .withInitScript("schema.sql");

    @Test
    void shouldCreateOrder() {
        // dataSource is built from postgres.getJdbcUrl() in test setup
        OrderRepository repo = new OrderRepository(dataSource);
        Order order = repo.create(new Order(userId, 49.99));

        assertThat(order.getId()).isNotNull();
    }
}
```

The same pattern in Go with the testcontainers-go postgres module:

```go
func TestOrderRepository(t *testing.T) {
    ctx := context.Background()

    container, err := postgres.Run(ctx, "postgres:16",
        postgres.WithDatabase("testdb"),
    )
    require.NoError(t, err)
    defer container.Terminate(ctx)

    connStr, _ := container.ConnectionString(ctx, "sslmode=disable")
    db, _ := sql.Open("pgx", connStr)
    runMigrations(db)

    // test against real PostgreSQL
}
```

Use Testcontainers for: integration tests against real database engines, testing database-specific features (jsonb queries, full-text search), and CI pipelines that need reproducible state. Skip it for: unit tests where in-memory databases or mocks suffice and container startup time (2-5 seconds) hurts fast feedback.
### Container Lifecycle
One container per suite with transaction rollback between tests balances isolation and speed.
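For the per-suite container, a session-scoped fixture works well. A sketch assuming the testcontainers-python package and SQLAlchemy:

```python
import pytest
import sqlalchemy
from testcontainers.postgres import PostgresContainer


@pytest.fixture(scope="session")
def db_connection():
    # One PostgreSQL container for the entire test session
    with PostgresContainer("postgres:16") as pg:
        engine = sqlalchemy.create_engine(pg.get_connection_url())
        with engine.connect() as conn:
            yield conn
```

Each test then runs inside a transaction that is rolled back on exit: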
```python
@pytest.fixture(autouse=True)
def db_transaction(db_connection):
    transaction = db_connection.begin()
    yield db_connection
    transaction.rollback()  # every test starts clean
```
## Migration Testing

Schema migrations are deployments that run DDL against production data. Test them like deployments.
### Forward and Rollback Testing
```bash
#!/bin/bash
# Run in CI on every PR with migrations
docker run -d --name migration-test -p 5432:5432 \
  -e POSTGRES_DB=testdb -e POSTGRES_HOST_AUTH_METHOD=trust postgres:16
sleep 3  # crude wait; pg_isready is more robust

# Apply all migrations up to the previous release
flyway -url=jdbc:postgresql://localhost:5432/testdb -user=postgres \
  -target=previous_release migrate

# Load representative test data
psql -h localhost -U postgres -d testdb -f test-data/representative-dataset.sql

# Apply new migrations -- this is what we are testing
flyway -url=jdbc:postgresql://localhost:5432/testdb -user=postgres migrate

# Validate schema matches expected state
pg_dump -h localhost -U postgres -d testdb --schema-only > actual_schema.sql
diff expected_schema.sql actual_schema.sql
```

Every migration should have a rollback script. Test both directions:
```sql
-- V5__add_order_status.sql (forward)
ALTER TABLE orders ADD COLUMN status VARCHAR(20) DEFAULT 'pending';

-- V5__add_order_status_rollback.sql (reverse)
ALTER TABLE orders DROP COLUMN status;
```
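An up-down-up run catches rollbacks that leave state behind: apply the forward migration, apply the rollback, then apply the forward migration again. A minimal sketch, assuming the scripts above live in `migrations/` and a throwaway CI database (paths and connection details are hypothetical):

```python
import subprocess

# Hypothetical CI connection settings
PSQL_ARGS = ["-h", "localhost", "-U", "postgres", "-d", "testdb"]


def apply_sql(path):
    # ON_ERROR_STOP makes psql exit non-zero on the first failed statement
    subprocess.run(["psql", *PSQL_ARGS, "-v", "ON_ERROR_STOP=1", "-f", path], check=True)


def test_migration_up_down_up():
    apply_sql("migrations/V5__add_order_status.sql")           # forward
    apply_sql("migrations/V5__add_order_status_rollback.sql")  # rollback
    apply_sql("migrations/V5__add_order_status.sql")           # forward again must succeed
```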
### Data Migration Testing

When migrations transform data, test with production-scale volumes. A migration taking 200ms on 1000 rows might take 45 minutes on 10 million rows, locking the table the entire time. Run data migrations against a masked copy of production data to discover this before the maintenance window.
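A cheap guard is to time the migration against that masked copy in CI and fail if it would not fit the maintenance window. A sketch, assuming a Flyway setup and a hypothetical `masked_prod_copy` database:

```python
import subprocess
import time

TIME_BUDGET_SECONDS = 15 * 60  # hypothetical maintenance window


def test_data_migration_fits_window():
    start = time.monotonic()
    subprocess.run(
        ["flyway", "-url=jdbc:postgresql://localhost:5432/masked_prod_copy",
         "-user=postgres", "migrate"],
        check=True,
    )
    elapsed = time.monotonic() - start
    assert elapsed < TIME_BUDGET_SECONDS, f"migration took {elapsed:.0f}s"
```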
## Performance Regression Testing
Database performance degrades gradually. A query that took 5ms last month takes 50ms now because the table grew and nobody noticed the missing index.
### Baseline and Regression Detection
```bash
# Capture EXPLAIN ANALYZE for critical queries in CI
# (-At strips psql headers and alignment so the output is clean JSON)
psql -h localhost -d testdb -At <<'EOF' > baseline/order_lookup.json
EXPLAIN (ANALYZE, FORMAT JSON)
SELECT o.*, u.email FROM orders o
JOIN users u ON u.id = o.user_id
WHERE o.status = 'pending'
  AND o.created_at > NOW() - INTERVAL '7 days';
EOF
```

A CI check then compares the current plan against the stored baseline:

```python
import json


def check_regression(baseline_file, current_file, threshold=2.0):
    with open(baseline_file) as f:
        baseline = json.load(f)
    with open(current_file) as f:
        current = json.load(f)

    baseline_time = baseline[0]["Execution Time"]
    current_time = current[0]["Execution Time"]

    if current_time > baseline_time * threshold:
        print(f"REGRESSION: {baseline_time:.1f}ms -> {current_time:.1f}ms")
        return False
    return True
```

Run with a dataset large enough to reveal performance differences. A 1000-row test database will not expose a missing index on a low-cardinality column.
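Seeding a dataset that size with `generate_series` is far faster than factories. A sketch assuming SQLAlchemy and that users with ids 1-1000 already exist:

```python
import sqlalchemy

SEED_ORDERS = """
    INSERT INTO orders (user_id, total, status, created_at)
    SELECT (random() * 999)::int + 1,
           round((random() * 500)::numeric, 2),
           CASE WHEN random() < 0.5 THEN 'pending' ELSE 'completed' END,
           NOW() - random() * INTERVAL '90 days'
    FROM generate_series(1, 1000000)
"""


def seed_orders(url):
    engine = sqlalchemy.create_engine(url)
    with engine.begin() as conn:
        conn.execute(sqlalchemy.text(SEED_ORDERS))
        conn.execute(sqlalchemy.text("ANALYZE orders"))  # refresh planner statistics
```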
## Data Masking for Test Environments
Test environments need realistic data. Production data contains PII. Data masking bridges the gap.
```sql
-- Mask emails (preserve domain distribution)
UPDATE users SET email = 'user_' || id || '@' ||
    CASE (id % 3)
        WHEN 0 THEN 'gmail.com'
        WHEN 1 THEN 'company.com'
        WHEN 2 THEN 'outlook.com'
    END;

-- Mask names
UPDATE users SET first_name = 'First' || id, last_name = 'Last' || id;

-- Mask phone numbers (preserve format)
UPDATE users SET phone = '+1555' || LPAD((id % 10000000)::TEXT, 7, '0');
```

Mask in dependency order (parent tables first) and verify referential integrity after masking. Automate the pipeline: nightly dump of production, mask in an isolated database, validate, restore to staging.
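The validation step can be a handful of assertion queries run before the masked dump is promoted. A sketch assuming SQLAlchemy and the schema above:

```python
import sqlalchemy

# Each check must return zero rows for the masked copy to be promoted
CHECKS = {
    "unmasked emails": "SELECT count(*) FROM users WHERE email NOT LIKE 'user_%@%'",
    "orphaned orders": (
        "SELECT count(*) FROM orders o "
        "LEFT JOIN users u ON u.id = o.user_id WHERE u.id IS NULL"
    ),
}


def validate_masked_db(url):
    engine = sqlalchemy.create_engine(url)
    with engine.connect() as conn:
        for name, query in CHECKS.items():
            count = conn.execute(sqlalchemy.text(query)).scalar()
            assert count == 0, f"{name}: {count} offending rows"
```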
## Decision Matrix
| Scenario | Approach | Why |
|---|---|---|
| Unit testing repository logic | Factories + Testcontainers | Isolated, each test defines its own state |
| Testing DB-specific features | Testcontainers | Real engine, reproducible |
| Testing migrations | Forward + rollback against representative data | Catches data bugs before production |
| Performance regression detection | Baseline capture in CI with production-scale data | Prevents gradual degradation |
| Staging environment data | Masked production dump | Realistic without PII exposure |
| Reference/seed data | Fixtures | Stable data loaded once |
The common thread: test against real databases, with realistic data, in isolated environments. Mocks catch logic errors but miss the schema mismatches, query performance issues, and migration failures that cause production incidents.