Container Build Optimization#

A container build that takes eight minutes in CI is not just slow – it compounds across every push, every developer, every day. The difference between a naive Dockerfile and an optimized one is often the difference between a two-minute build and a twelve-minute build. The techniques here are not theoretical. They are the specific changes that eliminate wasted time.

BuildKit Over Legacy Builder#

BuildKit is the modern Docker build engine and the default since Docker 23.0. If you are running an older version, enable it explicitly with DOCKER_BUILDKIT=1. BuildKit provides several capabilities the legacy builder lacks.
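On older versions, either route works; docker buildx build always goes through BuildKit (the image name here is illustrative):

# Opt in explicitly on Docker versions before 23.0
DOCKER_BUILDKIT=1 docker build -t myapp .

# buildx always uses BuildKit
docker buildx build -t myapp .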

Parallel stage execution means independent stages in a multi-stage build run concurrently. If your Dockerfile has a builder stage and a test-deps stage that do not depend on each other, BuildKit runs them simultaneously.
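As a sketch, the two stages below share no COPY --from dependency on each other, so BuildKit schedules them in parallel (stage names and packages are illustrative):

FROM golang:1.23-alpine AS builder
WORKDIR /src
COPY . .
RUN go build -o /app ./cmd/server

FROM alpine AS test-deps
RUN apk add --no-cache curl jq

FROM alpine
COPY --from=builder /app /app
COPY --from=test-deps /usr/bin/jq /usr/bin/jq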

Better cache management means BuildKit tracks dependencies between layers more precisely, leading to fewer unnecessary cache invalidations. It also supports external cache sources and sinks, which is critical for CI where there is no local cache between runs.

Build secrets and SSH forwarding let you pass credentials to the build process without baking them into image layers. The legacy builder had no safe way to do this.

Multi-Stage Builds#

The most impactful optimization for image size and security is separating the build environment from the runtime environment:

# Stage 1: build
FROM golang:1.23-alpine AS builder
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /app ./cmd/server

# Stage 2: runtime
FROM gcr.io/distroless/static-debian12
COPY --from=builder /app /app
ENTRYPOINT ["/app"]

The builder stage has Go, git, a C compiler, and hundreds of megabytes of toolchain. The runtime stage has only the compiled binary. The final image is typically 10-20 MB instead of 800+ MB. This also reduces the attack surface – no shell, no package manager, no unnecessary utilities.

For Node.js applications, the pattern is similar:

FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --include=dev
COPY . .
RUN npm run build

FROM node:20-alpine AS runtime
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev && npm cache clean --force
COPY --from=builder /app/dist ./dist
CMD ["node", "dist/server.js"]

Layer Caching Strategy#

Docker caches each layer and reuses it when the inputs have not changed. The moment a layer’s inputs change, that layer and every layer after it must be rebuilt. This means layer order determines cache effectiveness.

The correct pattern is: install OS packages first, then copy dependency manifests, then install dependencies, then copy source code. Source code changes on every commit, so it should be the last thing copied:

FROM golang:1.23-alpine
WORKDIR /src

# Rarely changes - cached almost always
RUN apk add --no-cache git ca-certificates

# Changes when dependencies change - cached most of the time
COPY go.mod go.sum ./
RUN go mod download

# Changes on every commit - never cached
COPY . .
RUN go build -o /app ./cmd/server

If you copy all source code before running go mod download, the dependency download re-runs on every commit even when dependencies have not changed. This single reordering often saves minutes per build.

Cache Mounts#

Cache mounts persist package manager caches between builds. Unlike layer caching, which only helps when the layer inputs are identical, cache mounts benefit rebuilds even when dependencies change because the package manager can reuse previously downloaded packages:

RUN --mount=type=cache,target=/go/pkg/mod \
    --mount=type=cache,target=/root/.cache/go-build \
    go build -o /app ./cmd/server

For other ecosystems:

# Python pip
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

# Node.js npm
RUN --mount=type=cache,target=/root/.npm \
    npm ci

# APT packages (Debian-based images delete the cache after each install
# via /etc/apt/apt.conf.d/docker-clean, so disable that first)
RUN rm -f /etc/apt/apt.conf.d/docker-clean
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt/lists,sharing=locked \
    apt-get update && apt-get install -y build-essential

Cache mounts are especially effective in local development where you are rebuilding frequently. In CI, they require external cache storage to persist between runs.

Build Secrets#

Never put credentials in Dockerfile ARG or ENV instructions. They are visible in the image history. Use BuildKit secrets instead:

RUN --mount=type=secret,id=npmrc,target=/root/.npmrc \
    npm install --registry=https://npm.pkg.github.com

Build with:

docker build --secret id=npmrc,src=$HOME/.npmrc .

The secret is available during the build step but is never persisted in any layer.

.dockerignore#

A large build context slows every build because Docker must send the entire context to the build daemon before anything starts. Create a .dockerignore file:

.git
node_modules
*.md
docs/
test/
.env
.env.*
*.test.go
*.test.js
coverage/
.github/

Excluding .git alone can save hundreds of megabytes of context transfer. Excluding node_modules prevents sending local dependencies that will be installed fresh in the container.

CI-Specific Cache Configuration#

In CI environments, there is no local Docker layer cache between runs. You must export and import caches explicitly.

For GitHub Actions with the GitHub Actions cache backend:

- uses: docker/build-push-action@v5
  with:
    context: .
    push: true
    tags: myapp:latest
    cache-from: type=gha
    cache-to: type=gha,mode=max

For registry-based caching (works with any CI):

docker buildx build \
  --cache-from type=registry,ref=registry.example.com/myapp:cache \
  --cache-to type=registry,ref=registry.example.com/myapp:cache,mode=max \
  --push -t registry.example.com/myapp:latest .

The mode=max setting caches all layers from all stages, not just the final image layers. This is important for multi-stage builds where intermediate layers (dependency installation, compilation) are the most expensive to recreate.

Image Size Reduction#

Choose your base image deliberately:

  • scratch: empty image. Only works for statically compiled binaries (Go with CGO_ENABLED=0, Rust). Smallest possible size. No shell, no debugging tools.
  • distroless: Google-maintained images with only the runtime essentials (libc, ca-certificates, tzdata). No shell, no package manager. The static variant is around 2 MB; variants that bundle a language runtime (Java, Python) are correspondingly larger.
  • alpine: musl libc, busybox, apk package manager. 5 MB base. Good when you need a shell for debugging or additional packages. Watch for musl compatibility issues with some C libraries.
  • debian-slim: glibc, minimal packages. 80 MB base. Use when you need glibc compatibility and are willing to accept the size.

Always clean up package manager caches in the same RUN instruction as the install:

RUN apt-get update && \
    apt-get install -y --no-install-recommends ca-certificates && \
    rm -rf /var/lib/apt/lists/*

If the rm is in a separate RUN instruction, the cached files still exist in the previous layer and the image is no smaller.

Build Reproducibility#

Pin base images by digest, not just by tag:

FROM node:20-alpine@sha256:abcdef1234567890...

Tags are mutable. Someone can push a new image to node:20-alpine at any time. A digest is immutable – it always refers to the exact same image. This prevents builds from breaking due to upstream base image changes and ensures everyone building from the same Dockerfile gets the same result.
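To look up a tag’s current digest without pulling the image, docker buildx imagetools inspect prints it alongside the manifest details:

docker buildx imagetools inspect node:20-alpine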

Always use lockfiles (package-lock.json, go.sum, Pipfile.lock) and install from them deterministically (npm ci instead of npm install).

Multi-Architecture Builds#

Build for both Intel and ARM in a single command:

docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --push -t registry.example.com/myapp:latest .

This creates a manifest list so that clients on either architecture pull the correct image automatically. For Go, the cross-compilation happens natively. For C/C++ dependencies, BuildKit uses QEMU emulation, which can be slow. In that case, consider building natively on architecture-specific runners and creating the manifest list separately.
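If you do build each architecture on its own native runner, the per-architecture images can be combined into a manifest list afterward (the per-arch tags here are illustrative):

docker buildx imagetools create \
  -t registry.example.com/myapp:latest \
  registry.example.com/myapp:amd64 \
  registry.example.com/myapp:arm64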

COPY --link#

BuildKit’s COPY --link flag creates independent layers that do not depend on the previous layer’s filesystem state:

FROM alpine
COPY --link myapp /usr/local/bin/myapp

Without --link, changing the base image invalidates the COPY layer even if the file being copied is identical. With --link, the COPY layer is reused regardless of base image changes. This is particularly valuable when you update base images for security patches but your application binary has not changed.

Alternative Build Tools#

Kaniko builds container images inside a Kubernetes cluster without requiring a Docker daemon. It runs as a regular container, making it suitable for environments where you cannot run Docker-in-Docker (Kubernetes pods, rootless CI).
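A minimal sketch runs the executor image against a local context; the flags are Kaniko’s standard ones, the registry path is illustrative, and pushing additionally requires registry credentials mounted into the container:

docker run --rm \
  -v "$PWD":/workspace \
  gcr.io/kaniko-project/executor:latest \
  --context=dir:///workspace \
  --dockerfile=/workspace/Dockerfile \
  --destination=registry.example.com/myapp:latest \
  --cache=true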

ko is a Go-specific tool that builds Go container images without a Dockerfile. It compiles your Go binary and packages it into a distroless base image. Extremely fast because it skips the entire Dockerfile layer system.
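A sketch, assuming a standard Go module layout with the main package at ./cmd/server (the registry value is illustrative):

export KO_DOCKER_REPO=registry.example.com/myapp
ko build ./cmd/server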

Jib does the same for Java. It builds optimized Docker and OCI images without a Dockerfile and without requiring Docker installed locally. It separates dependencies, resources, and classes into different layers for optimal caching.

Buildpacks (Cloud Native Buildpacks) detect your application type and produce an OCI image without a Dockerfile. Heroku and Google Cloud Run use buildpacks. They trade control for convenience – you do not write a Dockerfile, but you also cannot customize every layer.
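A sketch with the pack CLI; the Paketo builder named here is one common public builder, not the only choice:

pack build registry.example.com/myapp \
  --builder paketobuildpacks/builder-jammy-base \
  --publish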

Common Gotchas#

A large build context is the most frequent cause of unexpectedly slow builds. If your build starts with “Sending build context to Docker daemon 2.1GB”, you need a .dockerignore file immediately.

Docker keys the COPY and ADD cache on file contents and permissions, not timestamps, so a fresh git clone or checkout does not by itself invalidate those layers. Timestamps bite elsewhere: build tools that compare mtimes (make, for example) may rebuild unnecessarily inside the container, and artifacts that embed build timestamps undermine reproducibility.

Using :latest tags for base images means your build is not reproducible. Pin to specific versions or digests. When you must use apt-get update, pair it with pinned package versions to prevent different builds from installing different package versions.