Table of contents
- The Problem with Single-Stage
- The Basic Multi-Stage Pattern
- What Stages Do During a Build
- Language-Specific Patterns
- Caching Layers — Order Equals Speed
- BuildKit Cache Mounts
- distroless and scratch — Minimal Bases
- Build Stage Targeting
- Practical Checklist
- Where We Stand
The Problem with Single-Stage
Let’s look at a Go example. A simple single-stage Dockerfile looks like this:
FROM golang:1.22-alpine
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN go build -o /app/bin/server ./cmd/server
CMD ["/app/bin/server"]
Building this means the final image includes all of golang:1.22-alpine — the Go compiler, standard library source, all dependency source code, and the go command. At runtime, only the compiled binary needs to run.
docker build -t app:single .
docker images app:single
# REPOSITORY TAG SIZE
# app single ~350MB
Roughly 350MB, even though the compiled binary itself is only 10-20MB. Everything else is build-time baggage.
The Basic Multi-Stage Pattern
Use FROM multiple times; each FROM starts a new stage. Only the final stage becomes the resulting image. Earlier stages exist only as sources to copy from and are discarded at the end of the build.
# Stage 1 — Build
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /out/server ./cmd/server
# Stage 2 — Runtime
FROM alpine:3.20
RUN adduser -D -u 10001 app
COPY --from=builder /out/server /usr/local/bin/server
USER app
ENTRYPOINT ["/usr/local/bin/server"]
The first stage is named AS builder, and the second stage uses COPY --from=builder to grab only the binary. The runtime image has no Go compiler at all.
docker build -t app:multi .
docker images app:multi
# REPOSITORY TAG SIZE
# app multi ~15MB
350MB to 15MB. Splitting the stages alone reduced the size by over 95%.
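As a side note, COPY --from is not limited to named stages: it also accepts any image reference, which is handy for grabbing a single prebuilt tool without defining a full stage. A minimal sketch (the busybox image and tag here are only illustrative):

```dockerfile
FROM alpine:3.20
# --from can name an external image, not only an earlier stage;
# this copies one static binary out of busybox:1.36.
COPY --from=busybox:1.36 /bin/busybox /usr/local/bin/busybox
```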
What Stages Do During a Build
Visualized as a diagram, the flow is straightforward:
flowchart LR
SRC["Source code"] --> B1["Stage: builder<br/>(golang:1.22)"]
B1 --> ART["/out/server<br/>(compiled binary)"]
ART -->|COPY --from=builder| B2["Stage: runtime<br/>(alpine:3.20)"]
B2 --> IMG["Final image"]
B1 -.->|Discarded| X["Builder layers"]
The builder stage is full of compilers and build tools, but they do not remain in the final image. Only the copied artifacts and the runtime base remain.
Language-Specific Patterns
The core of multi-stage builds is separating build tools from runtime, but each language has its own nuances.
Java (Gradle)
FROM gradle:8.7-jdk21 AS builder
WORKDIR /src
COPY build.gradle.kts settings.gradle.kts ./
COPY src ./src
RUN gradle clean bootJar --no-daemon
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
COPY --from=builder /src/build/libs/*.jar app.jar
ENTRYPOINT ["java", "-jar", "/app/app.jar"]
gradle:8.7-jdk21 includes JDK + Gradle + dependency cache, weighing several hundred MB. The runtime only needs jre-alpine, so the image is much lighter. With Spring Boot, you could further split layers with layertools, but multi-stage alone already cuts things down by more than half.
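For reference, the layertools split mentioned above looks roughly like this. Treat it as a sketch rather than a drop-in file: the layer names follow Spring Boot's default layered-jar layout, and the launcher class path depends on the Boot version (shown here for Boot 3.2+).

```dockerfile
FROM gradle:8.7-jdk21 AS builder
WORKDIR /src
COPY . .
RUN gradle bootJar --no-daemon

# Unpack the fat jar into Boot's predefined layers
FROM eclipse-temurin:21-jre-alpine AS extractor
WORKDIR /app
COPY --from=builder /src/build/libs/*.jar app.jar
RUN java -Djarmode=layertools -jar app.jar extract

FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
# Least-frequently-changing layers first, so dependency layers stay cached
COPY --from=extractor /app/dependencies/ ./
COPY --from=extractor /app/spring-boot-loader/ ./
COPY --from=extractor /app/snapshot-dependencies/ ./
COPY --from=extractor /app/application/ ./
ENTRYPOINT ["java", "org.springframework.boot.loader.launch.JarLauncher"]
```

The payoff is that changing application code only invalidates the final, small application layer instead of re-pushing the full dependency set.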
Node.js
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
FROM node:20-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build && npm prune --production
FROM node:20-alpine
WORKDIR /app
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/package.json ./
USER node
CMD ["node", "dist/server.js"]
Node’s node_modules is particularly bloated, so separating the install into a deps stage lets the layer cache hit whenever package.json has not changed. Running npm prune --production (on npm 9+, the equivalent is npm prune --omit=dev) after the build strips devDependencies, further reducing the final size.
Python
FROM python:3.12-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt
FROM python:3.12-slim
WORKDIR /app
COPY --from=builder /root/.local /root/.local
COPY . .
ENV PATH=/root/.local/bin:$PATH
CMD ["python", "-m", "myapp"]
Python ships no compiled binary, so the gains are smaller than in compiled languages. Still, keeping build toolchains and dev headers (gcc, libpq-dev, etc.) out of the runtime image is a meaningful win.
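A common alternative to pip install --user is to build into a virtualenv and copy the whole environment across. A sketch under the same requirements.txt layout (the /opt/venv path is conventional, not required):

```dockerfile
FROM python:3.12-slim AS builder
WORKDIR /app
# Build all dependencies into an isolated virtualenv
RUN python -m venv /opt/venv
ENV PATH=/opt/venv/bin:$PATH
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

FROM python:3.12-slim
WORKDIR /app
# The venv is self-contained, so copying the directory is enough
COPY --from=builder /opt/venv /opt/venv
COPY . .
ENV PATH=/opt/venv/bin:$PATH
CMD ["python", "-m", "myapp"]
```

Unlike the --user approach, this works unchanged if you later add a non-root USER, since nothing lives under /root.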
Caching Layers — Order Equals Speed
Each Dockerfile instruction creates a layer. If the input to that layer has not changed, the cache is hit and it is reused without rebuilding. Understanding this caching rule alone can cut CI time in half.
flowchart TB
L1["COPY go.mod go.sum"] --> L2["RUN go mod download"]
L2 --> L3["COPY . ."]
L3 --> L4["RUN go build"]
subgraph KEY["Cache hit criteria"]
L1
L2
L3
L4
end
NOTE1["When go.mod / go.sum change\n→ L1–L4 all re-execute"]
NOTE2["When a single source line changes\n→ L1, L2 stay cached; only L3, L4 re-execute"]
The key order is: copy dependency manifests first, install them, then copy the source. Since editing a line of source is by far the most frequent change, the source copy should sit as late in the Dockerfile as possible.
Here is what happens with the reverse order:
# Bad order — even a single source line change reinstalls dependencies
COPY . .
RUN go mod download
RUN go build -o /out/server ./cmd/server
With this order, go mod download re-runs every time source changes. This pattern is the usual culprit when team-wide CI build times are inflated by several minutes.
BuildKit Cache Mounts
With BuildKit now the default build engine, a more powerful caching mechanism is available: cache directories for build tools can be mounted into RUN instructions and persist across builds.
# syntax=docker/dockerfile:1.7
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN --mount=type=cache,target=/go/pkg/mod \
go mod download
COPY . .
RUN --mount=type=cache,target=/go/pkg/mod \
--mount=type=cache,target=/root/.cache/go-build \
CGO_ENABLED=0 go build -o /out/server ./cmd/server
--mount=type=cache persists build tool cache directories across builds, independent of layer cache hits. Even on a clean build, Go module caches can be reused. This feature is covered in greater depth in Part 11.
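The same pattern applies to other toolchains. A sketch for the earlier Node example, assuming npm's default cache location of /root/.npm:

```dockerfile
# syntax=docker/dockerfile:1.7
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
# Persist npm's download cache across builds, even when this layer rebuilds
RUN --mount=type=cache,target=/root/.npm \
    npm ci
```

Note the cache mount exists only during the RUN step; it never becomes part of the image, so it adds nothing to the final size.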
distroless and scratch — Minimal Bases
Alpine is small, but there is room to go even smaller. For static binaries, you can target scratch or gcr.io/distroless/*.
scratch — Truly Nothing
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 go build -ldflags='-s -w' -o /out/server ./cmd/server
FROM scratch
COPY --from=builder /out/server /server
ENTRYPOINT ["/server"]
scratch is an empty image. No shell, no libc. A fully static binary built with CGO_ENABLED=0 in Go works on it. However, docker exec ... sh is impossible, making debugging difficult.
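A practical gotcha with scratch: it also contains no CA certificates or timezone data, so outbound TLS calls fail out of the box. The usual fix is to install them in the builder and copy them over (a sketch; the apk package names follow Alpine's repositories):

```dockerfile
FROM golang:1.22-alpine AS builder
# ca-certificates and tzdata will be copied into the scratch image below
RUN apk add --no-cache ca-certificates tzdata
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 go build -ldflags='-s -w' -o /out/server ./cmd/server

FROM scratch
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /usr/share/zoneinfo /usr/share/zoneinfo
COPY --from=builder /out/server /server
ENTRYPOINT ["/server"]
```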
distroless — Bare Minimum Runtime
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 go build -o /out/server ./cmd/server
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=builder /out/server /server
USER nonroot:nonroot
ENTRYPOINT ["/server"]
distroless is a family of images from Google containing “only the bare minimum needed to run apps.” With no shell, the attack surface is small, and the nonroot tag starts the container as a non-root user. static-debian12 suits static binaries like Go, base-debian12 adds glibc for dynamically linked binaries, and the Java variants (e.g., java21-debian12) bundle a JRE.
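distroless also publishes :debug tag variants that include a busybox shell, which softens the no-shell debugging problem. One way to use them is a separate debug stage selected at build time (a sketch; the --target mechanism is covered in the Build Stage Targeting section):

```dockerfile
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 go build -o /out/server ./cmd/server

# Debug variant: identical layout, but the base ships a busybox shell
FROM gcr.io/distroless/static-debian12:debug-nonroot AS debug
COPY --from=builder /out/server /server
ENTRYPOINT ["/server"]

# Production variant: no shell at all
FROM gcr.io/distroless/static-debian12:nonroot AS runtime
COPY --from=builder /out/server /server
ENTRYPOINT ["/server"]
```

Build the production image normally, and rebuild with --target debug only when you need to shell in.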
Comparison
| Base | Size | Shell | libc | Use Case |
|---|---|---|---|---|
| ubuntu:22.04 | ~77MB | yes | glibc | General purpose |
| debian:12-slim | ~75MB | yes | glibc | General purpose |
| alpine:3.20 | ~7MB | yes | musl | Lightweight |
| distroless/base | ~20MB | no | glibc | Runtime |
| distroless/static | ~2MB | no | none | Static binaries |
| scratch | 0MB | no | none | Static binaries |
The size numbers are approximations and drift by a few MB between releases. The point stands regardless: ship only what the app needs at runtime.
Build Stage Targeting
In multi-stage builds, you can build only a specific stage. This is useful for patterns where you have a separate test stage:
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 go build -o /out/server ./cmd/server
FROM builder AS tester
RUN go test ./...
FROM alpine:3.20 AS runtime
COPY --from=builder /out/server /server
ENTRYPOINT ["/server"]
In CI, use --target tester to run tests only, and build the deployment image with --target runtime:
docker build --target tester -t app:test .
docker build --target runtime -t app:prod .
Practical Checklist
- Build tools/SDK go in the builder stage; runtime uses slim/alpine/distroless
- COPY dependency files before source to preserve cache layers
- Add node_modules, .git, test resources, etc. to .dockerignore to reduce build context
- Separate build args/secrets to prevent sensitive information from entering the image (covered in Parts 10 and 11)
- Specify a non-root USER for production (covered in Part 10)
- Periodically clean up dangling build cache with docker builder prune
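For the .dockerignore item above, a typical starting point might look like this (the entries are illustrative; tailor them to your project):

```
.git
node_modules
dist/
build/
coverage/
*.log
.env*
docker-compose*.yml
```

Anything listed here never enters the build context, which speeds up the context upload and keeps stray files out of COPY . . layers.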
Where We Stand
Now you can choose whether to wrap the same app as a massive 350MB image or a lean 15MB image. Smaller images mean faster deployments, lower network costs, and a reduced attack surface.
In the next part, we cover where to put these images — the registry. From Docker Hub to ECR/GCR/ACR and self-hosted Harbor — authentication and tagging strategies from a practical perspective.
