Table of contents
- What We Want to Prevent
- HEALTHCHECK Directive
- Graceful Shutdown — SIGTERM Handling
- tini / dumb-init — PID 1 Responsibilities
- Log Drivers — The Default Is Dangerous
- Resource Limits — Don’t Let One Container Ruin the Neighborhood
- Restart Policy
- Time Synchronization and Timezone
- Final Checklist
- Where We Stand
What We Want to Prevent
Most operational incidents fall into a few patterns:
flowchart TB
PROD["Production issue"] --> H["Missing health check"]
PROD --> S["Request loss on shutdown"]
PROD --> L["Log disk saturation"]
PROD --> R["Resource explosion / OOM"]
PROD --> Z["Zombie processes / signal delivery failure"]
H -->|HEALTHCHECK / readiness| F1["Liveness/readiness separation"]
S -->|SIGTERM handling + drain| F2["Graceful shutdown"]
L -->|Driver + rotation| F3["Log management"]
R -->|--memory / --cpus| F4["Resource limits"]
Z -->|tini / dumb-init| F5["PID 1 handling"]
Let’s look at each one.
HEALTHCHECK Directive
You can define a health check inside the Dockerfile. Docker Engine runs it periodically to maintain the container’s health status.
FROM alpine:3.20
RUN apk add --no-cache curl
# builder stage omitted for brevity
COPY --from=builder /out/server /server
EXPOSE 8080
HEALTHCHECK --interval=15s --timeout=3s --start-period=30s --retries=3 \
CMD curl -fsS http://localhost:8080/healthz || exit 1
ENTRYPOINT ["/server"]
- `--interval`: How often the check runs
- `--timeout`: How long to wait for a response
- `--start-period`: Grace period for initial app startup (failures during this period do not count as unhealthy)
- `--retries`: Number of consecutive failures before the container is marked unhealthy
docker ps shows a (healthy) label, and when combined with depends_on: condition: service_healthy in Compose, the startup order control from Part 7 works correctly.
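In Compose, that gate looks like the following (service names and the `pg_isready` check are illustrative — substitute whatever health command fits your dependency):

```yaml
services:
  db:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 3s
      retries: 5
  web:
    image: myapp:1.4.2
    depends_on:
      db:
        condition: service_healthy  # web starts only after db reports healthy
```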
Kubernetes Is Slightly Different
Kubernetes ignores the Dockerfile’s HEALTHCHECK. Instead, it offers more granular probes in the Pod spec:
containers:
- name: web
image: myapp:1.4.2
startupProbe:
httpGet: { path: /healthz, port: 8080 }
failureThreshold: 30
periodSeconds: 10
livenessProbe:
httpGet: { path: /healthz, port: 8080 }
periodSeconds: 15
failureThreshold: 3
readinessProbe:
httpGet: { path: /ready, port: 8080 }
periodSeconds: 5
- startup: Replaces liveness/readiness until initial startup completes. Useful for apps with long startup times
- liveness: Is it alive? — On failure, kubelet restarts the container
- readiness: Is it ready to receive traffic? — On failure, the endpoint is removed from the Service
Distinguishing liveness and readiness at the endpoint level is important. liveness should only fail when the app is truly dead. If liveness fails due to something like a temporary DB outage, it causes a restart loop.
Graceful Shutdown — SIGTERM Handling
During deployment, the old container is taken down and a new one is started. Graceful shutdown means the old container finishes its in-flight requests before exiting.
sequenceDiagram
participant Orch as Orchestrator
participant Proxy as Load Balancer
participant App as App container
Orch->>Proxy: Remove endpoint (readiness=false)
Orch->>App: SIGTERM
App->>App: Start refusing new requests
App->>App: Complete in-flight requests
App->>App: Close DB/queue connections
App->>Orch: Normal exit (exit 0)
Orch->>App: (SIGKILL on timeout)
The actual implementation varies by language, but the common pattern is:
Go example:
ctx, cancel := signal.NotifyContext(context.Background(), syscall.SIGTERM, syscall.SIGINT)
defer cancel()
srv := &http.Server{Addr: ":8080", Handler: router}
go func() {
if err := srv.ListenAndServe(); err != http.ErrServerClosed {
log.Fatal(err)
}
}()
<-ctx.Done()
shutdownCtx, c := context.WithTimeout(context.Background(), 30*time.Second)
defer c()
_ = srv.Shutdown(shutdownCtx)
Node.js example:
const server = app.listen(8080);
process.on('SIGTERM', () => {
server.close(() => process.exit(0));
setTimeout(() => process.exit(1), 30_000).unref();
});
The orchestrator side also needs to give time. In Kubernetes, terminationGracePeriodSeconds and preStop hooks serve this role:
spec:
terminationGracePeriodSeconds: 60
containers:
- name: web
lifecycle:
preStop:
exec:
command: ["sh", "-c", "sleep 10"] # Time for LB to drain
The preStop sleep gives the LB/Service time to remove this pod from the endpoint list. If it is too short, already-routed requests are lost.
When Signals Don’t Reach the App
A common mistake: if CMD or ENTRYPOINT in the Dockerfile is written in shell form, sh -c becomes PID 1 and the app becomes a child process. SIGTERM goes to sh, and the app never receives it.
# Bad — shell form
CMD node dist/server.js
# Good — exec form
CMD ["node", "dist/server.js"]
Exec form makes the app PID 1 directly, receiving signals without intermediaries.
tini / dumb-init — PID 1 Responsibilities
PID 1 is special in Unix. It must reap zombie processes, and signals are delivered to it differently than to ordinary processes. Runtimes like Node.js and Python do not take on these responsibilities, so apps that spawn child processes can accumulate zombies that clog the process table.
The fix is lightweight: use tini (or dumb-init) as the init process.
FROM node:20-alpine
RUN apk add --no-cache tini
WORKDIR /app
COPY . .
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "dist/server.js"]
Docker itself provides the same effect with the --init option:
docker run --init myapp:1.4.2
In Compose, set init: true:
services:
web:
image: myapp:1.4.2
init: true
In Kubernetes, this is generally not a major issue for containers with few child processes, but if the app forks shell scripts or multiple processes, consider using shareProcessNamespace along with an init process.
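A minimal sketch of that Pod-level option — with a shared process namespace, the pod's pause container becomes PID 1 for the whole pod and reaps zombies on behalf of every container in it:

```yaml
spec:
  shareProcessNamespace: true   # pause container becomes PID 1 and reaps zombies
  containers:
  - name: web
    image: myapp:1.4.2
```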
Log Drivers — The Default Is Dangerous
Docker’s default log driver is json-file. As the name suggests, it writes logs as JSON to a local file. Left unconfigured, logs keep accumulating until the disk is full.
Rotation Configuration
// /etc/docker/daemon.json
{
"log-driver": "json-file",
"log-opts": {
"max-size": "10m",
"max-file": "5"
}
}
Add this at the daemon level and apply with systemctl restart docker. When a file exceeds 10MB, it rolls over to a new file, keeping a maximum of 5 files. Per-container overrides are also possible:
docker run --log-driver json-file \
--log-opt max-size=10m --log-opt max-file=5 \
myapp:1.4.2
Other Drivers
Several drivers exist for sending logs to different destinations:
| Driver | Purpose |
|---|---|
| `json-file` | Local file (default) |
| `local` | Local + compressed |
| `journald` | To systemd journal |
| `syslog` | To a syslog server |
| `fluentd` | To a fluentd agent |
| `gelf` | To Graylog or other GELF receivers |
| `awslogs` | To CloudWatch Logs |
| `gcplogs` | To GCP Cloud Logging |
The standard production pattern is for containers to write only to stdout/stderr, while a collector like Fluent Bit or Vector on the host forwards logs externally. That way you do not need to change the driver at the container level, and the collector handles buffering and re-routing.
flowchart LR
APP["App container"] -->|stdout/stderr| FILE["/var/lib/docker/containers/<id>/<id>-json.log"]
FILE --> AGENT["Fluent Bit / Vector"]
AGENT --> ES["Elasticsearch / Loki"]
AGENT --> S3["S3 / GCS"]
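A minimal Fluent Bit configuration for this pattern might look like the following sketch — the `docker` parser ships with Fluent Bit's default parsers, but the `elasticsearch` host name is an assumption; adjust the output section to your actual destination:

```ini
[INPUT]
    Name    tail
    Path    /var/lib/docker/containers/*/*-json.log
    Parser  docker
    Tag     docker.*

[OUTPUT]
    Name    es
    Match   docker.*
    Host    elasticsearch
    Port    9200
```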
Resource Limits — Don’t Let One Container Ruin the Neighborhood
Without resource limits, a single memory leak can shake the entire host.
docker run \
--memory=512m --memory-swap=512m \
--cpus=1.0 \
--pids-limit=200 \
myapp:1.4.2
- `--memory`: Memory ceiling. Exceeding it triggers the OOM killer to terminate the container
- `--memory-swap`: Ceiling including swap. Default is 2x `--memory`. Set it to the same value to disable swap
- `--cpus`: Number of CPUs (decimals allowed)
- `--pids-limit`: Caps the process count (fork bomb defense)
In Compose, these go under deploy.resources:
services:
web:
image: myapp:1.4.2
deploy:
resources:
limits:
cpus: "1.0"
memory: 512M
reservations:
cpus: "0.25"
memory: 128M
deploy.resources was originally a Swarm-only key (the older docker-compose honored it only with --compatibility), but the Compose Spec implemented by the current docker compose CLI applies it to regular deployments as well.
Kubernetes uses resources.requests and resources.limits for the same concept. The scheduler places pods based on requests, and limits become the runtime ceiling.
resources:
requests:
cpu: 250m
memory: 128Mi
limits:
cpu: 1
memory: 512Mi
JVM Apps Need Attention
The JVM historically ignored container limits and sized its heap from total host memory. Container awareness (-XX:+UseContainerSupport) has been on by default since JDK 10 (backported to 8u191), but on older versions, or without an explicit -Xmx, the problem can still occur.
java -XX:MaxRAMPercentage=75 -jar app.jar
Specifying what percentage of the container memory limit to allocate to the heap via -XX:MaxRAMPercentage is the safe pattern.
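Putting that together in an image might look like this sketch — the base image and jar path are illustrative:

```dockerfile
FROM eclipse-temurin:21-jre
COPY app.jar /app.jar
# Heap is sized as 75% of the container's memory limit, not of host memory
ENTRYPOINT ["java", "-XX:MaxRAMPercentage=75", "-jar", "/app.jar"]
```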
Restart Policy
Set a policy so that containers automatically come back up when the host reboots or the container dies:
docker run --restart=unless-stopped myapp:1.4.2
- `no` (default): No restart
- `on-failure[:N]`: Only on abnormal exit, up to N times
- `always`: Always (also restarts on daemon restart, even after a manual stop)
- `unless-stopped`: Always (but if manually stopped, stays stopped even on daemon restart)
For single-host production deployments (Compose), unless-stopped is the safe default. Kubernetes already manages restarts via Deployments, so a separate policy is not needed.
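In Compose that is a one-line key per service (a sketch):

```yaml
services:
  web:
    image: myapp:1.4.2
    restart: unless-stopped
```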
Time Synchronization and Timezone
Whether log timestamps are in UTC or KST — establishing a team standard matters. Embed the timezone in the image:
FROM alpine:3.20
RUN apk add --no-cache tzdata
ENV TZ=Asia/Seoul
Alpine does not include tzdata by default, so install it explicitly. If UTC is preferred, use TZ=UTC. When containers have mixed timezones, log correlation analysis becomes a nightmare.
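If you prefer to keep images timezone-neutral, the same variable can instead be set per environment — a Compose sketch, assuming the image still contains tzdata:

```yaml
services:
  web:
    image: myapp:1.4.2
    environment:
      TZ: UTC   # overrides the image's TZ for this deployment
```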
Final Checklist
Verify the following before production deployment:
- `CMD`/`ENTRYPOINT` is written in exec form
- HEALTHCHECK or Kubernetes probes are configured
- The app handles SIGTERM and implements graceful shutdown
- `terminationGracePeriodSeconds` or the Compose shutdown timeout is generous enough
- Logs go to stdout/stderr and driver rotation is configured
- `--memory`, `--cpus`, or resource limits are set
- A restart policy is specified
- `--init` or tini is used to avoid PID 1 issues
- Security settings from Part 10 are applied together
Where We Stand
We have covered the major pillars of baking images well, distributing them well, and running them well. The final part covers what to do when things go wrong, and what alternatives to Docker exist.
The next part is troubleshooting and alternatives. The meaning of common error codes, when to use logs/inspect/events/stats/top, and differences with alternatives like Podman and containerd — wrapping up the series.
