Table of contents
- What We Want to Prevent
- HEALTHCHECK Directive
- Graceful Shutdown — SIGTERM Handling
- tini / dumb-init — PID 1 Responsibilities
- Log Drivers — The Default Is Dangerous
- Resource Limits — Don’t Let One Container Ruin the Neighborhood
- Restart Policy
- Time Synchronization and Timezone
- Final Checklist
- Where We Stand
What We Want to Prevent
Most operational incidents fall into a few patterns:
flowchart TB
PROD["Production issue"] --> H["Missing health check"]
PROD --> S["Request loss on shutdown"]
PROD --> L["Log disk saturation"]
PROD --> R["Resource explosion / OOM"]
PROD --> Z["Zombie processes / signal delivery failure"]
H -->|HEALTHCHECK / readiness| F1["Liveness/readiness separation"]
S -->|SIGTERM handling + drain| F2["Graceful shutdown"]
L -->|Driver + rotation| F3["Log management"]
R -->|--memory / --cpus| F4["Resource limits"]
Z -->|tini / dumb-init| F5["PID 1 handling"]
Let’s look at each one.
HEALTHCHECK Directive
You can define a health check inside the Dockerfile. Docker Engine runs it periodically to maintain the container’s health status.
FROM alpine:3.20
RUN apk add --no-cache curl
# builder stage omitted for brevity
COPY --from=builder /out/server /server
EXPOSE 8080
HEALTHCHECK --interval=15s --timeout=3s --start-period=30s --retries=3 \
CMD curl -fsS http://localhost:8080/healthz || exit 1
ENTRYPOINT ["/server"]
- `--interval`: How often the check runs
- `--timeout`: How long to wait for a response
- `--start-period`: Grace period for initial app startup (failures during this period do not count as unhealthy)
- `--retries`: Number of consecutive failures before the container is marked unhealthy
docker ps shows a (healthy) label, and when combined with depends_on: condition: service_healthy in Compose, the startup order control from Part 7 works correctly.
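In Compose, that gate looks like the following (service names and the `pg_isready` check are illustrative — substitute whatever health command fits your dependency):

```yaml
services:
  db:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 3s
      retries: 5
  web:
    image: myapp:1.4.2
    depends_on:
      db:
        condition: service_healthy  # web starts only after db reports healthy
```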
Kubernetes Is Slightly Different
Kubernetes ignores the Dockerfile’s HEALTHCHECK. Instead, it offers more granular probes in the Pod spec:
containers:
- name: web
image: myapp:1.4.2
startupProbe:
httpGet: { path: /healthz, port: 8080 }
failureThreshold: 30
periodSeconds: 10
livenessProbe:
httpGet: { path: /healthz, port: 8080 }
periodSeconds: 15
failureThreshold: 3
readinessProbe:
httpGet: { path: /ready, port: 8080 }
periodSeconds: 5
- startup: Replaces liveness/readiness until initial startup completes. Useful for apps with long startup times
- liveness: Is it alive? — On failure, kubelet restarts the container
- readiness: Is it ready to receive traffic? — On failure, the endpoint is removed from the Service
Distinguishing liveness and readiness at the endpoint level is important. liveness should only fail when the app is truly dead. If liveness fails due to something like a temporary DB outage, it causes a restart loop.
Graceful Shutdown — SIGTERM Handling
During deployment, the old container is taken down and a new one is started. Graceful shutdown means the old container finishes its in-flight requests before exiting.
sequenceDiagram
participant Orch as Orchestrator
participant Proxy as Load Balancer
participant App as App container
Orch->>Proxy: Remove endpoint (readiness=false)
Orch->>App: SIGTERM
App->>App: Start refusing new requests
App->>App: Complete in-flight requests
App->>App: Close DB/queue connections
App->>Orch: Normal exit (exit 0)
Orch->>App: (SIGKILL on timeout)
The actual implementation varies by language, but the common pattern is:
Go example:
ctx, cancel := signal.NotifyContext(context.Background(), syscall.SIGTERM, syscall.SIGINT)
defer cancel()
srv := &http.Server{Addr: ":8080", Handler: router}
go func() {
if err := srv.ListenAndServe(); err != http.ErrServerClosed {
log.Fatal(err)
}
}()
<-ctx.Done()
shutdownCtx, c := context.WithTimeout(context.Background(), 30*time.Second)
defer c()
_ = srv.Shutdown(shutdownCtx)
Node.js example:
const server = app.listen(8080);
process.on('SIGTERM', () => {
server.close(() => process.exit(0));
setTimeout(() => process.exit(1), 30_000).unref();
});
The orchestrator side also needs to give time. In Kubernetes, terminationGracePeriodSeconds and preStop hooks serve this role:
spec:
terminationGracePeriodSeconds: 60
containers:
- name: web
lifecycle:
preStop:
exec:
command: ["sh", "-c", "sleep 10"] # Time for LB to drain
The preStop sleep gives the LB/Service time to remove this pod from the endpoint list. If it is too short, already-routed requests are lost.
When Signals Don’t Reach the App
A common mistake: if CMD or ENTRYPOINT in the Dockerfile is written in shell form, sh -c becomes PID 1 and the app becomes a child process. SIGTERM goes to sh, and the app never receives it.
# Bad — shell form
CMD node dist/server.js
# Good — exec form
CMD ["node", "dist/server.js"]
Exec form makes the app PID 1 directly, receiving signals without intermediaries.
tini / dumb-init — PID 1 Responsibilities
PID 1 is special in Unix. It must reap zombie processes, and signals are delivered to it differently than to ordinary processes. Runtimes like Node.js and Python do not take on these responsibilities, so apps that spawn child processes can accumulate zombies that clog the process table.
The fix is lightweight: use tini (or dumb-init) as the init process.
FROM node:20-alpine
RUN apk add --no-cache tini
WORKDIR /app
COPY . .
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "dist/server.js"]
Docker itself provides the same effect with the --init option:
docker run --init myapp:1.4.2
In Compose, set init: true:
services:
web:
image: myapp:1.4.2
init: true
In Kubernetes, this is generally not a major issue for containers with few child processes, but if the app forks shell scripts or multiple processes, consider using shareProcessNamespace along with an init process.
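A minimal sketch of that Pod-level option — with a shared process namespace, the pod's pause container becomes PID 1 for the whole pod and reaps zombies on behalf of every container in it:

```yaml
spec:
  shareProcessNamespace: true   # pause container becomes PID 1 and reaps zombies
  containers:
  - name: web
    image: myapp:1.4.2
```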
Log Drivers — The Default Is Dangerous
Docker’s default log driver is json-file. As the name suggests, it writes logs as JSON to a local file. Left unconfigured, logs keep accumulating until the disk is full.
Rotation Configuration
// /etc/docker/daemon.json
{
"log-driver": "json-file",
"log-opts": {
"max-size": "10m",
"max-file": "5"
}
}
Add this at the daemon level and apply with systemctl restart docker. When a file exceeds 10MB, it rolls over to a new file, keeping a maximum of 5 files. Per-container overrides are also possible:
docker run --log-driver json-file \
--log-opt max-size=10m --log-opt max-file=5 \
myapp:1.4.2
Other Drivers
Several drivers exist for sending logs to different destinations:
| Driver | Purpose |
|---|---|
| `json-file` | Local file (default) |
| `local` | Local + compressed |
| `journald` | To systemd journal |
| `syslog` | To a syslog server |
| `fluentd` | To a fluentd agent |
| `gelf` | To Graylog or other GELF receivers |
| `awslogs` | To CloudWatch Logs |
| `gcplogs` | To GCP Cloud Logging |
The standard production pattern is for containers to write only to stdout/stderr, while a collector like Fluent Bit or Vector on the host forwards logs externally. That way you do not need to change the driver at the container level, and the collector handles buffering and re-routing.
flowchart LR
APP["App container"] -->|stdout/stderr| FILE["/var/lib/docker/containers/<id>/<id>-json.log"]
FILE --> AGENT["Fluent Bit / Vector"]
AGENT --> ES["Elasticsearch / Loki"]
AGENT --> S3["S3 / GCS"]
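A minimal Fluent Bit configuration for this pattern might look like the following sketch — the `docker` parser ships with Fluent Bit's default parsers, but the `elasticsearch` host name is an assumption; adjust the output section to your actual destination:

```ini
[INPUT]
    Name    tail
    Path    /var/lib/docker/containers/*/*-json.log
    Parser  docker
    Tag     docker.*

[OUTPUT]
    Name    es
    Match   docker.*
    Host    elasticsearch
    Port    9200
```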
Resource Limits — Don’t Let One Container Ruin the Neighborhood
Without resource limits, a single memory leak can shake the entire host.
docker run \
--memory=512m --memory-swap=512m \
--cpus=1.0 \
--pids-limit=200 \
myapp:1.4.2
- `--memory`: Memory ceiling. Exceeding it triggers the OOM killer to terminate the container
- `--memory-swap`: Ceiling including swap. Default is 2x `--memory`. Set it to the same value to disable swap
- `--cpus`: Number of CPUs (decimals allowed)
- `--pids-limit`: Caps the process count (fork bomb defense)
In Compose, these go under deploy.resources:
services:
web:
image: myapp:1.4.2
deploy:
resources:
limits:
cpus: "1.0"
memory: 512M
reservations:
cpus: "0.25"
memory: 128M
deploy.resources was originally a Swarm-only key (the older docker-compose honored it only with --compatibility), but the Compose Spec implemented by the current docker compose CLI applies it to regular deployments as well.
Kubernetes uses resources.requests and resources.limits for the same concept. The scheduler places pods based on requests, and limits become the runtime ceiling.
resources:
requests:
cpu: 250m
memory: 128Mi
limits:
cpu: 1
memory: 512Mi
JVM Apps Need Attention
The JVM historically ignored container limits and sized its heap from total host memory. Container awareness (-XX:+UseContainerSupport) has been on by default since JDK 10 (backported to 8u191), but on older versions, or without an explicit -Xmx, the problem can still occur.
java -XX:MaxRAMPercentage=75 -jar app.jar
Specifying what percentage of the container memory limit to allocate to the heap via -XX:MaxRAMPercentage is the safe pattern.
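Putting that together in an image might look like this sketch — the base image and jar path are illustrative:

```dockerfile
FROM eclipse-temurin:21-jre
COPY app.jar /app.jar
# Heap is sized as 75% of the container's memory limit, not of host memory
ENTRYPOINT ["java", "-XX:MaxRAMPercentage=75", "-jar", "/app.jar"]
```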
Restart Policy
Set a policy so that containers automatically come back up when the host reboots or the container dies:
docker run --restart=unless-stopped myapp:1.4.2
- `no` (default): No restart
- `on-failure[:N]`: Only on abnormal exit, up to N times
- `always`: Always (also restarts on daemon restart, even after a manual stop)
- `unless-stopped`: Always (but if manually stopped, stays stopped even on daemon restart)
For single-host production deployments (Compose), unless-stopped is the safe default. Kubernetes already manages restarts via Deployments, so a separate policy is not needed.
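In Compose that is a one-line key per service (a sketch):

```yaml
services:
  web:
    image: myapp:1.4.2
    restart: unless-stopped
```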
Time Synchronization and Timezone
Whether log timestamps are in UTC or KST — establishing a team standard matters. Embed the timezone in the image:
FROM alpine:3.20
RUN apk add --no-cache tzdata
ENV TZ=Asia/Seoul
Alpine does not include tzdata by default, so install it explicitly. If UTC is preferred, use TZ=UTC. When containers have mixed timezones, log correlation analysis becomes a nightmare.
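If you prefer to keep images timezone-neutral, the same variable can instead be set per environment — a Compose sketch, assuming the image still contains tzdata:

```yaml
services:
  web:
    image: myapp:1.4.2
    environment:
      TZ: UTC   # overrides the image's TZ for this deployment
```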
Final Checklist
Verify the following before production deployment:
- `CMD`/`ENTRYPOINT` is written in exec form
- HEALTHCHECK or Kubernetes probes are configured
- The app handles SIGTERM and implements graceful shutdown
- `terminationGracePeriodSeconds` or the Compose shutdown timeout is generous enough
- Logs go to stdout/stderr and driver rotation is configured
- `--memory`, `--cpus`, or resource limits are set
- A restart policy is specified
- `--init` or tini is used to avoid PID 1 issues
- Security settings from Part 10 are applied together
Where We Stand
We have covered the major pillars of baking images well, distributing them well, and running them well. The final part covers what to do when things go wrong, and what alternatives to Docker exist.
The next part is troubleshooting and alternatives. The meaning of common error codes, when to use logs/inspect/events/stats/top, and differences with alternatives like Podman and containerd — wrapping up the series.
