Table of contents
- The Big Picture of Diagnosis
- Common Exit Code Interpretation
- Reading Logs
- inspect — All the Metadata
- events — What Happened in Chronological Order
- stats — Real-Time Resources
- top — Processes Inside the Container
- exec — Getting Inside the Container
- Common Error Patterns and Solutions
- Docker Alternatives
- Migration Considerations
- Wrapping Up the Docker Series
The Big Picture of Diagnosis
Here is a flow for deciding where to start when an incident occurs:
flowchart TB
START["Container anomaly"] --> Q1{"Visible in<br/>docker ps?"}
Q1 -->|No| NOEXIST["docker ps -a<br/>Check exit code"]
Q1 -->|Yes| Q2{"State is<br/>Running?"}
Q2 -->|Restarting| LOGS["docker logs<br/>+ events"]
Q2 -->|Running| Q3{"Are requests<br/>getting through?"}
Q3 -->|No| NET["inspect network<br/>+ check ports"]
Q3 -->|Slow/errors| STATS["stats / top<br/>Check resources"]
NOEXIST --> EXITC{"Exit Code"}
EXITC -->|0| NORMAL["Normal exit"]
EXITC -->|1| APP["App error"]
EXITC -->|125| CLI["docker command error"]
EXITC -->|126| PERM["Not executable (permission)"]
EXITC -->|127| NOT["Command not found"]
EXITC -->|137| OOM["SIGKILL (likely OOM)"]
EXITC -->|143| TERM["SIGTERM (normal shutdown path)"]
The exit code alone narrows the cause considerably. Let’s go through the common ones.
Common Exit Code Interpretation
| Exit code | Meaning | Common Cause |
|---|---|---|
| 0 | Normal exit | App finished as intended |
| 1 | Application-level error | Check stack trace |
| 125 | Docker command itself failed | Option/image name typo |
| 126 | Command inside container not executable | Permission denied, missing execute bit |
| 127 | Command not found | PATH issue, binary not present |
| 137 | Killed by SIGKILL | OOM killer or liveness failure |
| 139 | SIGSEGV | Native crash |
| 143 | Killed by SIGTERM | Orchestrator’s normal shutdown path |
The most misunderstood is 137. It is easy to immediately conclude “it’s OOM,” but in reality, Kubernetes liveness failure where kubelet sends SIGKILL also produces 137. If there is no OOM message in logs and the liveness config is strict, suspect the probe first.
Verification commands:
docker ps -a --format 'table {{.Names}}\t{{.Status}}\t{{.RunningFor}}'
docker inspect <container> --format '{{.State.ExitCode}} {{.State.OOMKilled}} {{.State.Error}}'
If OOMKilled is true, an out-of-memory kill is confirmed.
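As a quick sketch, the table above can be folded into a small shell helper that turns an exit code (for example, the value from docker inspect --format '{{.State.ExitCode}}') into a first hypothesis; the function name is illustrative:

```shell
#!/bin/sh
# Minimal sketch: map a container exit code to a first hypothesis.
# Feed it the value from: docker inspect --format '{{.State.ExitCode}}'
explain_exit() {
  case "$1" in
    0)   echo "clean exit" ;;
    1)   echo "application error: read the stack trace" ;;
    125) echo "docker CLI error: check options and image name" ;;
    126) echo "not executable: check permissions and the exec bit" ;;
    127) echo "command not found: check PATH and the binary" ;;
    137) echo "SIGKILL: check OOMKilled and liveness probes" ;;
    139) echo "SIGSEGV: native crash" ;;
    143) echo "SIGTERM: normal shutdown path" ;;
    *)   echo "unknown: codes above 128 usually mean signal $(( $1 - 128 ))" ;;
  esac
}

explain_exit 137   # SIGKILL: check OOMKilled and liveness probes
```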
Reading Logs
docker logs shows the container’s stdout/stderr as-is. Use -f to follow in real time.
docker logs <container>
docker logs -f <container>
docker logs --tail 200 <container>
docker logs --since 10m <container>
docker logs --timestamps <container>
With Compose, use the service name:
docker compose logs -f web
docker compose logs --tail 100 db
One important note: if the app only writes logs to a file and not stdout, docker logs shows nothing. You need to configure the app to output to standard out (Node: console.log, Python: sys.stdout, Spring Boot: remove logging.file.name config, etc.).
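If reconfiguring the app is not an option, a common Dockerfile workaround is to symlink the log files to the container’s standard streams, as the official nginx image does. The paths below are illustrative:

```dockerfile
# Redirect file-based logs to stdout/stderr so `docker logs` sees them.
# (Paths are illustrative; adjust to where the app actually writes.)
RUN mkdir -p /var/log/myapp \
 && ln -sf /dev/stdout /var/log/myapp/app.log \
 && ln -sf /dev/stderr /var/log/myapp/error.log
```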
inspect — All the Metadata
Extracts detailed information as JSON for containers, images, networks, and volumes.
# Container state
docker inspect <container>
# Specific fields only
docker inspect <container> --format '{{.State.Status}}'
docker inspect <container> --format '{{.NetworkSettings.IPAddress}}'
docker inspect <container> --format '{{range .Mounts}}{{.Source}} -> {{.Destination}}{{"\n"}}{{end}}'
# Image
docker inspect <image>
# Network
docker network inspect bridge
Fields that quickly narrow down issues:
- .State.Health.Status — HEALTHCHECK status
- .State.Health.Log — recent healthcheck results (e.g., the last 5)
- .Config.Env — environment variables (a problem if secrets are exposed here)
- .HostConfig.Memory, .HostConfig.NanoCpus — actually applied resource limits
- .NetworkSettings.Networks — networks joined and their IPs
events — What Happened in Chronological Order
docker events is Docker Daemon’s real-time event stream. Useful when you want to trace why a container died in chronological order.
# Real-time
docker events
# Filtered
docker events --filter 'event=die' --filter 'event=kill'
# Past events
docker events --since '1h' --until '10m'
Example output:
2026-04-20T12:34:56 container die 5e9a...(image=myapp:1.4.2, exitCode=137)
2026-04-20T12:34:56 container oom 5e9a...
If an oom event appears alongside, the OOM killer is confirmed.
stats — Real-Time Resources
stats shows per-container CPU/memory/network/IO in real time.
docker stats
docker stats <container> --no-stream
--no-stream prints once and exits. Useful for CI/monitoring scripts. If memory usage is approaching the limit, it is an OOM precursor.
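For scripting, the one-shot output can be piped through a filter that flags containers near their limit. A sketch, with sample stats output inlined for illustration (real input would come from docker stats --no-stream --format '{{.Name}} {{.MemPerc}}'):

```shell
#!/bin/sh
# Flag containers whose memory usage exceeds 90% of the limit.
# In practice the input comes from:
#   docker stats --no-stream --format '{{.Name}} {{.MemPerc}}'
# A sample is inlined here for illustration.
flag_hot() {
  awk '{ p = $2; sub(/%$/, "", p); if (p + 0 > 90) print $1 " is at " $2 " of its memory limit" }'
}

printf '%s\n' 'web 42.10%' 'worker 95.30%' | flag_hot
# worker is at 95.30% of its memory limit
```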
top — Processes Inside the Container
docker top shows the process list inside a container.
docker top <container>
docker top <container> auxf # Tree format
If PID 1 shows as sh -c ..., suspect the signal delivery issue discussed in Part 12. If there are more child processes than expected, also check for fork-related issues.
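The usual fix for the sh -c PID 1 problem is the exec-form entrypoint, so the application itself becomes PID 1 and receives signals directly (the command here is illustrative):

```dockerfile
# Shell form wraps the command in `sh -c`; the shell becomes PID 1
# and SIGTERM may never reach the app:
#   ENTRYPOINT node server.js
# Exec form runs the app as PID 1 directly:
ENTRYPOINT ["node", "server.js"]
```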
exec — Getting Inside the Container
Probably the most-used diagnostic command.
docker exec -it <container> sh
docker exec -it <container> bash # For images that have bash
docker exec <container> cat /proc/1/status
docker exec <container> env
For distroless or other shell-less images, you cannot attach a shell via exec. In that case, spin up a separate network debug container and attach it to the same network namespace:
docker run -it --rm \
--network container:<container> \
--pid container:<container> \
nicolaka/netshoot
nicolaka/netshoot is a debugging image packed with tools like curl, dig, tcpdump, and strace. Because it shares the network/PID namespace, you can inspect the target container’s ports and processes directly.
Common Error Patterns and Solutions
1. permission denied — Volume Permissions
open /app/logs/app.log: permission denied
Cause: A host directory is bind-mounted, but the non-root user UID inside the container does not match the host file owner UID.
Solution:
- Match the container’s UID to the host directory owner (--user 1000:1000)
- Or use a named volume (Docker handles permissions)
- Or run RUN chown -R followed by USER during the image build
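The build-time variant of the last option might look like this (Alpine-style user creation; the UID 1000 and paths are illustrative):

```dockerfile
# Create a non-root user whose UID matches the host directory owner,
# hand it the writable path, then drop privileges.
RUN addgroup -g 1000 app && adduser -D -u 1000 -G app app \
 && mkdir -p /app/logs && chown -R app:app /app/logs
USER app
```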
2. Error response from daemon: pull access denied
pull access denied for registry.example.com/myapp
Cause: Not logged into the registry, token expired, or repository path typo.
Solution:
docker login registry.example.com
docker pull registry.example.com/myapp:1.4.2
For ECR, the authorization token expires after 12 hours, so it must be refreshed at the start of each CI run.
3. address already in use
bind: address already in use
Cause: The port on the host is already being used by another process.
Solution:
lsof -i :8080 # Linux/macOS
# or
netstat -anp | grep 8080
# Find conflicting container
docker ps --filter "publish=8080"
4. no space left on device
Cause: The host disk is full, or Docker’s internal storage directory (/var/lib/docker) is full.
# Cleanup
docker system df # Check capacity
docker system prune # Clean stopped containers, unused networks/images
docker system prune -a --volumes # Above + unreferenced images + volumes (caution!)
docker builder prune # Clean build cache only
A CI job that periodically runs builder prune keeps things stable over time.
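On a long-lived CI runner this is usually a scheduled job; for example, a crontab entry (the schedule and retention are illustrative):

```shell
# Nightly at 03:00: drop build cache older than 72h and dangling images.
# (Retention is illustrative; tune it to the runner's build churn.)
0 3 * * * docker builder prune -f --filter 'until=72h' && docker image prune -f
```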
5. CrashLoopBackOff (Kubernetes)
The container exits immediately after starting. kubelet keeps restarting it while the backoff grows exponentially.
kubectl describe pod <pod> # Events, last exit code
kubectl logs <pod> -c <container>
kubectl logs <pod> -c <container> --previous # Previous instance logs
--previous is the decisive flag. The current container is already dead with empty logs, but the previous instance’s logs reveal why it died.
Docker Alternatives
Docker is the de facto standard, but it is not the only option. Understanding why alternatives exist enables informed choices for each situation.
flowchart TB
subgraph OCI["OCI Standards"]
OCI_SPEC["OCI Image Spec<br/>OCI Runtime Spec<br/>OCI Distribution Spec"]
end
subgraph TOOLS["High-level tools"]
DOCKER["Docker (dockerd)"]
PODMAN["Podman"]
NERDCTL["nerdctl"]
end
subgraph RUNTIME["Low-level runtimes"]
CONTAINERD["containerd"]
CRIO["CRI-O"]
RUNC["runc"]
end
DOCKER --> CONTAINERD
NERDCTL --> CONTAINERD
PODMAN --> RUNC
CONTAINERD --> RUNC
CRIO --> RUNC
OCI_SPEC -.standard.-> DOCKER
OCI_SPEC -.standard.-> PODMAN
OCI_SPEC -.standard.-> CONTAINERD
The key takeaway from this diagram is that images and runtimes are standardized. Images built with Docker can run on Podman or containerd as-is. Even if you switch tools, images remain compatible.
Podman — Daemonless + Rootless
Podman is an alternative led by Red Hat. Command compatibility is high (alias docker=podman works in many cases), with two key differences:
- No daemon: Docker runs the always-on dockerd daemon, which the CLI talks to. With Podman, each command starts containers directly, which makes it a natural fit for systemd (podman generate systemd).
- Rootless by default: Running podman as a regular user leverages user namespaces to run containers without root, reducing the attack surface.
# Most Docker commands work as-is
podman pull alpine:3.20
podman run -it alpine:3.20 sh
podman ps
podman build -t myapp:1.4.2 .
# Auto-generate systemd units
podman generate systemd --name myapp > myapp.service
However, Compose compatibility is less smooth. A separate podman-compose tool exists, and recently podman compose has been integrated, but it is not perfectly compatible.
When to choose Podman:
- Security environments requiring rootless (government, finance, etc.)
- When aligning with Red Hat-based OS standard tooling
- When managing containers as services via systemd
containerd — Kubernetes Standard Runtime
containerd is a low-level runtime originally extracted from Docker. Docker itself uses containerd under the hood, and it is also the default runtime most Kubernetes distributions use through the CRI. Since Kubernetes 1.24 removed dockershim, node runtimes are mostly containerd or CRI-O.
# containerd uses nerdctl instead of its own CLI
nerdctl pull alpine:3.20
nerdctl run -it alpine:3.20 sh
nerdctl build -t myapp:1.4.2 .
nerdctl compose up -d
nerdctl provides an interface nearly identical to the Docker CLI. With BuildKit integration, Compose support, and even image encryption, it is essentially “Docker without the Docker CLI.”
When you encounter containerd:
- When working with Kubernetes nodes (you are likely already using it)
- When checking pods/containers at the node level with
crictl
# On a Kubernetes node
crictl ps
crictl logs <container ID>
crictl exec -it <container ID> sh
CRI-O
CRI-O is a Kubernetes-only runtime. OpenShift uses it as the default. Its philosophy is to deliberately leave out features not needed by Kubernetes, making it simple and lightweight. Regular developers rarely use it directly — it falls under the platform team’s scope.
Migration Considerations
Main things to check when moving from Docker to Podman or containerd-based setups:
- Compose compatibility: docker compose files may not run as-is on Podman. Newer features like depends_on.condition, healthcheck, and profiles vary by tool version
- Docker socket dependency: If CI runners or observability tools depend on /var/run/docker.sock, you need to switch to Podman’s podman.sock or containerd’s containerd.sock
- User namespace mapping: In rootless mode UID mapping differs, potentially introducing new volume permission issues
- Build commands: docker build is BuildKit-based, while Podman’s buildah and nerdctl’s BuildKit integration are similar but not perfectly compatible
- Images are compatible: As the diagram above shows, images follow the OCI standard and work across tools, so this is the least of your concerns
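The socket-dependency check is easy to automate before a migration; a minimal sketch (the helper name and paths are illustrative):

```shell
#!/bin/sh
# List files under a directory that hard-code the Docker socket path.
# Each hit needs a decision: podman.sock, containerd.sock, or removal.
find_sock_refs() {
  grep -rl 'docker\.sock' "$1" 2>/dev/null
}

# Illustrative usage against a repo checkout:
#   find_sock_refs ./ci
```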
Wrapping Up the Docker Series
This concludes the 13-part journey. Starting from what a container is in Part 1, through images/networking/volumes fundamentals, Dockerfile, Compose, security and optimization, to production operations. Each part builds the foundation for the next. The healthcheck used in Part 7 Compose extends to probes in Part 12, and the secrets from Part 10 security are implemented with --mount=type=secret in Part 11 BuildKit.
Docker is not just a single tool — it is the gateway to the container ecosystem. The instincts built here apply whether you go to Kubernetes, Podman, or containerd. Images and namespaces, control groups and runtimes — the same building blocks, just different packaging.
The next step from this series depends on your needs. If orchestration is needed, go to Kubernetes. If security is the focus, explore Podman and image signing. For build optimization, dig deeper into BuildKit (remote builders, distributed caching, etc.). Whichever direction you take, the foundation built across these 13 parts will serve as the background.
To review the Docker series from the beginning, go back to Part 1. Reading from the start about why containers exist — with the pieces you have now built up — might reveal how they all fit together in a new light.
