ioob.dev

Docker for Beginners Part 4 — Container Lifecycle

Docker Series (4/13)
  1. Docker for Beginners Part 1 — What Is Docker
  2. Docker for Beginners Part 2 — Images and Layers
  3. Docker for Beginners Part 3 — Writing a Dockerfile
  4. Docker for Beginners Part 4 — Container Lifecycle
  5. Docker for Beginners Part 5 — Volumes and Data Persistence
  6. Docker for Beginners Part 6 — Networking
  7. Docker Part 7 — Multi-Container Orchestration with Docker Compose
  8. Docker Part 8 — Slimming Images with Multi-Stage Builds
  9. Docker Part 9 — Registry: Where Do Images Live?
  10. Docker Part 10 — Container Security: Blocking Issues Before They Blow Up
  11. Docker Part 11 — BuildKit and Advanced Builds
  12. Docker Part 12 — Production Best Practices
  13. Docker Part 13 — Troubleshooting and Alternatives
A Container Is a Process

An image is a still photo, and a container is a moving picture. A movie starts, pauses, and ends. Without understanding this lifecycle, you will encounter strange problems in operations: “I restarted the container but forgot -it and it immediately went to Exited,” “the server rebooted and the containers did not come back up,” “my deploy script keeps triggering SIGKILL.” All of these come from an insufficient understanding of the lifecycle.

In this part, we follow the entire journey of a container from creation to destruction.

State Transition Diagram

Let’s start with the state transitions at a glance:

stateDiagram-v2
    [*] --> created : docker create
    [*] --> running : docker run
    created --> running : docker start
    running --> paused : docker pause
    paused --> running : docker unpause
    running --> stopped : docker stop<br/>(SIGTERM → SIGKILL)
    running --> stopped : Process exits normally
    stopped --> running : docker start / restart
    stopped --> [*] : docker rm
    created --> [*] : docker rm
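The diagram can also be read as a lookup table. The following is a minimal sketch, not how Docker tracks state internally: the names TRANSITIONS, next, and walk are made up for illustration, and only the edges drawn above are modeled (real Docker accepts a few more, such as docker rm -f on a running container).

```javascript
// Transition table mirroring the diagram: state -> { command: nextState }.
// "none" and "removed" stand in for the diagram's initial and final [*] nodes.
const TRANSITIONS = {
  none:    { create: 'created', run: 'running' },
  created: { start: 'running', rm: 'removed' },
  running: { pause: 'paused', stop: 'stopped', exit: 'stopped' },
  paused:  { unpause: 'running' },
  stopped: { start: 'running', restart: 'running', rm: 'removed' },
};

// Return the state reached by applying one command, or throw if the
// diagram has no such edge.
function next(state, command) {
  const edges = TRANSITIONS[state] || {};
  if (!Object.prototype.hasOwnProperty.call(edges, command)) {
    throw new Error(`"${command}" is not a transition from "${state}" in the diagram`);
  }
  return edges[command];
}

// Apply a sequence of commands, starting from a container that does not exist yet.
function walk(commands, start = 'none') {
  return commands.reduce((state, cmd) => next(state, cmd), start);
}

console.log(walk(['run', 'pause', 'unpause', 'stop', 'start', 'stop', 'rm'])); // removed
```

Asking for an edge that is not drawn, for example next('paused', 'rm'), throws instead of guessing.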

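The key takeaways:

  1. docker run is docker create followed by docker start in a single step
  2. pause freezes the processes in place, while stop ends them
  3. A stopped container still exists and can be started again until docker rm deletes it
  4. docker stop sends SIGTERM first and falls back to SIGKILL only if the process does not exit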

docker run — The Most Commonly Typed Command

This is the basic command for spinning up a container. It has many options, but here are the ones you will repeatedly use in practice:

docker run \
  -d \
  --name web \
  --restart unless-stopped \
  -p 8080:80 \
  -e NODE_ENV=production \
  -v "$(pwd)/data:/data" \
  --memory 512m --cpus 0.5 \
  nginx:1.27

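This single line contains all the important options:

  1. -d runs the container in the background (detached)
  2. --name web gives the container a fixed name to use in later commands
  3. --restart unless-stopped sets the restart policy
  4. -p 8080:80 publishes container port 80 on host port 8080
  5. -e NODE_ENV=production sets an environment variable inside the container
  6. -v takes hostPath:containerPath and bind-mounts the host's ./data directory at /data
  7. --memory 512m --cpus 0.5 caps memory at 512 MiB and CPU at half a core
  8. nginx:1.27 is the image and tag to run, and it always comes after the options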

For quick tests, you will also frequently use patterns like these:

# One-off interactive shell, automatically removed on exit
docker run --rm -it ubuntu:24.04 bash

# Share the host network (for debugging)
docker run --rm --network host nginx:1.27

--rm means “automatically delete when finished.” -it is a combination of -i (attach standard input) and -t (allocate a TTY), used when running interactively like a shell.
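If you end up launching containers from a script, it can help to see how the long docker run command above maps onto plain data. This is a rough sketch under stated assumptions: dockerRunArgs and its option names are hypothetical, and it only covers the flags used in this part.

```javascript
// Turn an options object into the argument list for `docker run`.
// The function and its option names are illustrative, not a Docker API.
function dockerRunArgs({
  image, name, detach = false, restart,
  ports = [], env = {}, volumes = [], memory, cpus,
}) {
  const args = ['run'];
  if (detach) args.push('-d');
  if (name) args.push('--name', name);
  if (restart) args.push('--restart', restart);
  for (const p of ports) args.push('-p', p);                 // "hostPort:containerPort"
  for (const [k, v] of Object.entries(env)) args.push('-e', `${k}=${v}`);
  for (const v of volumes) args.push('-v', v);               // "hostPath:containerPath"
  if (memory) args.push('--memory', memory);
  if (cpus !== undefined) args.push('--cpus', String(cpus));
  args.push(image);                                          // the image comes after all options
  return args;
}

console.log(dockerRunArgs({
  image: 'nginx:1.27', name: 'web', detach: true, restart: 'unless-stopped',
  ports: ['8080:80'], env: { NODE_ENV: 'production' },
}).join(' '));
```

Passing the result as an array to child_process.execFile('docker', args) avoids shell quoting problems, for example a host path that contains spaces.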

A Container Is a One-Person Company with PID 1

A container has its own isolated PID namespace, and the specified process runs as PID 1 inside it. In Linux, PID 1 is a special entity:

  1. It must adopt orphaned processes (collect zombies via wait())
  2. Signals are not automatically forwarded. They must be explicitly received and passed to children
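  3. The kernel does not apply default signal actions to PID 1. A signal only has an effect if the process installed a handler for it, so a PID 1 without a SIGTERM handler simply ignores SIGTERM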

If you run a shell script as PID 1 without knowing this, odd things happen. In particular, docker stop appears to hang: the SIGTERM is ignored, and after the 10-second timeout Docker kills the process with SIGKILL.
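An app can check at startup whether it is in this position. The sketch below assumes a Node.js app; pid1Warnings is a hypothetical helper, and it takes the process object as a parameter only so that it can be tried with a stand-in.

```javascript
// Return warnings for a process object shaped like Node's global `process`.
// Only PID 1 is affected, so every other PID yields an empty list.
function pid1Warnings(proc = process) {
  const warnings = [];
  if (proc.pid !== 1) return warnings;
  if (proc.listenerCount('SIGTERM') === 0) {
    warnings.push('Running as PID 1 without a SIGTERM handler; docker stop may end in SIGKILL');
  }
  return warnings;
}

// Call this after registering signal handlers, near the end of startup.
for (const warning of pid1Warnings()) console.warn(warning);
```

Outside a container the PID is not 1, so the loop prints nothing.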

The Shell Form Trap of CMD

Let’s expand on what was briefly mentioned in Part 3. If you write this in your Dockerfile, it becomes a problem:

# shell form
CMD node server.js

Internally, this is executed as /bin/sh -c "node server.js". The process tree looks like this:

PID 1: /bin/sh -c "node server.js"
  └─ PID 7: node server.js

docker stop sends SIGTERM to PID 1. Because sh is PID 1 and has no SIGTERM handler, the signal is dropped, and sh does not forward it to its child. The Node app never gets the signal, and after 10 seconds the container is force-killed with SIGKILL. DB connections are severed and in-flight requests are lost.

Using exec form eliminates this problem:

# exec form
CMD ["node", "server.js"]

PID 1: node server.js   (runs directly, without a shell)

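With exec form, node itself is PID 1 and receives SIGTERM directly, so a SIGTERM handler in the app can run and shut down cleanly. The handler is not optional: as PID 1, Node.js gets no default SIGTERM action from the kernel, so without a handler (or an init process in front of it) the signal is still ignored.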
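The difference between the two forms comes down to how the text after CMD is parsed. The following is a simplified model rather than Docker's actual parser: pid1Argv is a made-up name, and it assumes the default Linux shell (/bin/sh -c), which a SHELL instruction can change.

```javascript
// Given the text after "CMD ", return the argv that ends up as PID 1.
function pid1Argv(cmdBody) {
  try {
    const parsed = JSON.parse(cmdBody);
    if (Array.isArray(parsed) && parsed.every((s) => typeof s === 'string')) {
      return parsed; // exec form: run exactly these arguments, no shell
    }
  } catch {
    // Not valid JSON: treated as shell form below.
  }
  return ['/bin/sh', '-c', cmdBody]; // shell form: the shell becomes PID 1
}

console.log(pid1Argv('node server.js'));        // [ '/bin/sh', '-c', 'node server.js' ]
console.log(pid1Argv('["node", "server.js"]')); // [ 'node', 'server.js' ]
console.log(pid1Argv("['node', 'server.js']")); // single quotes are not JSON, so this is shell form
```

The last call shows a common slip: exec form must be a JSON array with double quotes, otherwise the line quietly becomes shell form.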

Init processes like tini or dumb-init

Complex apps sometimes spawn multiple child processes. If PID 1 does not reap zombies, they accumulate over time. In such cases, a lightweight init process is used as PID 1.

# Docker ships a built-in init (tini), so you do not have to install one in the image
# Enable it per container with the --init option
docker run --init myapp

The --init flag places tini as PID 1 and runs the application underneath it. Tini receives signals, forwards them to children, and reaps zombies. Placing it in front of scripts that spawn child processes avoids both lost signals and zombie buildup.
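To make the signal-forwarding half of that job concrete, here is a rough sketch of a wrapper written in Node.js. It is not a replacement for tini: it does not reap orphaned processes that get re-parented to PID 1, and the names runUnderWrapper and wrapperExitCode are invented for this example.

```javascript
const { spawn } = require('node:child_process');
const os = require('node:os');

// Shell convention: a process killed by signal N is reported as exit code 128 + N.
function wrapperExitCode(code, signal) {
  if (signal) return 128 + os.constants.signals[signal];
  return code ?? 1;
}

// Start the real app as a child and pass stop signals on to it.
function runUnderWrapper(command, args, exit = (code) => process.exit(code)) {
  const child = spawn(command, args, { stdio: 'inherit' });
  for (const sig of ['SIGTERM', 'SIGINT']) {
    process.on(sig, () => child.kill(sig)); // forward instead of silently ignoring
  }
  // Exit with the same result the child had, so Docker sees the real exit code.
  child.on('exit', (code, signal) => exit(wrapperExitCode(code, signal)));
  return child;
}

// Hypothetical usage as an entrypoint script:
// runUnderWrapper('node', ['server.js']);
```

In practice, docker run --init is the simpler choice, since it covers zombie reaping as well.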

Handling SIGTERM in Your App

When Docker stops a container, it follows this sequence:

  1. Send SIGTERM to PID 1
  2. Wait 10 seconds by default (adjust per call with docker stop -t, or per container with docker run --stop-timeout)
  3. If still alive, force-kill with SIGKILL

This “10-second grace period” is the key. The app must cleanly finish within this time. Here is an example of attaching a handler in Node.js:

// Graceful shutdown for an Express app
const express = require('express');
const app = express();
const server = app.listen(3000);

function shutdown(signal) {
  console.log(`${signal} received, shutting down...`);
  server.close(() => {
    console.log('HTTP server closed');
    // Clean up DB connections, queue consumers, etc.
    process.exit(0);
  });

  // Failsafe timeout for forced exit
  setTimeout(() => {
    console.error('Forced exit');
    process.exit(1);
  }, 9000);
}

process.on('SIGTERM', () => shutdown('SIGTERM'));
process.on('SIGINT', () => shutdown('SIGINT'));

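server.close() stops accepting new connections and invokes the callback once the existing connections have ended. Idle keep-alive connections can delay that callback on older Node.js versions, which is one more reason to keep the failsafe. The 9-second timer forces the exit just before Docker’s 10-second timeout kicks in, so the process ends on its own terms instead of being hit by SIGKILL.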
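A variation on the handler above guards against a second signal and makes the exit call injectable, so the logic can be tried without ending the process. Treat it as a sketch under assumptions: createShutdown and its options are made-up names, and server is anything with an http.Server-style close(callback).

```javascript
// Build a shutdown function for an http.Server-like object.
function createShutdown(server, { graceMs = 9000, exit = (code) => process.exit(code) } = {}) {
  let shuttingDown = false;

  return function shutdown(signal) {
    if (shuttingDown) return; // a repeated SIGTERM/SIGINT must not restart the sequence
    shuttingDown = true;
    console.log(`${signal} received, shutting down...`);

    // Failsafe: give up shortly before Docker's own stop timeout.
    const failsafe = setTimeout(() => exit(1), graceMs);
    failsafe.unref(); // the timer alone should not keep the process alive

    server.close(() => {
      clearTimeout(failsafe);
      exit(0);
    });

    // Idle keep-alive sockets can hold close() open; this method exists in Node.js 18.2+.
    if (typeof server.closeIdleConnections === 'function') {
      server.closeIdleConnections();
    }
  };
}

// Hypothetical wiring:
// const shutdown = createShutdown(server);
// process.on('SIGTERM', () => shutdown('SIGTERM'));
// process.on('SIGINT', () => shutdown('SIGINT'));
```

If you raise the stop timeout for the container, raise graceMs with it and keep it a little below the Docker value.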

Other languages like Go and Java follow the same pattern: receive signal → block new requests → complete in-flight requests → clean up resources → exit.

start, stop, restart, rm — Commands That Move State

These are the commands for manipulating a container once it has been created:

# Stop — SIGTERM → 10-second wait → SIGKILL
docker stop web
docker stop --time=30 web   # Wait up to 30 seconds

# Start again
docker start web

# Stop, then start again in one command
docker restart web

# Force immediate termination (SIGKILL)
docker kill web
docker kill --signal=SIGUSR1 web   # Custom signals are also possible

# Remove (must be in stopped state)
docker rm web

# Force-remove a running container
docker rm -f web

Know the difference between docker stop and docker kill. stop gives a grace period before killing. kill terminates immediately. For routine redeployments, use stop. Use kill only when a process is stuck or unresponsive.
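How a stop ends is visible in the exit code afterwards. The function below is a deliberately simplified model of the SIGTERM, wait, SIGKILL sequence; stopExitCode and its parameters are invented for this sketch, and it assumes a handled shutdown exits with code 0.

```javascript
// Predict the exit code after `docker stop` for a simplified container.
//   sigterm: 'ignored' -> PID 1 has no handler, so the signal is dropped
//            'default' -> the process dies from SIGTERM (for example, the app runs under an init)
//            'handled' -> the app catches SIGTERM and exits by itself
//   shutdownSec: how long a handled shutdown takes
//   timeoutSec:  the stop timeout (10 seconds unless changed)
function stopExitCode({ sigterm, shutdownSec = 0, timeoutSec = 10 }) {
  const SIGKILL = 9;
  const SIGTERM = 15;
  if (sigterm === 'default') return 128 + SIGTERM;                   // 143
  if (sigterm === 'handled' && shutdownSec <= timeoutSec) return 0;  // clean exit in time
  return 128 + SIGKILL;                                              // 137: the grace period ran out
}

console.log(stopExitCode({ sigterm: 'handled', shutdownSec: 3 }));  // 0
console.log(stopExitCode({ sigterm: 'ignored' }));                  // 137
console.log(stopExitCode({ sigterm: 'handled', shutdownSec: 25 })); // 137
```

docker kill skips the grace period entirely, so it always lands in the 137 case.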

docker exec — Getting Inside a Running Container

Use this when you want to enter an already running container and execute commands. It is one of the most frequently used debugging commands.

# Attach a shell
docker exec -it web bash
# If the image has no bash
docker exec -it web sh

# Run a command once
docker exec web printenv NODE_ENV

# Enter as root (when the image specifies a USER)
docker exec -u root -it web bash

Inside the container, you typically run network tests, check logs, and inspect disk status. Note that packages installed or files changed via exec survive container restarts but disappear when the container is recreated (deleted then re-run). Operational changes should be reflected in images or volumes.

logs and inspect — Looking Inside the State

To see why a container died or what is happening right now:

# View logs
docker logs web
docker logs -f web              # Stream like tail -f
docker logs --tail 100 web      # Last 100 lines
docker logs --since 10m web     # Last 10 minutes
docker logs -t web              # Include timestamps

# Full container metadata (JSON)
docker inspect web

# Extract specific fields
docker inspect --format '{{.State.Status}}' web
docker inspect --format '{{.State.ExitCode}}' web
docker inspect --format '{{.RestartCount}}' web

The JSON output from docker inspect is massive, so using --format to extract only the fields you need is the practical approach. State.Status, State.ExitCode, and RestartCount are commonly checked in operations.
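When you need several fields at once, parsing the JSON in a script is an alternative to repeated --format calls. A small sketch: summarize is a hypothetical helper, and the sample document is made up and trimmed to the fields it reads.

```javascript
// `docker inspect` prints a JSON array with one object per container.
function summarize(inspectJson) {
  return JSON.parse(inspectJson).map((c) => ({
    name: c.Name,
    status: c.State.Status,
    exitCode: c.State.ExitCode,
    oomKilled: c.State.OOMKilled,
    restartCount: c.RestartCount,
  }));
}

// Made-up sample, not real output.
const sample = JSON.stringify([
  {
    Name: '/web',
    RestartCount: 3,
    State: { Status: 'exited', ExitCode: 137, OOMKilled: true },
  },
]);

console.log(summarize(sample));
```

In a real script, the input would come from running docker inspect web and reading its standard output.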

To view container resource usage in real time:

docker stats
# CONTAINER ID   NAME   CPU %   MEM USAGE / LIMIT   MEM %   NET I/O   BLOCK I/O

It is a tool with a similar feel to top.

Restart Policy

This option determines whether a container is automatically restarted when it exits. Specified with the --restart flag.

Policy            Behavior
no (default)      No automatic restart
on-failure[:N]    Restart only on non-zero exit codes. N is the max retry count
always            Restart regardless of exit code. Also starts when the Docker daemon starts, even if it was stopped manually
unless-stopped    Like always, but a container stopped with docker stop stays stopped across daemon restarts

docker run -d --restart unless-stopped --name web nginx:1.27

The most commonly used in practice is unless-stopped. It auto-recovers from abnormal exits but stays stopped when you intentionally stop it. It also comes back up automatically after a server reboot.

Using on-failure:3 means the app will restart up to 3 times when it crashes with an error. If it still fails after the third try, it gives up. This prevents a crash loop from running indefinitely and burning resources.
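The decision behind each policy fits in a few lines. This is a simplified model, not Docker's implementation: shouldRestart and its parameters are invented for illustration, and daemon restarts (where always and unless-stopped differ) are left out.

```javascript
// Decide whether a container that just exited would be restarted.
//   policy:          'no' | 'always' | 'unless-stopped' | 'on-failure' | 'on-failure:N'
//   exitCode:        exit code of the run that just ended
//   restartCount:    restarts already performed
//   manuallyStopped: true if the exit came from docker stop
function shouldRestart(policy, { exitCode, restartCount = 0, manuallyStopped = false }) {
  if (manuallyStopped) return false; // no policy restarts right after a manual stop
  const [name, max] = policy.split(':');
  switch (name) {
    case 'always':
    case 'unless-stopped':
      return true;
    case 'on-failure':
      if (exitCode === 0) return false;                        // clean exits are left alone
      return max === undefined || restartCount < Number(max);  // give up after N restarts
    default:
      return false; // 'no'
  }
}

console.log(shouldRestart('on-failure:3', { exitCode: 1, restartCount: 2 })); // true
console.log(shouldRestart('on-failure:3', { exitCode: 1, restartCount: 3 })); // false
```

With on-failure:3, the fourth crash in a row is where the model, like the policy, stops trying.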

With Docker Compose, you set these same policies through the restart key of each service. Orchestrators such as Kubernetes or Swarm manage restarts themselves, so Docker Engine-level restart policies are mainly for single-node operations.

Tracking “Why It Died” via Exit Code

When a container exits, an exit code is left behind. This number is the first clue for diagnosing the cause.

docker ps -a --filter "name=web" --format "{{.Names}}\t{{.Status}}"
# web   Exited (137) 2 seconds ago

Exit Code   Meaning
0           Normal exit
1           General error (app called exit(1) or threw an exception)
125         docker run itself failed (e.g., image not found)
126         Command in container is not executable (e.g., permission denied)
127         Command in container not found (not in PATH)
137         Killed by SIGKILL (9 + 128). docker kill or OOM
139         SIGSEGV segfault (11 + 128)
143         Killed by SIGTERM (15 + 128). Normal path of docker stop

137 is the most commonly encountered mystery. Usually it means the OOM Killer terminated the container for exceeding the memory limit, someone ran docker kill, or Kubernetes killed it due to a failed livenessProbe. Check dmesg or use docker inspect --format '{{.State.OOMKilled}}' web to confirm OOM.
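The 128 + N rule means most of the table can be computed instead of memorized. A minimal sketch, assuming it runs on Linux so that Node's signal numbers match the container's; explainExitCode is a made-up helper name.

```javascript
const os = require('node:os');

// Reverse lookup: signal number -> name, e.g. 9 -> 'SIGKILL'.
// Some numbers have aliases, so keep the first name seen.
const SIGNAL_NAMES = {};
for (const [name, num] of Object.entries(os.constants.signals)) {
  if (!(num in SIGNAL_NAMES)) SIGNAL_NAMES[num] = name;
}

function explainExitCode(code) {
  const fixed = {
    0: 'normal exit',
    1: 'general application error',
    125: 'docker run itself failed',
    126: 'command found but not executable',
    127: 'command not found',
  };
  if (code in fixed) return fixed[code];
  if (code > 128 && SIGNAL_NAMES[code - 128]) {
    return `killed by ${SIGNAL_NAMES[code - 128]} (128 + ${code - 128})`;
  }
  return 'application-defined error';
}

console.log(explainExitCode(137)); // killed by SIGKILL (128 + 9)
console.log(explainExitCode(143)); // killed by SIGTERM (128 + 15)
```

The exit code alone cannot tell docker kill apart from the OOM Killer, which is why State.OOMKilled is still worth checking.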

Operational Flow at a Glance

Here are the flows you meet most often in practice, collected into a single diagram:

flowchart TB
    BUILD["docker build -t app:1.2.3 ."] --> PUSH["docker push registry/app:1.2.3"]
    PUSH --> PULL["docker pull on server"]
    PULL --> STOP["docker stop old-app (graceful)"]
    STOP --> RM["docker rm old-app"]
    RM --> RUN["docker run -d --name app --restart unless-stopped ..."]
    RUN --> HEALTH{"HEALTHCHECK OK?"}
    HEALTH -->|Yes| DONE["Deployment complete"]
    HEALTH -->|No| LOGS["docker logs app<br/>Investigate cause"]
    LOGS --> ROLLBACK["Rollback: docker run with previous tag"]

On a single server, this flow can be wrapped up in a single shell script. When servers scale to multiple machines, you move to Compose, Swarm, or Kubernetes. Even then, the lifecycle of each individual container still follows these rules.
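The server-side half of the flow (from pull onward) might look like the sketch below. Several things are assumptions: deploy and its parameters are invented names, the old and new container share one name, the image defines a HEALTHCHECK, and the health status is read once, whereas a real script would poll until the status leaves "starting".

```javascript
const { execFileSync } = require('node:child_process');

// Default runner: call the real docker CLI and return its trimmed stdout.
const dockerCli = (args) => execFileSync('docker', args, { encoding: 'utf8' }).trim();

// `run` is injectable so the sequence can be exercised without Docker installed.
function deploy({ image, previousImage, name, run = dockerCli }) {
  const replaceWith = (img) => {
    try {
      run(['stop', name]); // graceful: SIGTERM first, SIGKILL after the timeout
      run(['rm', name]);
    } catch {
      // First deploy: there is no old container to replace.
    }
    run(['run', '-d', '--name', name, '--restart', 'unless-stopped', img]);
  };

  run(['pull', image]);
  replaceWith(image);

  const health = run(['inspect', '--format', '{{.State.Health.Status}}', name]);
  if (health === 'healthy') return 'deployed';

  console.error(run(['logs', '--tail', '50', name])); // keep the evidence before rolling back
  replaceWith(previousImage);
  return 'rolled-back';
}

// Hypothetical usage:
// deploy({ image: 'registry/app:1.2.3', previousImage: 'registry/app:1.2.2', name: 'app' });
```

Keeping the previous tag as an explicit parameter is what makes the rollback branch a single call.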

Five Principles to Always Keep in Mind

Finally, principles you should not forget in practice:

  1. Be mindful of PID 1. Write CMD in exec form, and use --init or tini when needed
  2. Catch SIGTERM in your app. Zero-downtime deployment is impossible without graceful shutdown
  3. unless-stopped is a safe default for restart policies. Adjust to on-failure as needed
  4. Trace causes via exit codes and logs. If you see 137, suspect OOM first
  5. Even Exited containers consume disk. Periodically clean up with docker container prune

In the next part, we talk about data that must survive even when a container dies. The difference between bind mounts and named volumes, the role of tmpfs, volume backup and restore, and the surprisingly common UID/GID permission issues.

Part 5: Volumes and Data Persistence

