Table of contents
- Pods Alone Aren’t Enough
- ReplicaSet: The Simplest Promise of N Pods
- Deployment: The Conductor Above ReplicaSets
- How Rolling Updates Achieve Zero Downtime
- Rollback: Reverting in One Command
- StatefulSet: For Pods That Need Their Own Identity
- DaemonSet: One Per Node
- All Four in One Table
- Choosing the Right One
Pods Alone Aren’t Enough
In Part 3, we ran a pod directly. But creating a pod by hand is something you almost never do in practice. The reason is simple: a pod created that way is gone for good when it dies.
Kubernetes’ real power comes from “controllers that manage pods declaratively.” We simply declare “always keep 3 nginx instances running,” and the controller maintains that state. When a pod dies, it revives it; when a deployment is updated, it gradually swaps old pods for new versions.
In this part, we’ll dissect the four most commonly used controllers. Each solves a different problem, so you need to be able to pick the right tool for the situation.
ReplicaSet: The Simplest Promise of N Pods
The most basic controller is the ReplicaSet. As the name suggests, it’s a “set of pod replicas.” A ReplicaSet guarantees exactly one thing: there are always N pods with the labels I manage.
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-rs
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
Creating this ReplicaSet brings up 3 pods with the app: nginx label. Even if you manually delete one, a new pod quickly fills its place.
kubectl apply -f rs.yaml
kubectl get pods -l app=nginx
kubectl delete pod <pod-name> # Delete one
kubectl get pods -l app=nginx # A new pod appears immediately
However, you rarely use ReplicaSets directly. They lack version update functionality. Even if you change the image tag from 1.25 to 1.26, the ReplicaSet leaves existing pods as-is and does nothing. Its judgment is simply “as long as there are 3, we’re good.”
Deployment: The Conductor Above ReplicaSets
The most used controller in practice is the Deployment. A Deployment is a higher-level controller that manages ReplicaSets and handles version updates and rollbacks.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: myapp:1.0
        ports:
        - containerPort: 8080
The hierarchy that a Deployment creates looks like this:
flowchart TB
D[Deployment: web]
RS1[ReplicaSet: web-abc123<br/>image: myapp:1.0]
P1[Pod 1]
P2[Pod 2]
P3[Pod 3]
D --> RS1
RS1 --> P1
RS1 --> P2
RS1 --> P3
Users only interact with the Deployment. The ReplicaSet and Pods are managed automatically by the Deployment.
How Rolling Updates Achieve Zero Downtime
Let’s upgrade the image version from 1.0 to 2.0:
kubectl set image deployment/web web=myapp:2.0
Or you can modify the YAML and apply it with kubectl apply -f. Let’s trace what happens after the Deployment detects the change:
sequenceDiagram
participant D as Deployment
participant OLD as ReplicaSet v1 (3)
participant NEW as ReplicaSet v2 (0)
D->>NEW: Create ReplicaSet (replicas=0)
D->>NEW: Scale up to replicas=1
Note over NEW: v2 pod 1 up, readiness passed
D->>OLD: Scale down to replicas=2
Note over OLD: 1 v1 pod removed
D->>NEW: Scale up to replicas=2
D->>OLD: Scale down to replicas=1
D->>NEW: Scale up to replicas=3
D->>OLD: Scale down to replicas=0
Note over OLD: All v1 pods removed
The Deployment creates a new ReplicaSet for the new version and gradually increases its replica count while simultaneously scaling down the old ReplicaSet. This process is the RollingUpdate strategy.
maxUnavailable: 1 means “during the update, at most 1 pod can be down,” and maxSurge: 1 means “up to 1 extra pod beyond the desired count can exist temporarily.” So with a base of 3 pods, at least 2 must always be in a serviceable state, and at most 4 can temporarily exist. These constraints are what create zero downtime.
A key point is that readiness probes must be properly configured (readiness probe — a health check that verifies whether a pod is ready to receive traffic; until it passes, the pod is excluded from the Service). New pods won’t trigger the removal of old pods until they pass readiness. This prevents the accident of traffic hitting a pod that’s still initializing. Without readiness, you’ll often see error rates spike momentarily during rolling updates.
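As a sketch, a readiness probe added to the container spec of the web Deployment above might look like this (the /healthz path is an assumption; point it at whatever health endpoint your app actually serves):

```yaml
readinessProbe:
  httpGet:
    path: /healthz          # assumed health endpoint
    port: 8080
  initialDelaySeconds: 5    # wait before the first check
  periodSeconds: 10         # re-check every 10 seconds
  failureThreshold: 3       # 3 consecutive failures -> marked NotReady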
Rollback: Reverting in One Command
If something seems off after a deployment, the Deployment doesn’t delete the previous ReplicaSet — it just scales it down to replicas=0. This allows you to revert with a single command:
kubectl rollout undo deployment/web # Revert to previous version
kubectl rollout undo deployment/web --to-revision=3 # Revert to a specific revision
You can also check the revision history:
kubectl rollout history deployment/web
kubectl rollout history deployment/web --revision=3
How many revisions to keep is controlled by spec.revisionHistoryLimit. The default is 10, but many teams lower it to 3-5 to reduce the load on etcd.
When monitoring rollout status, kubectl rollout status deployment/web is useful. It blocks until the update completes, making it great for CI pipelines.
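For example, lowering the retained history is a one-line addition to the Deployment spec (3 here is just an illustrative value):

```yaml
spec:
  revisionHistoryLimit: 3   # keep only the 3 most recent old ReplicaSets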
StatefulSet: For Pods That Need Their Own Identity
Deployments assume all pods are identical. Whether a pod is named web-abc or web-xyz doesn’t matter. But some workloads require each pod to have a unique identity. The classic example is database clusters.
Systems like MySQL, MongoDB, Kafka, and Elasticsearch require:
- Stable names: Sequential, predictable names like mysql-0, mysql-1, mysql-2
- Stable storage: Even after a pod restarts, it continues using its own volume. mysql-0 always uses the data-mysql-0 volume
- Ordered deployment: Pod 0 must be up before pod 1, pod 1 before pod 2. Needed for things like setting up replication from a primary
The StatefulSet fulfills these requirements.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: "mysql"
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
This StatefulSet creates mysql-0, mysql-1, mysql-2 pods in order, and connects data-mysql-0, data-mysql-1, data-mysql-2 volumes to each pod one-to-one. Even if mysql-1 dies and comes back, it uses the same name and same volume.
Operationally, StatefulSets are more demanding than Deployments. Rolling updates proceed one at a time in order, and there are many considerations when scaling up or down. The general rule is: Deployment for stateless apps, StatefulSet for stateful apps.
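One detail worth noting: serviceName: "mysql" in the manifest above refers to a headless Service, which you must create yourself. It is what gives each pod a stable DNS name like mysql-0.mysql. A minimal sketch:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  clusterIP: None     # headless: no virtual IP; DNS resolves to individual pod IPs
  selector:
    app: mysql
  ports:
  - port: 3306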
Let’s visualize the structure a StatefulSet creates. The key is that each pod has a unique name and a one-to-one PVC (PersistentVolumeClaim — a resource where a pod requests “I need this much persistent storage.” Covered in detail in Part 8):
flowchart TB
SS["StatefulSet: mysql\n(replicas: 3)"]
P0["Pod: mysql-0"]
P1["Pod: mysql-1"]
P2["Pod: mysql-2"]
V0[("PVC: data-mysql-0")]
V1[("PVC: data-mysql-1")]
V2[("PVC: data-mysql-2")]
SS -->|Created in order| P0
SS -->|After P0 Ready| P1
SS -->|After P1 Ready| P2
P0 -.->|Fixed binding| V0
P1 -.->|Fixed binding| V1
P2 -.->|Fixed binding| V2
DaemonSet: One Per Node
Some pods “must run exactly one on every node.” Typical examples include:
- Log collectors (Fluent Bit, Filebeat): Need to be on every node to read log files
- Monitoring agents (Node Exporter): Need to collect metrics from each node
- Network plugins (CNI agents): Manage network configuration on nodes
If you tried this with a Deployment, pods might not be evenly distributed across nodes. That’s why there’s a separate controller called DaemonSet.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      hostNetwork: true
      containers:
      - name: node-exporter
        image: prom/node-exporter:latest
        ports:
        - containerPort: 9100
          hostPort: 9100
When a DaemonSet is deployed, one of these pods appears on every node in the cluster. When a new node is added, a pod automatically appears on it. When a node is removed, its pod disappears with it. Kubernetes maintains a 1:1 relationship per node.
Let’s compare in a diagram the difference between how a Deployment distributes pods arbitrarily across nodes versus how a DaemonSet places exactly one per node:
flowchart LR
subgraph DEP["Deployment (replicas: 2)"]
DN1["Node 1\n[pod-a, pod-b]"]
DN2["Node 2\n(empty)"]
DN3["Node 3\n(empty)"]
end
subgraph DS["DaemonSet"]
SN1["Node 1\n[node-exporter]"]
SN2["Node 2\n[node-exporter]"]
SN3["Node 3\n[node-exporter]"]
end
If you want to run on only a subset of nodes, you can restrict scheduling with spec.template.spec.nodeSelector or with taints and tolerations.
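For example, to restrict the node-exporter DaemonSet above to Linux nodes and let it also run on tainted control-plane nodes, the pod spec could include the following (the label and taint keys shown are the standard well-known ones):

```yaml
spec:
  nodeSelector:
    kubernetes.io/os: linux              # schedule only on Linux nodes
  tolerations:
  - key: node-role.kubernetes.io/control-plane
    operator: Exists
    effect: NoSchedule                   # tolerate the control-plane taint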
All Four in One Table
Summarizing the differences between the four controllers in a table makes the picture clearer:
| Controller | What It Manages | Pod Identity | Storage | Primary Use Case |
|---|---|---|---|---|
| ReplicaSet | Maintains N replicas | Anonymous | Shared or none | Used by Deployment internally |
| Deployment | ReplicaSet + rolling updates | Anonymous | Shared or none | Stateless apps (APIs, web) |
| StatefulSet | Ordered pod set | Unique names | Per-pod independent volumes | Databases, distributed storage |
| DaemonSet | 1 pod per node | Node-based | Node-local | Log/monitoring agents |
There are also controllers like Job (one-time batch) and CronJob (periodic batch), but at the beginner stage, it’s best to thoroughly understand these four first.
Choosing the Right One
In practice, the decision flow when choosing a controller roughly goes like this:
1. Do you need a pod on every node? -> DaemonSet
2. Does each pod need a unique identity and dedicated storage? -> StatefulSet
3. Is it a one-off task? -> Job / CronJob
4. Most other cases -> Deployment
Most applications fall into category 4. Push state to an external DB, keep pods as stateless processing units, and manage them with Deployments — that’s the Kubernetes-friendly baseline.
In the next part, we’ll look at Services and networking — what makes pods discoverable from the outside and from each other. We’ll cover why you shouldn’t use pod IPs directly, and what ClusterIP, NodePort, and LoadBalancer each solve.