Table of contents
- Pods Alone Aren’t Enough
- ReplicaSet: The Simplest Promise of N Pods
- Deployment: The Conductor Above ReplicaSets
- How Rolling Updates Achieve Zero Downtime
- Rollback: Reverting in One Command
- StatefulSet: For Pods That Need Their Own Identity
- DaemonSet: One Per Node
- All Four in One Table
- Choosing the Right One
Pods Alone Aren’t Enough
In Part 3, we ran a pod directly. But creating a pod by hand is something you almost never do in practice. The reason is simple: a pod created that way is gone for good when it dies.
Kubernetes’ real power comes from “controllers that manage pods declaratively.” We simply declare “always keep 3 nginx instances running,” and the controller maintains that state. When a pod dies, it revives it; when a deployment is updated, it gradually swaps old pods for new versions.
In this part, we’ll dissect the four most commonly used controllers. Each solves a different problem, so you need to be able to pick the right tool for the situation.
ReplicaSet: The Simplest Promise of N Pods
The most basic controller is the ReplicaSet. As the name suggests, it’s a “set of pod replicas.” A ReplicaSet guarantees exactly one thing: there are always N pods with the labels I manage.
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-rs
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
Creating this ReplicaSet brings up 3 pods with the app: nginx label. Even if you manually delete one, a new pod quickly fills its place.
kubectl apply -f rs.yaml
kubectl get pods -l app=nginx
kubectl delete pod <pod-name> # Delete one
kubectl get pods -l app=nginx # A new pod appears immediately
However, you rarely use ReplicaSets directly. They lack version update functionality. Even if you change the image tag from 1.25 to 1.26, the ReplicaSet leaves existing pods as-is and does nothing. Its judgment is simply “as long as there are 3, we’re good.”
Deployment: The Conductor Above ReplicaSets
The most used controller in practice is the Deployment. A Deployment is a higher-level controller that manages ReplicaSets and handles version updates and rollbacks.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: myapp:1.0
        ports:
        - containerPort: 8080
The hierarchy that a Deployment creates looks like this:
flowchart TB
D[Deployment: web]
RS1[ReplicaSet: web-abc123<br/>image: myapp:1.0]
P1[Pod 1]
P2[Pod 2]
P3[Pod 3]
D --> RS1
RS1 --> P1
RS1 --> P2
RS1 --> P3
Users only interact with the Deployment. The ReplicaSet and Pods are managed automatically by the Deployment.
How Rolling Updates Achieve Zero Downtime
Let’s upgrade the image version from 1.0 to 2.0:
kubectl set image deployment/web web=myapp:2.0
Or you can modify the YAML and apply it with kubectl apply -f. Let’s trace what happens after the Deployment detects the change:
sequenceDiagram
participant D as Deployment
participant OLD as ReplicaSet v1 (3)
participant NEW as ReplicaSet v2 (0)
D->>NEW: Create ReplicaSet (replicas=0)
D->>NEW: Scale up to replicas=1
Note over NEW: v2 pod 1 up, readiness passed
D->>OLD: Scale down to replicas=2
Note over OLD: 1 v1 pod removed
D->>NEW: Scale up to replicas=2
D->>OLD: Scale down to replicas=1
D->>NEW: Scale up to replicas=3
D->>OLD: Scale down to replicas=0
Note over OLD: All v1 pods removed
The Deployment creates a new ReplicaSet for the new version and gradually increases its replica count while simultaneously scaling down the old ReplicaSet. This process is the RollingUpdate strategy.
maxUnavailable: 1 means “during the update, at most 1 pod can be down,” and maxSurge: 1 means “up to 1 extra pod beyond the desired count can exist temporarily.” So with a base of 3 pods, at least 2 must always be in a serviceable state, and at most 4 can temporarily exist. These constraints are what create zero downtime.
A key point is that readiness probes must be properly configured (readiness probe — a health check that verifies whether a pod is ready to receive traffic; until it passes, the pod is excluded from the Service). New pods won’t trigger the removal of old pods until they pass readiness. This prevents the accident of traffic hitting a pod that’s still initializing. Without readiness, you’ll often see error rates spike momentarily during rolling updates.
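As a sketch, a readiness probe added to the container spec of the web Deployment above might look like this (the /healthz path is an assumption; point it at whatever health endpoint your app actually serves):

```yaml
readinessProbe:
  httpGet:
    path: /healthz          # assumed health endpoint
    port: 8080
  initialDelaySeconds: 5    # wait before the first check
  periodSeconds: 10         # re-check every 10 seconds
  failureThreshold: 3       # 3 consecutive failures -> marked NotReady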
Rollback: Reverting in One Command
If something seems off after a deployment, the Deployment doesn’t delete the previous ReplicaSet — it just scales it down to replicas=0. This allows you to revert with a single command:
kubectl rollout undo deployment/web # Revert to previous version
kubectl rollout undo deployment/web --to-revision=3 # Revert to a specific revision
You can also check the revision history:
kubectl rollout history deployment/web
kubectl rollout history deployment/web --revision=3
How many revisions to keep is controlled by spec.revisionHistoryLimit. The default is 10, but many teams lower it to 3-5 to reduce the load on etcd.
When monitoring rollout status, kubectl rollout status deployment/web is useful. It blocks until the update completes, making it great for CI pipelines.
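For example, lowering the retained history is a one-line addition to the Deployment spec (3 here is just an illustrative value):

```yaml
spec:
  revisionHistoryLimit: 3   # keep only the 3 most recent old ReplicaSets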
StatefulSet: For Pods That Need Their Own Identity
Deployments assume all pods are identical. Whether a pod is named web-abc or web-xyz doesn’t matter. But some workloads require each pod to have a unique identity. The classic example is database clusters.
Systems like MySQL, MongoDB, Kafka, and Elasticsearch require:
- Stable names: Sequential, predictable names like mysql-0, mysql-1, mysql-2
- Stable storage: Even after a pod restarts, it continues using its own volume. mysql-0 always uses the data-mysql-0 volume
- Ordered deployment: Pod 0 must be up before pod 1, pod 1 before pod 2. Needed for things like setting up replication from a primary
The StatefulSet fulfills these requirements.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: "mysql"
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
This StatefulSet creates mysql-0, mysql-1, mysql-2 pods in order, and connects data-mysql-0, data-mysql-1, data-mysql-2 volumes to each pod one-to-one. Even if mysql-1 dies and comes back, it uses the same name and same volume.
Operationally, StatefulSets are more demanding than Deployments. Rolling updates proceed one at a time in order, and there are many considerations when scaling up or down. The general rule is: Deployment for stateless apps, StatefulSet for stateful apps.
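One detail worth noting: serviceName: "mysql" in the manifest above refers to a headless Service, which you must create yourself. It is what gives each pod a stable DNS name like mysql-0.mysql. A minimal sketch:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  clusterIP: None     # headless: no virtual IP; DNS resolves to individual pod IPs
  selector:
    app: mysql
  ports:
  - port: 3306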
Let’s visualize the structure a StatefulSet creates. The key is that each pod has a unique name and a one-to-one PVC (PersistentVolumeClaim — a resource where a pod requests “I need this much persistent storage.” Covered in detail in Part 8):
flowchart TB
SS["StatefulSet: mysql\n(replicas: 3)"]
P0["Pod: mysql-0"]
P1["Pod: mysql-1"]
P2["Pod: mysql-2"]
V0[("PVC: data-mysql-0")]
V1[("PVC: data-mysql-1")]
V2[("PVC: data-mysql-2")]
SS -->|Created in order| P0
SS -->|After P0 Ready| P1
SS -->|After P1 Ready| P2
P0 -.->|Fixed binding| V0
P1 -.->|Fixed binding| V1
P2 -.->|Fixed binding| V2
DaemonSet: One Per Node
Some pods “must run exactly one on every node.” Typical examples include:
- Log collectors (Fluent Bit, Filebeat): Need to be on every node to read log files
- Monitoring agents (Node Exporter): Need to collect metrics from each node
- Network plugins (CNI agents): Manage network configuration on nodes
If you tried this with a Deployment, pods might not be evenly distributed across nodes. That’s why there’s a separate controller called DaemonSet.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      hostNetwork: true
      containers:
      - name: node-exporter
        image: prom/node-exporter:latest
        ports:
        - containerPort: 9100
          hostPort: 9100
When a DaemonSet is deployed, one of these pods appears on every node in the cluster. When a new node is added, a pod automatically appears on it. When a node is removed, its pod disappears with it. Kubernetes maintains a 1:1 relationship per node.
Let’s compare in a diagram the difference between how a Deployment distributes pods arbitrarily across nodes versus how a DaemonSet places exactly one per node:
flowchart LR
subgraph DEP["Deployment (replicas: 2)"]
DN1["Node 1\n[pod-a, pod-b]"]
DN2["Node 2\n(empty)"]
DN3["Node 3\n(empty)"]
end
subgraph DS["DaemonSet"]
SN1["Node 1\n[node-exporter]"]
SN2["Node 2\n[node-exporter]"]
SN3["Node 3\n[node-exporter]"]
end
If you want to run on only a subset of nodes, you can restrict scheduling with spec.template.spec.nodeSelector or with taints and tolerations.
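For example, to restrict the node-exporter DaemonSet above to Linux nodes and let it also run on tainted control-plane nodes, the pod spec could include the following (the label and taint keys shown are the standard well-known ones):

```yaml
spec:
  nodeSelector:
    kubernetes.io/os: linux              # schedule only on Linux nodes
  tolerations:
  - key: node-role.kubernetes.io/control-plane
    operator: Exists
    effect: NoSchedule                   # tolerate the control-plane taint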
All Four in One Table
Summarizing the differences between the four controllers in a table makes the picture clearer:
| Controller | What It Manages | Pod Identity | Storage | Primary Use Case |
|---|---|---|---|---|
| ReplicaSet | Maintains N replicas | Anonymous | Shared or none | Used by Deployment internally |
| Deployment | ReplicaSet + rolling updates | Anonymous | Shared or none | Stateless apps (APIs, web) |
| StatefulSet | Ordered pod set | Unique names | Per-pod independent volumes | Databases, distributed storage |
| DaemonSet | 1 pod per node | Node-based | Node-local | Log/monitoring agents |
There are also controllers like Job (one-time batch) and CronJob (periodic batch), but at the beginner stage, it’s best to thoroughly understand these four first.
Choosing the Right One
In practice, the decision flow when choosing a controller roughly goes like this:
1. Do you need a pod on every node? -> DaemonSet
2. Does each pod need a unique identity and dedicated storage? -> StatefulSet
3. Is it a one-off task? -> Job / CronJob
4. Most other cases -> Deployment
Most applications fall into category 4. Push state to an external DB, keep pods as stateless processing units, and manage them with Deployments — that’s the Kubernetes-friendly baseline.
In the next part, we’ll look at Services and networking — what makes pods discoverable from the outside and from each other. We’ll cover why you shouldn’t use pod IPs directly, and what ClusterIP, NodePort, and LoadBalancer each solve.