Table of contents
- Pods Disappear
- Why Separate PV and PVC
- Static Provisioning — Admin Prepares in Advance
- Dynamic Provisioning — Created Automatically When Needed
- Storage Binding Flow
- Access Modes — ReadWriteOnce and Friends
- Reclaim Policy — What Happens on Deletion
- The PVC Pattern Created by StatefulSets
- Volume Expansion
- CSI — The Common Language for Storage Plugins
- Lab — Running Nginx with Dynamic Provisioning
## Pods Disappear
Everyone gets surprised at some point when starting to learn Kubernetes. You create a file inside a pod, restart the pod, and the file is gone. Container filesystems are disposable by default.
If the application is truly stateless, there’s no problem. But reality is different. Databases need to write data to disk, upload servers need to keep files, and cache servers want to maintain their warmed-up state. Data must survive even when pods die.
The abstraction Kubernetes created to solve this problem is PersistentVolume (PV) and PersistentVolumeClaim (PVC). The names are long and intimidating at first, but the structure is simple.
## Why Separate PV and PVC
It might seem easier to connect storage directly to a pod, but Kubernetes deliberately split it into two steps. Why?
The key is separation of concerns. The roles of “the side that creates storage” and “the side that uses it” are different.
- PV (PersistentVolume): Actual storage prepared by the cluster administrator. An abstraction that lets Kubernetes handle physical resources like AWS EBS volumes, NFS shared directories, or local disks
- PVC (PersistentVolumeClaim): A request from an application developer saying “I need 20Gi of read-write storage”
When you create a PVC, Kubernetes finds a PV that matches the conditions and binds them. Since pods only look at the PVC, they don’t need to care whether the underlying storage is EBS or NFS.
```mermaid
flowchart LR
    A[Cluster Admin] -->|Prepare| B[PV<br/>50Gi EBS]
    C[Developer] -->|Request| D[PVC<br/>20Gi request]
    D -->|Bind| B
    E[Pod] -->|Mount| D
```
Thanks to this separation, application manifests stay the same even when infrastructure changes. Moving from on-premises to AWS doesn’t change the PVC reference in your pod spec.
## Static Provisioning — Admin Prepares in Advance
This is the most basic approach. The admin creates PVs beforehand, and developers request them via PVCs.
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-manual
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  hostPath:
    path: /mnt/data
```
This declares a local directory (/mnt/data) as a 10Gi PV. hostPath is for demo purposes; in production you’d use NFS or cloud volumes.
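For comparison, here is a sketch of the same PV backed by an NFS share, the kind of backend you might actually use with static provisioning. The server address and export path are placeholders, not values from a real cluster:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany        # NFS allows concurrent access from multiple nodes
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  nfs:
    server: 10.0.0.5       # placeholder NFS server address
    path: /exports/data    # placeholder export path
```

Only the backend stanza (`hostPath` vs `nfs`) changes; the rest of the PV spec, and every PVC that consumes it, stays the same.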
Next, request this PV with a PVC:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-manual
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: manual
```
When you deploy the PVC, Kubernetes looks for a PV matching the conditions (capacity >= 5Gi, matching access mode, matching StorageClass). The pv-manual we just created fits, so it gets bound. Running kubectl get pvc shows the status changed to Bound.
Finally, mount the PVC to a pod:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: nginx:1.27
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: pvc-manual
```
Now data written to /usr/share/nginx/html is stored in the PV. Data survives even when the pod dies.
## Dynamic Provisioning — Created Automatically When Needed
Static provisioning is clear but inconvenient. Every time a dev team creates a PVC, an admin must manually prepare a PV — that doesn’t scale. So what’s used almost exclusively in practice is dynamic provisioning.
The key is a resource called StorageClass. It’s a template that says “when this kind of storage is needed, ask this provisioner to create it.”
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
  encrypted: "true"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```
This is a StorageClass for creating gp3 EBS volumes on AWS. The provisioner is the CSI driver name that actually creates volumes, and parameters are the settings passed to that driver.
Now you just specify storageClassName: fast-ssd in the PVC:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-dynamic
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: fast-ssd
```
No need to create a PV in advance. The moment the PVC is created, the provisioner referenced by the StorageClass is asked to “create a 20Gi EBS volume.” The returned volume is registered as a PV and bound to the PVC. From the developer’s perspective, just creating a PVC produces a disk.
Dynamic provisioning is set up as the default in most managed Kubernetes (EKS, GKE, AKS). If you don’t specify a storageClassName, the cluster’s default StorageClass is used.
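A default StorageClass is simply one carrying a well-known annotation. A sketch of what that looks like (the class name and provisioner vary by cluster; `standard` here is illustrative):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard            # name differs per cluster
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com
```

Running kubectl get storageclass shows which class is the default — it is flagged with `(default)` next to its name.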
## Storage Binding Flow
Let’s lay out how PV, PVC, and Pod mesh together:
```mermaid
sequenceDiagram
    participant D as Developer
    participant K as kube-apiserver
    participant C as External Provisioner<br/>(CSI Driver)
    participant S as Storage Backend<br/>(EBS, NFS, ...)
    participant P as Pod
    D->>K: Create PVC
    K->>C: "Need volume for this StorageClass"
    C->>S: Create actual volume
    S-->>C: Volume ID
    C->>K: Register PV
    K->>K: Bind PV <-> PVC
    D->>K: Create Pod (referencing PVC)
    K->>P: Schedule Pod
    P->>S: Mount volume
```
One important point: if the StorageClass’s volumeBindingMode is WaitForFirstConsumer, the actual volume isn’t created when you just create the PVC. Only when a pod gets scheduled does the request go out: “create a volume in this node’s AZ.” This design aligns the availability zone between node and volume. In contrast, Immediate creates the volume as soon as the PVC is created.
## Access Modes — ReadWriteOnce and Friends
The accessModes you must specify when declaring PV and PVC come in four varieties:
| Mode | Abbreviation | Meaning |
|---|---|---|
| ReadWriteOnce | RWO | Read-write from a single node |
| ReadOnlyMany | ROX | Read-only from multiple nodes |
| ReadWriteMany | RWX | Read-write from multiple nodes |
| ReadWriteOncePod | RWOP | Read-write from a single Pod (GA in v1.29) |
Many people misunderstand ReadWriteOnce as “only one pod can use it.” It actually means only one node can use it. Multiple pods scheduled on the same node can mount it simultaneously. For truly single-pod access, use ReadWriteOncePod.
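A PVC requesting truly exclusive access looks like the sketch below (the claim name is illustrative). Note that ReadWriteOncePod requires a CSI driver that supports it:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-exclusive
spec:
  accessModes:
    - ReadWriteOncePod   # a second pod is rejected even on the same node
  resources:
    requests:
      storage: 10Gi
```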
Access mode selection depends on what the storage backend supports:
- Block storage (EBS, GCE PD, Azure Disk): Only supports RWO. Single node only
- File storage (EFS, NFS, Azure Files): Supports RWX. Concurrent access from multiple nodes
- Object storage (S3): Rarely used as PV; apps typically access directly via SDK
Situations requiring RWX are rarer than you’d think. Most production databases (PostgreSQL, MySQL, Redis) run with RWO. Only consider RWX when file sharing is genuinely needed (legacy upload servers like WordPress, ML dataset sharing).
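When you do need RWX, the claim itself is unremarkable — what matters is pointing it at a file-storage-backed class. A sketch, assuming the cluster has an EFS CSI driver exposed through a StorageClass named `efs-sc` (the name is an assumption, not a built-in):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-uploads
spec:
  accessModes:
    - ReadWriteMany      # multiple nodes mount the same filesystem
  resources:
    requests:
      storage: 5Gi       # EFS is elastic and ignores the size, but the field is required
  storageClassName: efs-sc
```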
## Reclaim Policy — What Happens on Deletion
When you delete a PVC, what happens to the connected PV and actual storage? The persistentVolumeReclaimPolicy decides:
- Retain: Keeps the PV and actual data. Requires manual cleanup by an admin, but prevents accidental data loss
- Delete: Deletes both the PV and actual storage. The default for dynamic provisioning
- Recycle: Basic scrub (`rm -rf /thevolume/*`) followed by reuse. Now deprecated — don’t use it
For PVCs backing important data like production databases, defaulting to Retain is the safe choice. If you accidentally run kubectl delete pvc, at least “the PV is still there” saves you.
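The reclaim policy of an already-bound PV can also be flipped after the fact with a merge patch. A sketch (the file name is illustrative):

```yaml
# retain-patch.yaml — switch an existing PV's reclaim policy to Retain
spec:
  persistentVolumeReclaimPolicy: Retain
```

Applied with something like `kubectl patch pv <pv-name> --patch-file retain-patch.yaml`, this is a quick safety measure before risky operations on volumes that were dynamically provisioned with the Delete default.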
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: retain-ssd
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
reclaimPolicy: Retain  # Volume preserved even on deletion
```
## The PVC Pattern Created by StatefulSets
So far we’ve been creating PVCs directly. But for cases like databases where each replica needs its own volume, you use StatefulSet’s volumeClaimTemplates:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres-headless
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 100Gi
```
With replicas: 3, PVCs data-postgres-0, data-postgres-1, and data-postgres-2 are automatically created. Each pod mounts only the PVC matching its name. Even if a pod restarts or moves to another node, it finds its own volume again. This is the core feature for stateful workloads.
## Volume Expansion
In production, you’ll inevitably run into volumes running out of space. Kubernetes allows PVC size increases for StorageClasses with allowVolumeExpansion: true:
```bash
# Edit the existing PVC to increase the storage request
kubectl edit pvc pvc-dynamic
# spec.resources.requests.storage: 20Gi -> 50Gi
```
If the CSI driver supports online expansion, it grows without a pod restart. However, shrinking is not supported. Once you increase, you can’t go back, so expand carefully.
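If you prefer something scriptable over an interactive kubectl edit, the same change can be expressed as a merge patch (sketch; the file name is illustrative):

```yaml
# expand-patch.yaml — raise the storage request on the PVC,
# applied with: kubectl patch pvc pvc-dynamic --patch-file expand-patch.yaml
spec:
  resources:
    requests:
      storage: 50Gi
```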
## CSI — The Common Language for Storage Plugins
In the past, storage drivers were built into Kubernetes (in-tree drivers). AWS EBS, GCE PD, Ceph — all were embedded in the Kubernetes codebase. The problem was that adding new storage required updating Kubernetes itself.
That’s why CSI (Container Storage Interface) was introduced. Storage vendors only need to implement the CSI spec to add new storage without touching the Kubernetes core. Today, virtually all storage is provided as CSI drivers.
```mermaid
flowchart TB
    A[kube-apiserver] --> B[External Provisioner]
    A --> C[External Attacher]
    A --> D[External Resizer]
    B --> E[CSI Driver<br/>Vendor implementation]
    C --> E
    D --> E
    E --> F[Storage Backend]
```
Sidecar containers (external-provisioner, external-attacher, external-resizer) watch the Kubernetes API for relevant events and delegate the actual work to the CSI driver. Running kubectl get pods -n kube-system | grep csi shows these components running.
## Lab — Running Nginx with Dynamic Provisioning
Let’s try this assuming the cluster has a default StorageClass. minikube, kind, EKS, and GKE all come with a default SC.
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: web-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx:1.27
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: web-data
```
Deploy and check the status:
```bash
kubectl apply -f web.yaml
kubectl get pvc
# NAME       STATUS   VOLUME           CAPACITY
# web-data   Bound    pvc-abc123-...   1Gi

kubectl exec -it deploy/web -- bash -c 'echo "<h1>Hello persistent</h1>" > /usr/share/nginx/html/index.html'
kubectl delete pod -l app=web   # Delete only the pod

# Verify data in the recreated pod
kubectl exec -it deploy/web -- cat /usr/share/nginx/html/index.html
# <h1>Hello persistent</h1>
```
The data survives even after the pod dies and is reborn. This confirms that storage is completely decoupled from the pod’s lifecycle.
In the next part, we tackle the question of how much CPU and memory to give a pod. We’ll look at how requests, limits, and QoS classes affect scheduling and OOM kills, and how to set up autoscaling with HPA.