Kubernetes: PersistentVolume and PersistentVolumeClaim — an overview with examples
Arseny Zinchenko
Posted on August 5, 2020
For persistent data, Kubernetes provides two main types of objects: the PersistentVolume and the PersistentVolumeClaim.
A PersistentVolume (PV) is a storage device with a filesystem volume on it, for example an AWS EBS volume attached to an AWS EC2 instance. From the cluster's point of view, a PersistentVolume is a resource similar to, say, a Kubernetes Worker Node.
A PersistentVolumeClaim (PVC), in turn, is a request to use such a PersistentVolume and is similar to a Kubernetes Pod: just as a Pod requests CPU and memory from a Worker Node, a PersistentVolumeClaim requests a storage size and an access mode (ReadWriteOnce, ReadOnlyMany, or ReadWriteMany, see the AccessModes) from a PersistentVolume.
A PersistentVolume can be created in two ways: statically or dynamically (the recommended way).
When creating a PV statically, you first have to create a storage device, for example an AWS EBS volume, which the PersistentVolume will then use.
If the cluster cannot find an appropriate PV for a PersistentVolumeClaim, it can create a new storage device specifically for this PVC. This is dynamic PV provisioning.
For this to work, the PVC has to reference a StorageClass, and this class has to be supported by the cluster.
For example, for the AWS EKS, we have the gp2 StorageClass:
$ kubectl get storageclass
NAME PROVISIONER AGE
gp2 (default) kubernetes.io/aws-ebs 64d
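As an illustration of how a claim ties itself to a class, the fragment below is a minimal sketch of a PVC that names the gp2 class explicitly instead of relying on the cluster default. The claim name pvc-explicit-class is hypothetical and is not used elsewhere in this post.

```yaml
# Hypothetical claim, for illustration only.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-explicit-class   # assumed name, not part of this walkthrough
spec:
  storageClassName: gp2      # must name a StorageClass that exists in the cluster
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```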
Storage types
For a better understanding of the PersistentVolume concept, let's look at the available storage types:
- Node-local storage (emptyDir and hostPath)
- Cloud volumes (for example, awsElasticBlockStore, gcePersistentDisk, and azureDisk)
- File-sharing volumes, such as Network File System
- Distributed file systems (for example, CephFS, RBD, and GlusterFS)
- Special types such as PersistentVolumeClaim, secret, and gitRepo
emptyDir and hostPath are attached to pods directly: an emptyDir stores data only while its pod is alive, and hostPath data stays on one node's filesystem, so a pod rescheduled to another node will not see it. Cloud volumes, NFS, and PersistentVolumes are independent of pods and keep data until the volume itself is deleted.
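For contrast with the persistent volumes used in the rest of this post, here is a minimal sketch of a pod with an emptyDir volume. The pod name, volume name, and mount path are hypothetical. Anything written to the mount is removed together with the pod.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: emptydir-pod          # hypothetical name
spec:
  volumes:
    - name: scratch           # hypothetical volume name
      emptyDir: {}            # created empty when the pod is assigned to a node
  containers:
    - name: nginx
      image: nginx
      volumeMounts:
        - mountPath: /cache   # hypothetical mount path
          name: scratch
```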
Create a PersistentVolumeClaim
Static PersistentVolume provisioning
Create an EBS
For static provisioning, we first need to create a storage device, in this case an AWS EBS volume, and then create a PersistentVolume that uses this EBS.
Create an EBS:
$ aws ec2 --profile arseniy --region us-east-2 create-volume --availability-zone us-east-2a --size 50
{
    "AvailabilityZone": "us-east-2a",
    "CreateTime": "2020-07-29T13:10:12.000Z",
    "Encrypted": false,
    "Size": 50,
    "SnapshotId": "",
    "State": "creating",
    "VolumeId": "vol-0928650905a2491e2",
    "Iops": 150,
    "Tags": [],
    "VolumeType": "gp2"
}
Store its ID — “vol-0928650905a2491e2”.
Create a PersistentVolume
Write a manifest file, let's call it pv-static.yaml:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-static
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2
  awsElasticBlockStore:
    fsType: ext4
    volumeID: vol-0928650905a2491e2
Here:
- capacity: the storage size
- accessModes: the access type; here it is ReadWriteOnce, which means that this PV can be attached to only one Worker Node at a time
- storageClassName: the storage class, see below
- awsElasticBlockStore: the device type used
  - fsType: the filesystem type to be created on this volume
  - volumeID: the AWS EBS disk ID
Create the PersistentVolume:
$ kubectl apply -f pv-static.yaml
persistentvolume/pv-static created
Check it:
$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pv-static 5Gi RWO Retain Available 69s
StorageClass
The storageClassName parameter sets the storage class.
Both the PVC and the PV must have the same class. Otherwise, the PVC will not find the PV, and the STATUS of such a PVC will be Pending.
If a PVC has no StorageClass set, then the default one will be used:
$ kubectl get storageclass -o wide
NAME PROVISIONER AGE
gp2 (default) kubernetes.io/aws-ebs 65d
Meanwhile, if the StorageClass is not set for a PV, this PV will be created without a class, and our PVC with the default class will not be able to use it, failing with the "Cannot bind to requested volume "pvname": storageClassName does not match" error:
…
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning VolumeMismatch 12s (x17 over 4m2s) persistentvolume-controller Cannot bind to requested volume "pvname": storageClassName does not match
…
See documentation here>>> and here>>>.
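One way around the mismatch described above, sketched here as an assumption rather than as the author's setup, is to make the claim ask explicitly for a PV without a class by setting storageClassName to an empty string. The claim name pvc-no-class is hypothetical, and pvname stands for the class-less PV from the error message.

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-no-class          # hypothetical name
spec:
  storageClassName: ""        # empty string: bind only to a PV without a class, do not apply the default class
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  volumeName: pvname          # the class-less PV from the error above
```

With an empty class the claim is not dynamically provisioned, so a matching PV has to exist already.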
Create a PersistentVolumeClaim
Now we can create a PersistentVolumeClaim that will use the PersistentVolume created above. Save it to the pvc-static.yaml file:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-static
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  volumeName: pv-static
Create this PVC:
$ kubectl apply -f pvc-static.yaml
persistentvolumeclaim/pvc-static created
Check it:
$ kubectl get pvc pvc-static
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pvc-static Bound pv-static 5Gi RWO gp2 31s
Dynamic PersistentVolume provisioning
Dynamic provisioning is similar to the static one, with the only difference that you don't need to create the AWS EBS and PersistentVolume resources manually. Instead, you just create a PersistentVolumeClaim object, and Kubernetes creates an EBS via the AWS API and attaches it to the AWS EC2 instance that acts as a Worker Node in the Kubernetes cluster:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-dynamic
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
Create this PVC:
$ kubectl apply -f pvc-dynamic.yaml
persistentvolumeclaim/pvc-dynamic created
Check it:
$ kubectl get pvc pvc-dynamic
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pvc-dynamic Pending gp2 45s
Okay, but why is it in the Pending STATUS? Check its Events:
$ kubectl describe pvc pvc-dynamic
…
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal WaitForFirstConsumer 1s (x4 over 33s) persistentvolume-controller waiting for first consumer to be created before binding
Mounted By: <none>
WaitForFirstConsumer
Let’s see our default StorageClass's setting:
$ kubectl describe sc gp2
Name: gp2
IsDefaultClass: Yes
…
Provisioner: kubernetes.io/aws-ebs
Parameters: fsType=ext4,type=gp2
…
VolumeBindingMode: WaitForFirstConsumer
Events: <none>
Here, the VolumeBindingMode defines when exactly a PersistentVolume will be created. With the Immediate value, a PV is created as soon as the requesting PVC appears. With WaitForFirstConsumer, as in this case, Kubernetes waits for the first consumer, such as a pod that requests this PVC, and then, depending on the AvailabilityZone of the Worker Node where this pod is scheduled, creates a new PV and an AWS EBS disk.
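To make the two binding modes concrete, here is a minimal sketch of a StorageClass that provisions immediately. The provisioner and parameters mirror the gp2 class shown above; the class name gp2-immediate is hypothetical and does not exist in the cluster by default.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-immediate         # hypothetical name
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  fsType: ext4
volumeBindingMode: Immediate  # provision the volume as soon as the PVC is created
```

With Immediate, the EBS is created without knowledge of where the consuming pod will be scheduled, so the pod may later be restricted to nodes in the volume's AvailabilityZone. WaitForFirstConsumer avoids this by delaying provisioning until the pod is scheduled.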
Now, let’s create pods to consume those volumes.
Using PersistentVolumeClaim in Pods
Dynamic PersistentVolumeClaim
Let’s describe a pod which will use our dynamic PVC:
apiVersion: v1
kind: Pod
metadata:
  name: pv-dynamic-pod
spec:
  volumes:
    - name: pv-dynamic-storage
      persistentVolumeClaim:
        claimName: pvc-dynamic
  containers:
    - name: pv-dynamic-container
      image: nginx
      ports:
        - containerPort: 80
          name: "nginx"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: pv-dynamic-storage
Here:
- volumes:
  - persistentVolumeClaim:
    - claimName: the name of the PVC that will be requested when the pod is created
- containers:
  - volumeMounts: mount the pv-dynamic-storage volume to the /usr/share/nginx/html directory in the pod
Create it:
$ kubectl apply -f pv-pods.yaml
pod/pv-dynamic-pod created
Check again our PVC:
$ kubectl get pvc pvc-dynamic
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pvc-dynamic Bound pvc-6d024b40-a239-4c35-8694-f060bd117053 5Gi RWO gp2 21h
Now we can see a new volume with the ID pvc-6d024b40-a239-4c35-8694-f060bd117053. Check the PVC:
$ kubectl describe pvc pvc-dynamic
Name: pvc-dynamic
Namespace: default
StorageClass: gp2
Status: Bound
Volume: pvc-6d024b40-a239-4c35-8694-f060bd117053
…
Finalizers: [kubernetes.io/pvc-protection]
Capacity: 5Gi
Access Modes: RWO
VolumeMode: Filesystem
Events: <none>
Mounted By: pv-dynamic-pod
Check that volume:
$ kubectl describe pv pvc-6d024b40-a239-4c35-8694-f060bd117053
Name: pvc-6d024b40-a239-4c35-8694-f060bd117053
…
StorageClass: gp2
Status: Bound
Claim: default/pvc-dynamic
Reclaim Policy: Delete
Access Modes: RWO
VolumeMode: Filesystem
Capacity: 5Gi
Node Affinity:
Required Terms:
Term 0: failure-domain.beta.kubernetes.io/zone in [us-east-2b]
failure-domain.beta.kubernetes.io/region in [us-east-2]
Message:
Source:
Type: AWSElasticBlockStore (a Persistent Disk resource in AWS)
VolumeID: aws://us-east-2b/vol-040a5e004876f1a40
FSType: ext4
Partition: 0
ReadOnly: false
Events: <none>
And the AWS EBS vol-040a5e004876f1a40:
$ aws ec2 --profile arseniy --region us-east-2 describe-volumes --volume-ids vol-040a5e004876f1a40 --output json
{
    "Volumes": [
        {
            "Attachments": [
                {
                    "AttachTime": "2020-07-30T11:08:29.000Z",
                    "Device": "/dev/xvdcy",
                    "InstanceId": "i-0a3225e9fe7cb7629",
                    "State": "attached",
                    "VolumeId": "vol-040a5e004876f1a40",
                    "DeleteOnTermination": false
                }
            ],
…
Check inside of the pod:
$ kubectl exec -ti pv-dynamic-pod -- bash
root@pv-dynamic-pod:/# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme0n1 259:0 0 50G 0 disk
|-nvme0n1p1 259:1 0 50G 0 part /etc/hosts
`-nvme0n1p128 259:2 0 1M 0 part
nvme1n1 259:3 0 5G 0 disk /usr/share/nginx/html
nvme1n1 is our volume.
Let's write some data:
root@pv-dynamic-pod:/# echo Test > /usr/share/nginx/html/index.html
Drop the pod:
$ kubectl delete pod pv-dynamic-pod
pod "pv-dynamic-pod" deleted
Re-create it:
$ kubectl apply -f pv-pods.yaml
pod/pv-dynamic-pod created
Check the data:
$ kubectl exec -ti pv-dynamic-pod -- cat /usr/share/nginx/html/index.html
Test
Everything is still in its place.
Static PersistentVolumeClaim
Now, let’s try to use our statically created PV.
We can use the same manifest, pv-static.yaml:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-static
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2
  awsElasticBlockStore:
    fsType: ext4
    volumeID: vol-0928650905a2491e2
And let's use the pvc-static.yaml manifest for our PVC:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-static
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  volumeName: pv-static
Create the PV:
$ kubectl apply -f pv-static.yaml
persistentvolume/pv-static created
Check it:
$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pv-static 5Gi RWO Retain Available gp2 58s
…
Create the PVC:
$ kubectl apply -f pvc-static.yaml
persistentvolumeclaim/pvc-static created
Check it:
$ kubectl get pvc pvc-static
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pvc-static Bound pv-static 5Gi RWO gp2 9s
The Bound STATUS means that the PVC found its PV and was successfully bound to it.
Pod nodeAffinity
Next, we need to determine the AWS AvailabilityZone in which the AWS EBS for the static PV was created:
$ aws ec2 --profile arseniy --region us-east-2 describe-volumes --volume-ids vol-0928650905a2491e2 --query '[Volumes[*].AvailabilityZone]' --output text
us-east-2a
It is us-east-2a, so we need to create the pod on a Kubernetes Worker Node in the same AvailabilityZone.
Create a manifest:
apiVersion: v1
kind: Pod
metadata:
  name: pv-static-pod
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: failure-domain.beta.kubernetes.io/zone
                operator: In
                values:
                  - us-east-2a
  volumes:
    - name: pv-static-storage
      persistentVolumeClaim:
        claimName: pvc-static
  containers:
    - name: pv-static-container
      image: nginx
      ports:
        - containerPort: 80
          name: "nginx"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: pv-static-storage
As opposed to the dynamic PVC, here we've used nodeAffinity to specify that we want a node from the us-east-2a AZ.
Create that pod:
$ kubectl apply -f pv-pod-stat.yaml
pod/pv-static-pod created
Check events:
0s Normal Scheduled Pod Successfully assigned default/pv-static-pod to ip-10-3-47-58.us-east-2.compute.internal
0s Normal SuccessfulAttachVolume Pod AttachVolume.Attach succeeded for volume "pv-static"
0s Normal Pulling Pod Pulling image "nginx"
0s Normal Pulled Pod Successfully pulled image "nginx"
0s Normal Created Pod Created container pv-static-container
0s Normal Started Pod Started container pv-static-container
Block devices in the pod:
$ kubectl exec -ti pv-static-pod -- bash
root@pv-static-pod:/# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme0n1 259:0 0 50G 0 disk
|-nvme0n1p1 259:1 0 50G 0 part /etc/hosts
`-nvme0n1p128 259:2 0 1M 0 part
nvme1n1 259:3 0 50G 0 disk /usr/share/nginx/html
nvme1n1 is mounted, all works.
PersistentVolume nodeAffinity
Another option is to set nodeAffinity on the PersistentVolume.
In this case, when creating a pod that uses this PV, Kubernetes first checks which Worker Nodes the volume can be attached to and then schedules the pod on such a node.
In the pod's manifest, delete the nodeAffinity:
apiVersion: v1
kind: Pod
metadata:
  name: pv-static-pod
spec:
  volumes:
    - name: pv-static-storage
      persistentVolumeClaim:
        claimName: pvc-static
  containers:
    - name: pv-static-container
      image: nginx
      ports:
        - containerPort: 80
          name: "nginx"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: pv-static-storage
And add to the PV’s manifest:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-static
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: failure-domain.beta.kubernetes.io/zone
              operator: In
              values:
                - us-east-2a
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2
  awsElasticBlockStore:
    fsType: ext4
    volumeID: vol-0928650905a2491e2
Create this PV:
$ kubectl apply -f pv-static.yaml
persistentvolume/pv-static created
Create its PVC, nothing was changed here:
$ kubectl apply -f pvc-static.yaml
persistentvolumeclaim/pvc-static created
Create the pod:
$ kubectl apply -f pv-pod-stat.yaml
pod/pv-static-pod created
Check events:
0s Normal Scheduled Pod Successfully assigned default/pv-static-pod to ip-10-3-47-58.us-east-2.compute.internal
0s Normal SuccessfulAttachVolume Pod AttachVolume.Attach succeeded for volume "pv-static"
0s Normal Pulling Pod Pulling image "nginx"
0s Normal Pulled Pod Successfully pulled image "nginx"
0s Normal Created Pod Created container pv-static-container
0s Normal Started Pod Started container pv-static-container
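A note on the label: failure-domain.beta.kubernetes.io/zone was deprecated in Kubernetes 1.17 in favor of topology.kubernetes.io/zone. The fragment below is a sketch of the same PersistentVolume nodeAffinity with the newer key, assuming the cluster's nodes carry that label. It has not been run against the cluster used in this post.

```yaml
# Fragment of a PersistentVolume spec, not a complete manifest.
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone   # assumes nodes are labeled with the newer key
              operator: In
              values:
                - us-east-2a
```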
Delete PersistentVolume and PersistentVolumeClaim
When a user deletes a PVC that is in use by a running pod, the PVC is not removed immediately: its removal is postponed until no pod is using it.
Similarly, when deleting a PersistentVolume that is bound to a PersistentVolumeClaim, the PV is not removed until that binding is gone, i.e. until its PVC is deleted.
Reclaiming
Documentation is here>>>.
When we are done with a volume, we can delete its PVC from the cluster, which allows the corresponding AWS EBS to be reclaimed.
The reclaim policy of a PersistentVolume tells the cluster what to do with the volume after it has been released from its claim, and can be Retain, Recycle, or Delete.
Retain
The Retain policy allows us to reclaim the resource manually.
After the related PersistentVolumeClaim is deleted, the PersistentVolume is not deleted and is marked as "Released", but it is not yet available for new PersistentVolumeClaims, as it still keeps data from the previous claim.
To make the storage available for the next use, you need to delete the PersistentVolume object from the cluster and create it again.
Delete
With the Delete value, deleting a PVC also removes its corresponding PersistentVolume and the underlying storage device, such as an AWS EBS, GCE PD, or Azure Disk.
Keep in mind that dynamically created volumes inherit the policy from the StorageClass used, which is set to Delete by default.
Recycle
Deprecated. It was used to scrub the data with a basic rm -rf.
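The policies above are set in two different places, which the following sketch illustrates. The first document is the static PV from this post with the policy spelled out explicitly. The second is a StorageClass whose name, gp2-retain, is hypothetical: it shows how dynamically provisioned volumes could be made to keep their data.

```yaml
# Statically created PV: the policy is set on the PV itself.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-static
spec:
  persistentVolumeReclaimPolicy: Retain   # the default for manually created PVs
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2
  awsElasticBlockStore:
    fsType: ext4
    volumeID: vol-0928650905a2491e2
---
# Dynamically provisioned PVs inherit the policy from their StorageClass.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-retain             # hypothetical name
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain          # defaults to Delete when omitted
```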
Deleting PV and PVC — an example
So, we have a pod running:
$ kubectl get pod pv-static-pod
NAME READY STATUS RESTARTS AGE
pv-static-pod 1/1 Running 0 19s
Which is using a PVC:
$ kubectl get pvc pvc-static
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pvc-static Bound pv-static 50Gi RWO gp2 19h
And this PVC is bound to the PV:
$ kubectl get pv pv-static
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pv-static 50Gi RWO Retain Bound default/pvc-static gp2 19h
Our PV has its RECLAIM POLICY set to Retain, so after we drop its PVC and the PV itself, all data must be kept.
Let's check by adding some data:
$ kubectl exec -ti pv-static-pod -- bash
root@pv-static-pod:/# echo Test > /usr/share/nginx/html/test.txt
root@pv-static-pod:/# cat /usr/share/nginx/html/test.txt
Test
Exit from the pod and delete it, and then its PVC:
$ kubectl delete pod pv-static-pod
pod "pv-static-pod" deleted
$ kubectl delete pvc pvc-static
persistentvolumeclaim "pvc-static" deleted
Check the PV’s status:
$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pv-static 50Gi RWO Retain Released default/pvc-static gp2 25s
The STATUS is Released, and at this moment we are not able to attach this volume again via a new PVC.
Let’s check - create a PVC again:
$ kubectl apply -f pvc-static.yaml
persistentvolumeclaim/pvc-static created
Create a pod:
$ kubectl apply -f pv-pod-stat.yaml
pod/pv-static-pod created
And check its PVC status:
$ kubectl get pvc pvc-static
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pvc-static Pending pv-static 0 gp2 59s
The STATUS is Pending.
Delete the pod and the PVC, and this time delete the PersistentVolume too:
$ kubectl delete -f pv-pod-stat.yaml
pod "pv-static-pod" deleted
$ kubectl delete -f pvc-static.yaml
persistentvolumeclaim "pvc-static" deleted
$ kubectl delete -f pv-static.yaml
persistentvolume "pv-static" deleted
Create all over again:
$ kubectl apply -f pv-static.yaml
persistentvolume/pv-static created
$ kubectl apply -f pvc-static.yaml
persistentvolumeclaim/pvc-static created
$ kubectl apply -f pv-pod-stat.yaml
pod/pv-static-pod created
Check the PVC:
$ kubectl get pvc pvc-static
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pvc-static Bound pv-static 50Gi RWO gp2 27s
And check the data we’ve added earlier:
$ kubectl exec -ti pv-static-pod -- cat /usr/share/nginx/html/test.txt
Test
All good - the data is still in its place.
Changing Reclaim Policy for PersistentVolume
Documentation is here>>>.
Currently, our PV has the Retain value:
$ kubectl get pv pv-static -o jsonpath='{.spec.persistentVolumeReclaimPolicy}'
Retain
Apply a patch to update its persistentVolumeReclaimPolicy parameter to the Delete value:
$ kubectl patch pv pv-static -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}'
persistentvolume/pv-static patched
Check it:
$ kubectl get pv pv-static -o jsonpath='{.spec.persistentVolumeReclaimPolicy}'
Delete
Delete the pod and its PVC:
$ kubectl delete -f pv-pod-stat.yaml
pod "pv-static-pod" deleted
$ kubectl delete -f pvc-static.yaml
persistentvolumeclaim "pvc-static" deleted
Check the PersistentVolume:
$ kubectl get pv pv-static
Error from server (NotFound): persistentvolumes "pv-static" not found
And the AWS EBS which was used for this PV:
$ aws ec2 --profile arseniy --region us-east-2 describe-volumes --volume-ids vol-0928650905a2491e2
An error occurred (InvalidVolume.NotFound) when calling the DescribeVolumes operation: The volume 'vol-0928650905a2491e2' does not exist.
Actually, that’s all.
Useful links
- Topology-Aware Volume Provisioning in Kubernetes
- Using preexisting persistent disks as PersistentVolumes
- Persistent volumes with persistent disks
- Kubernetes Persistent Storage: Why, Where and How
- Stateful Containers on Kubernetes using Persistent Volume and Amazon EBS
Originally published at RTFM: Linux, DevOps, and system administration.