Kubernetes Volumes: the definitive guide (Part 2)
Abhishek Gupta
Posted on September 27, 2019
Welcome to yet another part of the "Kubernetes in a Nutshell" blog series which focuses on the "breadth" of Kubernetes and covers fundamental topics such as orchestrating Stateless apps, how to configure Kubernetes apps using ConfigMap etc. I hope you have enjoyed it so far!
This is a continuation of the previous blog which covered the basics of Kubernetes Volumes. In this part, we will notch it up a bit and:
- Learn about PersistentVolume, PersistentVolumeClaim objects and how they work in tandem
- Dive into types of provisioning in Kubernetes - Static, Dynamic
- Learn about Storage Classes and how they power dynamic provisioning
- Explore relevant examples
Pre-requisites:
To follow the example in this blog, you will need the following:
- A Microsoft Azure account - go ahead and sign up for a free one!
- Azure Kubernetes Service (AKS) cluster - this blog will guide you through the process of creating one
- Azure CLI or Azure Cloud Shell - you can either choose to install the Azure CLI if you don't have it already (should be quick!) or just use the Azure Cloud Shell from your browser.
- kubectl to interact with your AKS cluster
The code is available on GitHub
The previous episode....
... concluded with a discussion about "The need for persistent storage", given that the lifecycle of a vanilla Kubernetes Volume is tightly coupled with its Pod, while serious apps need stable, persistent storage which outlasts the Pod or even the Node on which the Pod is running.
Examples of long-term storage media are networked file systems (NFS, Ceph, GlusterFS etc.) or cloud based options, such as Azure Disk, Amazon EBS, GCE Persistent Disk etc.
Here is a snippet that shows how you can mount an NFS (Network File System) into your Pod using the nfs volume type. You can point to an existing NFS instance using the server attribute.
spec:
  volumes:
    - name: app-data
      nfs:
        server: nfs://localhost
        path: "/"
  containers:
    - image: myapp-docker-image
      name: myapp
      volumeMounts:
        - mountPath: /data
          name: app-data
Is this good enough?
In the above Pod manifest, storage info (for NFS) is directly specified in the Pod (using the volumes section). This implies that the developer needs to know the details of the NFS server, including its location. There is definitely scope for improvement here and, like most things in software, it can be done with another level of indirection or abstraction, using the concepts of Persistent Volume and Persistent Volume Claim.
The key idea revolves around "segregation of duties" and decoupling storage creation/management from its requirement/request. This is where PersistentVolumeClaim and PersistentVolume come into play:
- A PersistentVolumeClaim allows a user to request persistent storage in a "declarative" fashion by specifying the requirements (e.g. amount of storage) as a part of the PersistentVolumeClaim spec.
- A PersistentVolume complements the PersistentVolumeClaim and represents the storage medium in the Kubernetes cluster. The actual provisioning of the storage (e.g. creation of an Azure Disk using Azure CLI, Azure portal, etc.) and creation of the PersistentVolume in the cluster is typically done by an administrator or, in the case of Dynamic provisioning, by Kubernetes itself (to be covered later)
In addition to decoupling and segregation of duties, it also provides flexibility and portability. For example, say you have multiple environments like dev, test, production etc. With a PersistentVolumeClaim, you declare the storage requirements once (e.g. "my app needs 5 GB") and switch the actual storage medium depending on the environment, thanks to PersistentVolume - this could be a local disk in the dev env, a standard HDD in test and an SSD in production. The same goes for portability in a multi-cloud scenario, where you could use the same request spec but switch the PersistentVolume as per the cloud provider
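As a rough preview of that idea, here is a minimal sketch of one claim and a dev-only volume that could satisfy it. All names (app-data-claim, app-data-dev, the app-storage class name) and the hostPath directory are hypothetical and only meant for illustration; in test or production you would keep the claim as-is and create a different PersistentVolume (for example one backed by a cloud disk) with the same class name.

```yaml
# The request: stays identical in every environment
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data-claim              # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi                  # "my app needs 5 GB"
  storageClassName: app-storage     # hypothetical class name shared by claim and volume
---
# The dev environment's answer: a directory on the node
apiVersion: v1
kind: PersistentVolume
metadata:
  name: app-data-dev                # hypothetical name
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: app-storage     # must match the claim for binding
  hostPath:
    path: /tmp/app-data             # assumed path, suitable for single-node dev clusters only
```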
The upcoming sections will cover examples to help reinforce these concepts.
Deep dive
PersistentVolumeClaim
A PersistentVolumeClaim is just another Kubernetes object (like Pod, Deployment, ConfigMap etc.). Here is an example:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-volume-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  storageClassName: volume-class
The most important section is the spec, which is a reference to a PersistentVolumeClaimSpec object - this is where you define the storage requirements. The important attributes to focus on are:
- resources - minimum resources that the volume requires
- accessModes - ways the volume can be mounted (valid values are ReadWriteOnce, ReadOnlyMany, ReadWriteMany)
- storageClassName - name of the StorageClass required by the claim (StorageClass is covered in another section)
PersistentVolumeClaim has other attributes - apiVersion, kind, metadata, status. These are common to all Kubernetes objects.
PersistentVolume
This is what a typical PersistentVolume spec looks like:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pvc
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: volume-class
  nfs:
    server: nfs://localhost
    path: "/"
Just like PersistentVolumeClaim, spec (PersistentVolumeSpec object) is the most important part of a PersistentVolume - let's dissect it further:
- provider/storage specific - like nfs, azureDisk, gcePersistentDisk, awsElasticBlockStore etc. which allow you to provide info specific to the storage medium (NFS, Azure Disk etc.)
- accessModes - ways in which the volume can be mounted
- capacity - info about the persistent volume's resources and capacity
- storageClassName - name of the StorageClass to which this persistent volume belongs (StorageClass will be covered soon)
- persistentVolumeReclaimPolicy - what happens when the corresponding PersistentVolumeClaim is deleted - options are Retain, Delete, and Recycle (deprecated)
As homework, please explore the attributes nodeAffinity, volumeMode, mountOptions to determine what role they play
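To get you started on that homework, here is a minimal sketch of an NFS-backed PersistentVolume that sets all three attributes. The volume name, NFS server host name, export path and node name are hypothetical, and the mount options are just commonly used NFS client options; whether mountOptions is honoured depends on the volume type, so treat this as an illustration rather than a recommended configuration.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv-example              # hypothetical name
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem            # mount as a filesystem (the alternative is Block)
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: volume-class
  mountOptions:                     # passed to the mount command on the node
    - hard
    - nfsvers=4.1
  nfs:
    server: nfs.example.internal    # assumed NFS host, replace with your own
    path: /exports/data             # assumed export path
  nodeAffinity:                     # restricts which nodes can use this volume
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - example-node-1    # hypothetical node name
```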
PersistentVolume has other attributes - apiVersion, kind, metadata, status. These are common to all Kubernetes objects.
How do these objects work together?
There are two ways in which you can use these constructs to get storage for your Kubernetes apps - Static and Dynamic.
In the "Static" mode, the user(s) need to take care of provisioning the actual storage (cloud, on-prem, etc.) and then referencing it in the Pod spec (your application)
In the "Dynamic" way, Kubernetes does the heavy lifting of the storage provisioning as well as the creation of the PersistentVolume. All you do is provide your storage requirements by creating and then referencing a PersistentVolumeClaim in the Pod spec
Dynamic provisioning needs to be enabled on the cluster - with most providers, this works out of the box
Let's explore static provisioning
Static provisioning
There are two ways of using static provisioning:
One of them is to provision storage and use its info directly in the Pod spec
I have mentioned this already and this is the last time I will do so (in this context). I also recommend trying out the excellent tutorial on how to "Manually create and use a volume with Azure disks in Azure Kubernetes Service (AKS)". This is what it looks like (and as you've read before, this is convenient but has its limitations)
spec:
  containers:
    - image: nginx
      name: mypod
      volumeMounts:
        - name: azure
          mountPath: /mnt/azure
  volumes:
    - name: azure
      azureDisk:
        kind: Managed
        diskName: myAKSDisk
        diskURI: /subscriptions/<subscriptionID>/resourceGroups/MC_myAKSCluster_myAKSCluster_eastus/providers/Microsoft.Compute/disks/myAKSDisk
In the second approach, instead of creating the disk and providing its details (azureDisk in this case) in the Pod spec, you encapsulate that info in a PersistentVolume. Then you create a PersistentVolumeClaim, reference it from the Pod spec and leave it to Kubernetes to match the storage requirements with what's available
Here is a snippet to give you an idea
spec:
  volumes:
    - name: app-data
      persistentVolumeClaim:
        claimName: data-volume-claim
Think of it as refactoring a piece of logic into its own method - you take a bunch of storage request info and externalize it in the form of a PersistentVolume (analogous to a method).
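To put the snippet in context, a complete Pod manifest using the claim might look roughly like this. It is only a sketch: the Pod name is hypothetical, the image is the same placeholder used in the earlier NFS snippet, and it assumes the data-volume-claim PersistentVolumeClaim from the previous section exists in the same namespace.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp                          # hypothetical Pod name
spec:
  containers:
    - name: myapp
      image: myapp-docker-image        # placeholder image from the earlier snippet
      volumeMounts:
        - name: app-data               # must match the volume name below
          mountPath: /data
  volumes:
    - name: app-data
      persistentVolumeClaim:
        claimName: data-volume-claim   # the claim defined earlier, not the storage details
```

Notice that the Pod no longer contains any NFS or Azure Disk details - those live in the PersistentVolume that gets bound to the claim.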
Dynamic provisioning
As mentioned earlier, with Dynamic Provisioning, you can offload all the heavy lifting to Kubernetes.
One of the key concepts associated with dynamic provisioning is StorageClass
Storage Class
Just like a PersistentVolume encapsulates storage details, a StorageClass provides a way to describe the "classes" of storage. In order to use a StorageClass, all you need to do is reference it from the PersistentVolumeClaim.
Let's understand this practically - here is an example of a StorageClass for an Azure Disk.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  labels:
    kubernetes.io/cluster-service: "true"
  name: default
parameters:
  cachingmode: ReadOnly
  kind: Managed
  storageaccounttype: Standard_LRS
provisioner: kubernetes.io/azure-disk
reclaimPolicy: Delete
volumeBindingMode: Immediate
The key parameters in a StorageClass spec are:
- provisioner - volume plugin (details to follow) which provisions the actual storage
- parameters - custom key value pairs which can be used by the provisioner at runtime
- reclaimPolicy - reclaim policy with which the PersistentVolume is created (with Delete, the PV gets deleted when the PVC gets deleted; with Retain, the PV is kept)
- volumeBindingMode - indicates how PersistentVolumeClaims should be provisioned and bound (valid values are Immediate and WaitForFirstConsumer)
The information in these parameters (and a few others like allowVolumeExpansion, allowedTopologies, mountOptions) is used at runtime to dynamically provision the storage and create the corresponding PersistentVolume.
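For comparison, here is a sketch of a custom StorageClass that flips some of these knobs: it keeps the disk around after the claim is deleted, delays provisioning until a Pod actually uses the claim, and allows the volume to be expanded later. The class name is hypothetical; the provisioner and parameters are the same ones used in the example above.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-retain                     # hypothetical class name
provisioner: kubernetes.io/azure-disk
parameters:
  kind: Managed
  storageaccounttype: Standard_LRS
  cachingmode: ReadOnly
reclaimPolicy: Retain                       # keep the PV (and disk) when the PVC is deleted
volumeBindingMode: WaitForFirstConsumer     # provision only once a Pod using the PVC is scheduled
allowVolumeExpansion: true                  # allow the PVC's storage request to be increased later
```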
StorageClass has other attributes as well - apiVersion, kind, metadata. These are common to all Kubernetes objects.
What is a provisioner?
The provisioner is the heart of dynamic provisioning - it is a plugin that includes custom logic meant to create storage resources of a specific type. Kubernetes ships along with a bunch of provisioners, including cloud based ones such as Azure Disk (kubernetes.io/azure-disk), Azure File (kubernetes.io/azure-file), GCE Persistent Disk, AWS EBS etc.
In the above example, kubernetes.io/azure-disk is being used as the provisioner
The parameters section provides a means of passing information to the provisioner at runtime - this is obviously specific to a provisioner. In the above example, cachingmode, storageaccounttype and kind are passed as parameters to the kubernetes.io/azure-disk provisioner - this allows for a lot of flexibility.
If a parameter is not passed, a default value is used
A note on default Storage Class
A StorageClass can be marked as default such that it is used (for dynamic provisioning) when the storageClassName attribute is not provided in the PersistentVolumeClaim.
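Marking a class as default is done with an annotation on the StorageClass itself. The following is a minimal sketch with a hypothetical class name, reusing the Azure Disk provisioner and parameters from the earlier example. Keep in mind that a cluster should have only one default class, so you would not normally add this alongside an existing default.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: my-default-class                    # hypothetical class name
  annotations:
    # this annotation is what makes the class the cluster default
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/azure-disk
parameters:
  kind: Managed
  storageaccounttype: Standard_LRS
reclaimPolicy: Delete
```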
Azure Kubernetes Service makes dynamic provisioning easy by including two pre-seeded storage classes. You can check this by running the kubectl get storageclass command
NAME                PROVISIONER                AGE
default (default)   kubernetes.io/azure-disk   6d10h
managed-premium     kubernetes.io/azure-disk   6d10h
- default storage class: provisions a standard Azure Disk backed by a Standard HDD
- managed-premium storage class: provisions a premium Azure Disk backed by a Premium SSD
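If you want a claim to use the premium class instead of the default one, you can name it explicitly using storageClassName. Here is a minimal sketch - the claim name is hypothetical and the size is arbitrary.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azure-premium-disk-pvc      # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: managed-premium # explicitly pick the pre-seeded premium class
  resources:
    requests:
      storage: 2Gi
```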
Hands-on: Dynamic provisioning
It's time to try out Dynamic provisioning using Azure Kubernetes Service. You will create a PersistentVolumeClaim and a simple application (Deployment) which references that claim, and see how things work in practice.
If you don't have an Azure account already, now is the time to sign up for a free one and get cracking!
Kubernetes cluster setup
You need a single command to stand up a Kubernetes cluster on Azure. But, before that, we'll have to create a resource group
export AZURE_SUBSCRIPTION_ID="[to be filled]"
export AZURE_RESOURCE_GROUP="[to be filled]"
export AZURE_REGION="[to be filled]"   # e.g. southeastasia
Switch to your subscription and invoke az group create
az account set -s $AZURE_SUBSCRIPTION_ID
az group create -l $AZURE_REGION -n $AZURE_RESOURCE_GROUP
You can now invoke az aks create to create the new cluster
To keep things simple, the below command creates a single node cluster. Feel free to change the specification as per your requirements
export AKS_CLUSTER_NAME="[to be filled]"
az aks create --resource-group $AZURE_RESOURCE_GROUP --name $AKS_CLUSTER_NAME --node-count 1 --node-vm-size Standard_B2s --node-osdisk-size 30 --generate-ssh-keys
Get the AKS cluster credentials using az aks get-credentials - as a result, kubectl will now point to your new cluster. You can confirm this by listing the nodes
az aks get-credentials --resource-group $AZURE_RESOURCE_GROUP --name $AKS_CLUSTER_NAME
kubectl get nodes
If you are interested in learning Kubernetes and Containers using Azure, a good starting point is to use the quickstarts, tutorials and code samples in the documentation to familiarize yourself with the service. I also highly recommend checking out the 50 days Kubernetes Learning Path. Advanced users might want to refer to Kubernetes best practices or watch some of the videos for demos, top features and technical sessions.
Create PersistentVolumeClaim followed by app Deployment
Here is the PersistentVolumeClaim spec which we will use
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azure-disk-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
Notice that the PersistentVolumeClaim does not specify storageClassName - this is to ensure that the default storage class is used for dynamic provisioning.
Create the PersistentVolumeClaim
kubectl apply -f https://raw.githubusercontent.com/abhirockzz/kubernetes-in-a-nutshell/master/volumes-2/azure-disk-pvc.yaml
If you check it, you will see something like this (STATUS = Pending)
kubectl get pvc
NAME             STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
azure-disk-pvc   Pending                                      default        11s
After some time, it should change to STATUS = Bound - this is because the Azure Disk and the PersistentVolume got created automatically
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
azure-disk-pvc   Bound    pvc-7b0e2911-df74-11e9-93ab-025752f370d3   2Gi        RWO            default        36s
You can check the dynamically provisioned PersistentVolume as well - kubectl get pv
Confirm that the Azure Disk has been created
AKS_NODE_RESOURCE_GROUP=$(az aks show --resource-group $AZURE_RESOURCE_GROUP --name $AKS_CLUSTER_NAME --query nodeResourceGroup -o tsv)
az disk list -g $AKS_NODE_RESOURCE_GROUP
The tags section will look something similar to
"tags": {
  "created-by": "kubernetes-azure-dd",
  "kubernetes.io-created-for-pv-name": "pvc-7b0e2911-df74-11e9-93ab-025752f370d3",
  "kubernetes.io-created-for-pvc-name": "azure-disk-pvc",
  "kubernetes.io-created-for-pvc-namespace": "default"
}
Create the app Deployment
kubectl apply -f https://raw.githubusercontent.com/abhirockzz/kubernetes-in-a-nutshell/master/volumes-2/app.yaml
To test it out, we will use a simple Go app. All it does is push log statements to a file logz.out in /mnt/logs - this is the path which is mounted into the Pod
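The actual manifest is the app.yaml in the GitHub repo. Purely as an illustration of how a Deployment can reference the claim, a sketch might look like the following. The Deployment name and app label are taken from the kubectl output in this post, but the container name, image and volume name are assumptions and will differ from the real file.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: logz-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: logz
  template:
    metadata:
      labels:
        app: logz
    spec:
      containers:
        - name: logz                        # assumed container name
          image: example/logz:latest        # hypothetical image, not the one used in the repo
          volumeMounts:
            - name: logs-volume             # assumed volume name
              mountPath: /mnt/logs          # the app writes logz.out here
      volumes:
        - name: logs-volume
          persistentVolumeClaim:
            claimName: azure-disk-pvc       # the claim created in the previous step
```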
Wait for a while for the Pod created by the Deployment to be in Running state
kubectl get pods -l=app=logz
NAME                               READY   STATUS    RESTARTS   AGE
logz-deployment-59b75bc786-wt98d   1/1     Running   0          15s
To confirm, check /mnt/logs/logz.out in the Pod
kubectl exec -it $(kubectl get pods -l=app=logz --output=jsonpath={.items..metadata.name}) -- tail -f /mnt/logs/logz.out
You will see the logs (just the timestamp) every 3 seconds
2019-09-25 09:17:11.960671937 +0000 UTC m=+84.002677518
2019-09-25 09:17:14.961347341 +0000 UTC m=+87.003352922
2019-09-25 09:17:17.960697766 +0000 UTC m=+90.002703347
2019-09-25 09:17:20.960666399 +0000 UTC m=+93.002671980
That brings us to the end of this two-part series on Kubernetes Volumes. How did you find this article? Did you learn something from it? Did it help solve a problem, resolve that lingering query you had? Or maybe it needs improvement? Please provide your feedback - it's really valuable and I would highly appreciate it! You can reach out via Twitter or just drop in a comment right below to start a discussion.
As I mentioned earlier, this was a sub-part of the larger series of blogs in "Kubernetes in a Nutshell" and there is more to come! Please don't forget to like and follow.