Kafka on Kubernetes, the Strimzi way! (Part 4)

Welcome to part four of this blog series! So far, we have a Kafka single-node cluster with TLS encryption on top of which we configured different authentication modes (TLS and SASL SCRAM-SHA-512), defined users with the User Operator, connected to the cluster using CLI and Go clients and saw how easy it is to manage Kafka topics with the Topic Operator. So far, our cluster used ephemeral persistence, which in the case of a single-node cluster, means that we will lose data if the Kafka or Zookeeper nodes (Pods) are restarted due to any reason.

Let's march on! In this part we will cover:

How to configure Strimzi to add persistence for our cluster:
Explore the components such as PersistentVolume and PersistentVolumeClaim
How to modify the storage quality
Try and expand the storage size for our Kafka cluster

The code is available on GitHub - https://github.com/abhirockzz/kafka-kubernetes-strimzi/

What do I need to go through this tutorial?

kubectl - https://kubernetes.io/docs/tasks/tools/install-kubectl/

I will be using Azure Kubernetes Service (AKS) to demonstrate the concepts, but by and large it is independent of the Kubernetes provider. If you want to use AKS, all you need is a Microsoft Azure account which you can get for FREE if you don't have one already.

I will not be repeating some of the common sections (such as Installation/Setup (Helm, Strimzi, Azure Kubernetes Service), Strimzi overview) in this or subsequent part of this series and would request you to refer to part one

Add persistence

We will start off by creating a persistent cluster. Here is a snippet of the specification (you can access the complete YAML on GitHub)

apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: my-kafka-cluster
spec:
  kafka:
    version: 2.4.0
    replicas: 1
    storage:
      type: persistent-claim
      size: 2Gi
      deleteClaim: true
....
  zookeeper:
    replicas: 1
    storage:
      type: persistent-claim
      size: 1Gi
      deleteClaim: true

The key things to notice:

storage.type is persistent-claim (as opposed to ephemeral) in previous examples
storage.size for Kafka and Zookeeper nodes is 2Gi and 1Gi respectively
deleteClaim: true means that the corresponding PersistentVolumeClaims will be deleted when the cluster is deleted/un-deployed

You can take a look at the reference for storage https://strimzi.io/docs/operators/master/using.html#type-PersistentClaimStorage-reference

To create the cluster:

kubectl apply -f https://raw.githubusercontent.com/abhirockzz/kafka-kubernetes-strimzi/master/part-4/kafka-persistent.yaml

Let's see the what happens in response to the cluster creation

Strimzi Kubernetes magic...

Strimzi does all the heavy lifting of creating required Kubernetes resources in order to operate the cluster. We covered most of these in part 1 - StatefulSet (and Pods), LoadBalancer Service, ConfigMap, Secret etc. In this blog, we will just focus on the persistence related components - PersistentVolume and PersistentVolumeClaim

If you're using Azure Kubernetes Service (AKS), this will create an Azure Managed Disk - more on this soon

To check the PersistentVolumeClaims

kubectl get pvc


NAME                                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-my-kafka-cluster-kafka-0       Bound    pvc-b4ece32b-a46c-4fbc-9b58-9413eee9c779   2Gi        RWO            default        94s
data-my-kafka-cluster-zookeeper-0   Bound    pvc-d705fea9-c443-461c-8d18-acf8e219eab0   1Gi        RWO            default        3m20s

... and the PersistentVolumes they are Bound to

kubectl get pv


NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                       STORAGECLASS   REASON   AGE
pvc-b4ece32b-a46c-4fbc-9b58-9413eee9c779   2Gi        RWO            Delete           Bound    default/data-my-kafka-cluster-kafka-0       default                 107s
pvc-d705fea9-c443-461c-8d18-acf8e219eab0   1Gi        RWO            Delete           Bound    default/data-my-kafka-cluster-zookeeper-0   default                 3m35s

Notice that the disk size is as specified in the manifest ie. 2 and 1 Gib for Kafka and Zookeeper respectively

Where is the data?

If we want to see the data itself, let's first check the ConfigMap which stores the Kafka server config:

export CLUSTER_NAME=my-kafka-cluster
kubectl get configmap/${CLUSTER_NAME}-kafka-config -o yaml

In server.config section, you will find an entry as such:

##########
# Kafka message logs configuration
##########
log.dirs=/var/lib/kafka/data/kafka-log${STRIMZI_BROKER_ID}

This tells us that the Kafka data is stored in /var/lib/kafka/data/kafka-log${STRIMZI_BROKER_ID}. In this case STRIMZI_BROKER_ID is 0 since we all we have is a single node

With this info, let's look the the Kafka Pod:

export CLUSTER_NAME=my-kafka-cluster
kubectl get pod/${CLUSTER_NAME}-kafka-0 -o yaml

If you look into the kafka container section, you will notice the following:

One of the volumes configuration:

  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-my-kafka-cluster-kafka-0

The volume named data is associated with the data-my-kafka-cluster-kafka-0 PVC, and the corresponding volumeMounts uses this volume to ensure that Kafka data is stored in /var/lib/kafka/data

    volumeMounts:
    - mountPath: /var/lib/kafka/data
      name: data

To see the contents,

export STRIMZI_BROKER_ID=0
kubectl exec -it my-kafka-cluster-kafka-0 -- ls -lrt /var/lib/kafka/data/kafka-log${STRIMZI_BROKER_ID}

You can repeat the same for Zookeeper node as well

...what about the Cloud?

As mentioned before, in case of AKS, the data will end up being stored in an Azure Managed Disk. The type of disk is as per the default storage class in your AKS cluster. In my case, it is:

kubectl get sc

azurefile           kubernetes.io/azure-file   58d
azurefile-premium   kubernetes.io/azure-file   58d
default (default)   kubernetes.io/azure-disk   2d18h
managed-premium     kubernetes.io/azure-disk   2d18h

//to get details of the storage class
kubectl get sc/default -o yaml

More on the semantics for default storage class in AKS in the documentation

To query the disk in Azure, extract the PersistentVolume info using kubectl get pv/<name of kafka pv> -o yaml and get the ID of the Azure Disk i.e. spec.azureDisk.diskURI

You can use the Azure CLI command az disk show command

az disk show --ids <diskURI value>

You will see that the storage type as defined in sku section is StandardSSD_LRS which corresponds to a Standard SSD

This table provides a comparison of different Azure Disk types

  "sku": {
    "name": "StandardSSD_LRS",
    "tier": "Standard"
  }

... and the tags attribute highlight the PV and PVC association

  "tags": {
    "created-by": "kubernetes-azure-dd",
    "kubernetes.io-created-for-pv-name": "pvc-b4ece32b-a46c-4fbc-9b58-9413eee9c779",
    "kubernetes.io-created-for-pvc-name": "data-my-kafka-cluster-kafka-0",
    "kubernetes.io-created-for-pvc-namespace": "default"
  }

You can repeat the same for Zookeeper disks as well

Quick test ...

Follow these steps to confirm that the cluster is working as expected..

Create a producer Pod:

export KAFKA_CLUSTER_NAME=my-kafka-cluster

kubectl run kafka-producer -ti --image=strimzi/kafka:latest-kafka-2.4.0 --rm=true --restart=Never -- bin/kafka-console-producer.sh --broker-list $KAFKA_CLUSTER_NAME-kafka-bootstrap:9092 --topic my-topic

In another terminal, create a consumer Pod:

export KAFKA_CLUSTER_NAME=my-kafka-cluster

kubectl run kafka-consumer -ti --image=strimzi/kafka:latest-kafka-2.4.0 --rm=true --restart=Never -- bin/kafka-console-consumer.sh --bootstrap-server $KAFKA_CLUSTER_NAME-kafka-bootstrap:9092 --topic my-topic --from-beginning

What if(s) ...

Let's explore how to tackle a couple of requirements which you'll come across:

Using a different storage type - In case of Azure for example, you might want to use Azure Premium SSD for production workloads
Re-sizing the storage - at some point you'll want to add storage to your Kafka cluster

Change the storage type

Recall that the default behavior is for Strimzi to create a PersistentVolumeClaim that references the default Storage Class. To customize this, you can simply include the class attribute in the storage specification in spec.kafka (and/or spec.zookeeper).

In Azure, the managed-premium storage class corresponds to a Premium SSD: kubectl get sc/managed-premium -o yaml

Here is a snippet from the storage config, where class: managed-premium has been added.

    storage:
      type: persistent-claim
      size: 2Gi
      deleteClaim: true
      class: managed-premium

Please note that you cannot update the storage type for an existing cluster. To try this out:

Delete the existing cluster - kubectl delete kafka/my-kafka-cluster (wait for a while)
Create a new cluster - kubectl apply -f https://raw.githubusercontent.com/abhirockzz/kafka-kubernetes-strimzi/master/part-4/kafka-persistent-premium.yaml

//Delete the existing cluster
kubectl delete kafka/my-kafka-cluster

//Create a new cluster
kubectl apply -f https://raw.githubusercontent.com/abhirockzz/kafka-kubernetes-strimzi/master/part-4/kafka-persistent-premium.yaml

To confirm, check the PersistentVolumeClain for Kafka node - notice the STORAGECLASS colum

kubectl get pvc/data-my-kafka-cluster-kafka-0

NAME                            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
data-my-kafka-cluster-kafka-0   Bound    pvc-3f46d6ed-9da5-4c49-87ef-86684ab21cf8   2Gi        RWO            managed-premium   21s

We only configured the Kafka broker to use the Premium storage, so the Zookeeper Pod will use the StandardSSD storage type.

Re-size storage (TL;DR - does not work yet)

Azure Disks allow you to add more storage to it. In the case of Kubernetes, it is the storage class which defines whether this is supported or not - for AKS, if you check the default (or the managed-premium) storage class, you will notice the property allowVolumeExpansion: true, which confirms that you can do so in the context of Kubernetes PVC as well.

Strimzi makes it really easy to increase the storage for our Kafka cluster - all you need to do is update the storage.size field to the desired value

Check the PVC now: kubectl describe pvc data-my-kafka-cluster-kafka-0

Conditions:
  Type       Status  LastProbeTime                     LastTransitionTime                Reason  Message
  ----       ------  -----------------                 ------------------                ------  -------
  Resizing   True    Mon, 01 Jan 0001 00:00:00 +0000   Mon, 22 Jun 2020 23:15:26 +0530           
Events:
  Type     Reason              Age                From           Message
  ----     ------              ----               ----           -------
  Warning  VolumeResizeFailed  3s (x11 over 13s)  volume_expand  error expanding volume "default/data-my-kafka-cluster-kafka-0" of plugin "kubernetes.io/azure-disk": compute.DisksClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="OperationNotAllowed" Message="Cannot resize disk kubernetes-dynamic-pvc-3f46d6ed-9da5-4c49-87ef-86684ab21cf8 while it is attached to running VM /subscriptions/9a42a42f-ae42-4242-b6a7-dda0ea91d342/resourceGroups/mc_my-k8s-vk_southeastasia/providers/Microsoft.Compute/virtualMachines/aks-agentpool-42424242-1. Resizing a disk of an Azure Virtual Machine requires the virtual machine to be deallocated. Please stop your VM and retry the operation."

Notice the "Cannot resize disk... error message. This is happening because the Azure Disk is currently attached with AKS cluster node and that is because of the Pod is associated with the PersistentVolumeClaim - this is a documented limitation

I am not the first one to run into this problem of course. Please refer to issues such as this one for details.

There are workarounds but they have not been discussed in this blog. I included the section since I wanted you to be aware of this caveat

Final countdown ...

We want to leave on a high note, don't we? Alright, so to wrap it up, let's scale our cluster out from one to three nodes. It'd dead simple!

All you need to do is to increase the replicas to the desired number - in this case, I configured it to 3 (for Kafka and Zookeeper)

...
spec:
  kafka:
    version: 2.4.0
    replicas: 3
  zookeeper:
    replicas: 3
...

In addition to this, I also added an external load balancer listener (this will create an Azure Load Balancer, as discussed in part 2)

...
    listeners:
      plain: {}
      external:
        type: loadbalancer
...

To create the new, simply use the new manifest

kubectl apply -f https://raw.githubusercontent.com/abhirockzz/kafka-kubernetes-strimzi/master/part-4/kafka-persistent-multi-node.yaml

Please note that the overall cluster readiness will take time since there will be additional components (Azure Disks, Load Balancer public IPs etc.) that'll be created prior to the Pods being activated

In your k8s cluster, you will see...

Three Pods each for Kafka and Zookeeper

kubectl get pod -l=app.kubernetes.io/instance=my-kafka-cluster

NAME                           READY   STATUS    RESTARTS   AGE
my-kafka-cluster-kafka-0       2/2     Running   0          54s
my-kafka-cluster-kafka-1       2/2     Running   0          54s
my-kafka-cluster-kafka-2       2/2     Running   0          54s
my-kafka-cluster-zookeeper-0   1/1     Running   0          4m44s
my-kafka-cluster-zookeeper-1   1/1     Running   0          4m44s
my-kafka-cluster-zookeeper-2   1/1     Running   0          4m44s

Three pairs (each for Kafka and Zookeeper) of PersistentVolumeClaims ...

kubectl get pvc

NAME                                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
data-my-kafka-cluster-kafka-0       Bound    pvc-0f52dee1-970a-4c55-92bd-a97dcc41aee6   3Gi        RWO            managed-premium   10m
data-my-kafka-cluster-kafka-1       Bound    pvc-f8b613cb-3da0-4932-acea-7e5e96df1433   3Gi        RWO            managed-premium   4m24s
data-my-kafka-cluster-kafka-2       Bound    pvc-fedf431c-d87a-4bf7-80d0-d43b1337c079   3Gi        RWO            managed-premium   4m24s
data-my-kafka-cluster-zookeeper-0   Bound    pvc-1fda3714-3c37-428f-9e4b-bdb5da71cda6   1Gi        RWO            default           12m
data-my-kafka-cluster-zookeeper-1   Bound    pvc-702556e0-890a-4c07-ae5c-e2354d74d006   1Gi        RWO            default           6m42s
data-my-kafka-cluster-zookeeper-2   Bound    pvc-176ffd68-7e3a-4e04-abb1-52c54dcb84f0   1Gi        RWO            default           6m42s

... and the respective PersistentVolumes they are bound to

NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                                       STORAGECLASS      REASON   AGE
pvc-0f52dee1-970a-4c55-92bd-a97dcc41aee6   3Gi        RWO            Delete           Bound       default/data-my-kafka-cluster-kafka-0       managed-premium            12m
pvc-176ffd68-7e3a-4e04-abb1-52c54dcb84f0   1Gi        RWO            Delete           Bound       default/data-my-kafka-cluster-zookeeper-2   default                    8m45s
pvc-1fda3714-3c37-428f-9e4b-bdb5da71cda6   1Gi        RWO            Delete           Bound       default/data-my-kafka-cluster-zookeeper-0   default                    14m
pvc-702556e0-890a-4c07-ae5c-e2354d74d006   1Gi        RWO            Delete           Bound       default/data-my-kafka-cluster-zookeeper-1   default                    8m45s
pvc-f8b613cb-3da0-4932-acea-7e5e96df1433   3Gi        RWO            Delete           Bound       default/data-my-kafka-cluster-kafka-1       managed-premium            6m27s
pvc-fedf431c-d87a-4bf7-80d0-d43b1337c079   3Gi        RWO            Delete           Bound       default/data-my-kafka-cluster-kafka-2       managed-premium            6m22s

... and Load Balancer IPs. Notice that these are created for each Kafka broker as well as a bootstrap IP which is recommended when connecting from client applications.

kubectl get svc -l=app.kubernetes.io/instance=my-kafka-cluster

NAME                                        TYPE           CLUSTER-IP     EXTERNAL-IP      PORT(S)                      AGE
my-kafka-cluster-kafka-0                    LoadBalancer   10.0.11.154    40.119.248.164   9094:30977/TCP               10m
my-kafka-cluster-kafka-1                    LoadBalancer   10.0.146.181   20.43.191.219    9094:30308/TCP               10m
my-kafka-cluster-kafka-2                    LoadBalancer   10.0.223.202   40.119.249.20    9094:30313/TCP               10m
my-kafka-cluster-kafka-bootstrap            ClusterIP      10.0.208.187   <none>           9091/TCP,9092/TCP            16m
my-kafka-cluster-kafka-brokers              ClusterIP      None           <none>           9091/TCP,9092/TCP            16m
my-kafka-cluster-kafka-external-bootstrap   LoadBalancer   10.0.77.213    20.43.191.238    9094:31051/TCP               10m
my-kafka-cluster-zookeeper-client           ClusterIP      10.0.3.155     <none>           2181/TCP                     18m
my-kafka-cluster-zookeeper-nodes            ClusterIP      None           <none>           2181/TCP,2888/TCP,3888/TCP   18m

To access the cluster, you can use the steps outlined in part 2

It's a wrap!

That's it for this blog series on which covered some of the aspects of running Kafka on Kubernetes using the open source Strimzi operator.

If this topic is of interest to you, I encourage you to check out other solutions such as Confluent operator and Banzai Cloud Kafka operator

Blog