Testing Kubernetes with Kuttl
Andrew Minkin
Posted on December 20, 2022
Automated testing is the only way to be sure that your code works. Enabling automated testing can be hard and we say a lot of tools to write automated tests in the industry since the beginning. Some veterans in the industry may remember Selenium, Cucumber frameworks that help automate testing in the browser. However, testing in Kubernetes can be hard.
In Percona we deal with Kubernetes and have different operators to automate the management of databases. It requires testing. A lot of testing. We have different frameworks to help us with it
- Codecept.js to write UI tests for PMM. Also, we use a playwright for some cases.
- We have tools to help us with API testing as well as automating some routines by running bash commands during the test step.
However, those frameworks are not applicable to test Kubernetes workloads as well as Kubernetes operators. I've been working in the PMM integrations team for six months and saw different approaches to automate testing for PMM/DBaaS. We have a Go test library with wrappers around kubectl and codecept.js for end-to-end tests for the User Interface.
What challenges do we have?
Well, to be sure that a database cluster creation works we need to automate the following steps
- Installation of operators to Kubernetes cluster
- Test integration with version service to respect compatibility matrix.
- Create a database cluster and wait once it'll be available
- Do some assertions against Kubernetes as well as UI.
The main pain point here is that we need to wait up to 10-15 minutes for each step and we can't have different test cases to cover as many cases as we can. Yet we can achieve some performance benefits by paralleling workloads, still, it requires learning Javascript and testing framework to work with it. We had some architectural changes recently and moved from our custom gRPC API to create and manage database clusters to an operator that runs on top of other operators and converts K8s' Custom Resource from generic format to operator specific. We had a couple of options for this new project and after research, we chose kuttl as a framework for integration/e2e testing.
What is KUTTL anyway and why should I care?
KUTTL is the KUbernetes Test TooL. It's written in Go and provides a declarative way to test Kubernetes operators using Kubernetes primitives. It's easy to start kuttling. Let's take a deeper look. I'll use dbaas-operator as an example. DBaaS-operator is an operator that has a simple and generic Custom Resource Definition available to create Percona Server MongoDB or Percona XtraDB Cluster instances in Kubernetes. It uses underlying operators as a dependency. We have the following structure
e2e-tests
├── kind.yml
├── kuttl-eks.yml
├── kuttl.yml
└── tests
└── pxc
├── 00-assert.yaml
├── 00-deploy-operators.yaml
├── 01-assert.yaml
├── 01-deploy-pxc.yaml
├── 02-assert.yaml
├── 02-upgrade-pxc.yaml
├── 03-assert.yaml
├── 03-restart-pxc.yaml
├── 04-delete-cluster.yaml
├── 05-assert.yaml
├── 05-create-cluster.yaml
├── 06-assert.yaml
├── 06-scale-up-pxc.yaml
├── 07-assert.yaml
├── 07-scale-down-pxc.yaml
└── 08-delete-cluster.yaml
2 directories, 19 files
Let's discuss these YAML files more
- kind.yml contains settings to run Kind
- kuttl.yml has all the required settings for the Kuttl framework and kuttl-eks.yml has some EKS-specific configurations
- tests folder has test steps and assertions
Kind and KUTTL settings
Let's discuss Kind and kuttl settings and I'll start with KUTTL first
apiVersion: kuttl.dev/v1beta1
kind: TestSuite
kindConfig: e2e-tests/kind.yml # Path to Kind config that will be used to create Kind clusters
crdDir: config/crd/bases # Path to a directory that contains CRD files. Kuttl will apply them before running tests
artifactsDir: /tmp/ # Path to a directory to store artifacts such as logs and other information
testDirs:
- e2e-tests/tests # Path to directories that have test steps
Kind config is quite easy
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
containerdConfigPatches:
- |-
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."localhost:5000"]
endpoint = ["http://kind-registry:5000"]
The aforementioned config will use a local registry and will create 3 k8s worker nodes controlled by the control plane
Writing tests
At the first glance, kuttling can be easy because it uses Kubernetes primitives as a test step and assertion but I had a couple of problems testing my operator. Let's take a look at a couple of examples. Since dbaas-operator depends on the PXC operator we need to prepare our environment for testing. Let's write the first test that installs the PXC operator and ensures that it was installed.
cat e2e-tests/tests/pxc/00-deploy-pxc-operator.yml
apiVersion: kuttl.dev/v1beta1
kind: TestStep
timeout: 10 # Timeout for the test step
commands:
- command: kubectl apply -f https://raw.githubusercontent.com/percona/percona-xtradb-cluster-operator/v${PXC_OPERATOR_VERSION}/deploy/bundle.yaml -n "${NAMESPACE}"
KUTTL test steps easily extensible with commands. One can run even scripts as a prerequisite for a test case. PXC operator installs CRDs and creates a deployment and here's an example of assertion.
cat e2e-tests/tests/pxc/00-assert.yml
apiVersion: kuttl.dev/v1beta1
kind: TestAssert
timeout: 120 # Timeout waiting for the state
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
name: perconaxtradbclusters.pxc.percona.com
spec:
group: pxc.percona.com
names:
kind: PerconaXtraDBCluster
listKind: PerconaXtraDBClusterList
plural: perconaxtradbclusters
shortNames:
- pxc
- pxcs
singular: perconaxtradbcluster
scope: Namespaced
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
name: databaseclusters.dbaas.percona.com
spec:
group: dbaas.percona.com
names:
kind: DatabaseCluster
listKind: DatabaseClusterList
plural: databaseclusters
shortNames:
- db
singular: databasecluster
scope: Namespaced
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: percona-xtradb-cluster-operator
status:
availableReplicas: 1
observedGeneration: 1
readyReplicas: 1
replicas: 1
updatedReplicas: 1
Our first test is ready and one needs to run kubectl kuttl test --config ./e2e-tests/kuttl.yml
to run kuttl.
More advanced tests
We need to run our operator first to be able to work with resources and test it. KUTTL recomends configure it via TestSuite
by the following example
apiVersion: kuttl.dev/v1beta1
kind: TestSuite
...
commands:
- command: ./bin/manager
...
However, since dbaas-operator depends on underlying operators it needs to work correctly even if they are not present in a Kubernetes cluster. It has the following logic in the controller
// SetupWithManager sets up the controller with the Manager.
func (r *DatabaseReconciler) SetupWithManager(mgr ctrl.Manager) error {
fmt.Println(os.Getenv("WATCH_NAMESPACE"))
unstructuredResource := &unstructured.Unstructured{}
unstructuredResource.SetGroupVersionKind(schema.GroupVersionKind{
Group: "apiextensions.k8s.io",
Kind: "CustomResourceDefinition",
Version: "v1",
})
controller := ctrl.NewControllerManagedBy(mgr).
For(&dbaasv1.DatabaseCluster{})
err := r.Get(context.Background(), types.NamespacedName{Name: pxcCRDName}, unstructuredResource)
if err == nil {
if err := r.addPXCToScheme(r.Scheme); err == nil {
controller.Owns(&pxcv1.PerconaXtraDBCluster{})
}
}
err = r.Get(context.Background(), types.NamespacedName{Name: psmdbCRDName}, unstructuredResource)
if err == nil {
if err := r.addPSMDBToScheme(r.Scheme); err == nil {
controller.Owns(&psmdbv1.PerconaServerMongoDB{})
}
}
return controller.Complete(r)
}
The controller.Owns
sets up a controller to watch specified resources and once they were changed it'll run a reconciliation loop to sync changes. Also, it checks that operator is present in the cluster by checking that deployment and CRDs are available. It means that to make the operator work correctly in tests we need to choose from the following options
- Restart operator once the upstream operator was installed by sending
HUP
signal - Run the operator only after the underlying operator is present in a cluster
Hence, I moved the command as the next step before creating a cluster. You can see it below
cat e2e-tests/tests/pxc/01-deploy-pxc.yml
apiVersion: kuttl.dev/v1beta1
kind: TestStep
timeout: 10
commands:
- script: WATCH_NAMESPACE=$NAMESPACE ../../../bin/manager
background: true
---
apiVersion: dbaas.percona.com/v1
kind: DatabaseCluster
metadata:
name: test-cluster
spec:
databaseType: pxc
databaseImage: percona/percona-xtradb-cluster:8.0.23-14.1
databaseConfig: |
[mysqld]
wsrep_provider_options="debug=1;gcache.size=1G"
secretsName: pxc-sample-secrets
clusterSize: 1
loadBalancer:
type: haproxy
exposeType: ClusterIP
size: 1
image: percona/percona-xtradb-cluster-operator:1.11.0-haproxy
dbInstance:
cpu: "1"
memory: 1G
diskSize: 15G
Note: command
supports only simple commands and does not fully support env variables. It supports only $NAMESPACE, $PATH and $HOME. However, script
solves the problem of setting WATCH_NAMESPACE
environment variable.
In nutshell, the test step above does two things:
- Runs the operator
- Creates a database cluster
The assertion checks that kubernetes cluster has the DatabaseCluster
object with ready
status as well as PXC cluster with the same status.
apiVersion: kuttl.dev/v1beta1
kind: TestAssert
timeout: 600
---
apiVersion: dbaas.percona.com/v1
kind: DatabaseCluster
metadata:
name: test-cluster
spec:
databaseType: pxc
databaseImage: percona/percona-xtradb-cluster:8.0.23-14.1
databaseConfig: |
[mysqld]
wsrep_provider_options="debug=1;gcache.size=1G"
secretsName: pxc-sample-secrets
clusterSize: 1
loadBalancer:
type: haproxy
exposeType: ClusterIP
size: 1
image: percona/percona-xtradb-cluster-operator:1.11.0-haproxy
dbInstance:
cpu: "1"
memory: 1G
diskSize: 15G
---
apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBCluster
metadata:
name: test-cluster
spec:
allowUnsafeConfigurations: true
crVersion: 1.11.0
haproxy:
enabled: true
image: percona/percona-xtradb-cluster-operator:1.11.0-haproxy
serviceType: ClusterIP
size: 1
pxc:
configuration: |
[mysqld]
wsrep_provider_options="debug=1;gcache.size=1G"
expose: {}
image: percona/percona-xtradb-cluster:8.0.23-14.1
livenessProbes: {}
readinessProbes: {}
resources:
requests:
cpu: "1"
memory: 1G
serviceType: ClusterIP
sidecarResources: {}
size: 1
volumeSpec:
persistentVolumeClaim:
resources:
requests:
storage: 15G
secretsName: pxc-sample-secrets
updateStrategy: SmartUpdate
upgradeOptions:
apply: 8.0-recommended
schedule: 0 4 * * *
status:
ready: 2
size: 2
state: ready
Caveats and notes
I had problems running tests in Kind. They were flaky because PXC operator can't expose metrics and had problems with the liveness probe. I haven't figured out how to fix it but as a workaround, I use minikube to run tests
minikube start --nodes=4 --cpus=2 --memory=4g --apiserver-names host.docker.internal --kubernetes-version=v1.23.6
minikube kubectl -- config view --flatten --minify > ~/.kube/test-minikube
KUBECONFIG=~/.kube/test-minikube kubectl kuttl test --config ./e2e-tests/kuttl.yml
Further steps
There's always room for improvement and I have these steps in mind
- Use docker images and OLM bundles as a way to run the operator for tests. This will be the best way to simulate a production-like environment.
- Add more advanced tests for database clusters such as running queries, trying loading data as well as capacity testing. It's easily achievable with kuttl
Posted on December 20, 2022
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.
Related
November 29, 2024