Migration from Calico CNI to Cilium CNI in BareMetal Kubernetes Cluster and Monitoring traffic using Hubble UI
AMIT CHATURVEDI
Posted on February 6, 2024
Verify the currently running CNI
root@devmaster:~# kubectl get pods -n kube-system | grep calico
calico-kube-controllers-5dd4b7dfd9-j7mfj 1/1 Running 6 (46d ago) 51d
calico-node-lctql 1/1 Running 3 (46d ago) 71d
calico-node-nx7lx 1/1 Running 0 2m34s
calico-node-xdmm7 1/1 Running 2 (46d ago) 71d
Introduction
In the realm of Kubernetes networking, the choice of Container Network Interface (CNI) plays a crucial role in determining performance, security, and scalability. While Calico has long been a popular choice for networking solutions, Cilium has emerged as a compelling alternative, offering advanced features and robust performance enhancements. In this guide, we'll explore the process of migrating from Calico to Cilium CNI, highlighting the benefits and steps involved in making the transition.
Why Migrate from Calico to Cilium?
Before delving into the migration process, it's essential to understand why organizations might consider shifting from Calico to Cilium CNI. While Calico provides solid networking capabilities, Cilium offers several key advantages:
Enhanced Performance: Cilium leverages eBPF (extended Berkeley Packet Filter) technology to provide efficient packet processing, resulting in lower latency and improved throughput compared to traditional networking solutions.
Advanced Security: Cilium offers powerful security features, including Layer 7 application-aware security policies, transparent encryption, and network visibility, enabling organizations to strengthen their defense against cyber threats.
Native Integration with Service Mesh: Cilium seamlessly integrates with popular service mesh solutions like Istio, enabling enhanced observability, traffic management, and security within Kubernetes environments.
Rich Feature Set: Cilium provides a wide range of features, including network policy enforcement, load balancing, DNS-based service discovery, and more, empowering organizations to build highly resilient and scalable infrastructure.
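As an illustration of the Layer 7 policies mentioned above, the manifest below allows only HTTP GET requests between two hypothetical workloads (the `app=myapp` and `app=frontend` labels are placeholders, not from this cluster):

```shell
# Write an example L7 policy to disk; apply later with:
#   kubectl apply -f l7-allow-get.yaml
cat <<EOF > l7-allow-get.yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-get-only
spec:
  endpointSelector:
    matchLabels:
      app: myapp          # placeholder label for the protected workload
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: frontend     # placeholder label for the allowed client
    toPorts:
    - ports:
      - port: "80"
        protocol: TCP
      rules:
        http:
        - method: "GET"   # other verbs (POST, DELETE, ...) are denied
EOF
```

A traditional Kubernetes NetworkPolicy stops at ports and IPs; this kind of HTTP-method filtering is only possible because Cilium inspects traffic at Layer 7.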
Migration Process
Now, let's dive into the steps involved in migrating from Calico to Cilium CNI.
Install the Cilium CLI binary on a Linux machine
CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
CLI_ARCH=amd64
if [ "$(uname -m)" = "aarch64" ]; then CLI_ARCH=arm64; fi
curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
sha256sum --check cilium-linux-${CLI_ARCH}.tar.gz.sha256sum
sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin
rm cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
Install Cilium
root@devmaster:~# cilium install --version 1.15.0
Verify the Cilium installation
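The Cilium CLI reports the health of the agent and operator; with `--wait` it blocks until all components are ready (output omitted here):

```shell
cilium status --wait
```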
Verify the Cilium Connectivity
root@devmaster:~# cilium connectivity test
ℹ️ Monitor aggregation detected, will skip some flow validation steps
✨ [kubernetes] Creating namespace cilium-test for connectivity check...
✨ [kubernetes] Deploying echo-same-node service...
✨ [kubernetes] Deploying DNS test server configmap...
✨ [kubernetes] Deploying same-node deployment...
✨ [kubernetes] Deploying client deployment...
✨ [kubernetes] Deploying client2 deployment...
✨ [kubernetes] Deploying client3 deployment...
✨ [kubernetes] Deploying echo-other-node service...
✨ [kubernetes] Deploying other-node deployment...
✨ [host-netns] Deploying kubernetes daemonset...
✨ [host-netns-non-cilium] Deploying kubernetes daemonset...
ℹ️ Skipping tests that require a node Without Cilium
⌛ [kubernetes] Waiting for deployment cilium-test/client to become ready...
⌛ [kubernetes] Waiting for deployment cilium-test/client2 to become ready...
⌛ [kubernetes] Waiting for deployment cilium-test/echo-same-node to become ready...
⌛ [kubernetes] Waiting for deployment cilium-test/client3 to become ready...
⌛ [kubernetes] Waiting for deployment cilium-test/echo-other-node to become ready...
⌛ [kubernetes] Waiting for pod cilium-test/client-65847bf96-ctw2m to reach DNS server on cilium-test/echo-same-node-56dfd8bd85-hd72q pod...
⌛ [kubernetes] Waiting for pod cilium-test/client2-85585bdd-cjvnw to reach DNS server on cilium-test/echo-same-node-56dfd8bd85-hd72q pod...
⌛ [kubernetes] Waiting for pod cilium-test/client3-54d97dc775-rzlm6 to reach DNS server on cilium-test/echo-same-node-56dfd8bd85-hd72q pod...
⌛ [kubernetes] Waiting for pod cilium-test/client3-54d97dc775-rzlm6 to reach DNS server on cilium-test/echo-other-node-7b76c5bbf9-rzt5f pod...
⌛ [kubernetes] Waiting for pod cilium-test/client-65847bf96-ctw2m to reach DNS server on cilium-test/echo-other-node-7b76c5bbf9-rzt5f pod...
⌛ [kubernetes] Waiting for pod cilium-test/client2-85585bdd-cjvnw to reach DNS server on cilium-test/echo-other-node-7b76c5bbf9-rzt5f pod...
⌛ [kubernetes] Waiting for pod cilium-test/client-65847bf96-ctw2m to reach default/kubernetes service...
⌛ [kubernetes] Waiting for pod cilium-test/client2-85585bdd-cjvnw to reach default/kubernetes service...
⌛ [kubernetes] Waiting for pod cilium-test/client3-54d97dc775-rzlm6 to reach default/kubernetes service...
⌛ [kubernetes] Waiting for Service cilium-test/echo-other-node to become ready...
⌛ [kubernetes] Waiting for Service cilium-test/echo-other-node to be synchronized by Cilium pod kube-system/cilium-cwrxr
⌛ [kubernetes] Waiting for Service cilium-test/echo-other-node to be synchronized by Cilium pod kube-system/cilium-jwpck
⌛ [kubernetes] Waiting for Service cilium-test/echo-same-node to become ready...
⌛ [kubernetes] Waiting for Service cilium-test/echo-same-node to be synchronized by Cilium pod kube-system/cilium-cwrxr
⌛ [kubernetes] Waiting for Service cilium-test/echo-same-node to be synchronized by Cilium pod kube-system/cilium-jwpck
⌛ [kubernetes] Waiting for NodePort 172.17.17.101:31784 (cilium-test/echo-other-node) to become ready...
⌛ [kubernetes] Waiting for NodePort 172.17.17.101:31656 (cilium-test/echo-same-node) to become ready...
⌛ [kubernetes] Waiting for NodePort 192.168.0.117:31784 (cilium-test/echo-other-node) to become ready...
⌛ [kubernetes] Waiting for NodePort 192.168.0.117:31656 (cilium-test/echo-same-node) to become ready...
⌛ [kubernetes] Waiting for NodePort 172.17.17.102:31784 (cilium-test/echo-other-node) to become ready...
⌛ [kubernetes] Waiting for NodePort 172.17.17.102:31656 (cilium-test/echo-same-node) to become ready...
⌛ [kubernetes] Waiting for DaemonSet cilium-test/host-netns-non-cilium to become ready...
⌛ [kubernetes] Waiting for DaemonSet cilium-test/host-netns to become ready...
ℹ️ Skipping IPCache check
🔭 Enabling Hubble telescope...
⚠️ Unable to contact Hubble Relay, disabling Hubble telescope and flow validation: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:4245: connect: connection refused"
ℹ️ Expose Relay locally with:
cilium hubble enable
cilium hubble port-forward&
ℹ️ Cilium version: 1.15.0
🏃 Running 64 tests ...
Create a per-node config that will instruct Cilium to “take over” CNI networking on the node. Initially, this will apply to no nodes; you will roll it out gradually via the migration process.
root@devmaster:~# cat <<EOF | kubectl apply --server-side -f -
apiVersion: cilium.io/v2alpha1
kind: CiliumNodeConfig
metadata:
  namespace: kube-system
  name: cilium-default
spec:
  nodeSelector:
    matchLabels:
      io.cilium.migration/cilium-default: "true"
  defaults:
    write-cni-conf-when-ready: /host/etc/cni/net.d/05-cilium.conflist
    custom-cni-conf: "false"
    cni-chaining-mode: "none"
    cni-exclusive: "true"
EOF
ciliumnodeconfig.cilium.io/cilium-default serverside-applied
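You can confirm the object was created; at this point it matches no nodes, since nothing carries the `io.cilium.migration/cilium-default` label yet:

```shell
kubectl get ciliumnodeconfig -n kube-system cilium-default
```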
Select a node to be migrated. It is not recommended to start with a control-plane node.
root@devmaster:~# NODE="devworker2.homecluster.store"
root@devmaster:~# kubectl cordon $NODE
node/devworker2.homecluster.store cordoned
root@devmaster:~# kubectl drain --ignore-daemonsets $NODE
node/devworker2.homecluster.store already cordoned
root@devmaster:~# kubectl get nodes
NAME STATUS ROLES AGE VERSION
devmaster.homecluster.store Ready control-plane 72d v1.26.0
devworker1.homecluster.store Ready <none> 72d v1.26.0
devworker2.homecluster.store Ready,SchedulingDisabled <none> 72d v1.26.0
Label the node. This causes the CiliumNodeConfig to apply to this node.
root@devmaster:~# kubectl label node $NODE --overwrite "io.cilium.migration/cilium-default=true"
node/devworker2.homecluster.store labeled
Restart Cilium. This will cause it to write its CNI configuration file.
root@devmaster:~# kubectl -n kube-system delete pod --field-selector spec.nodeName=$NODE -l k8s-app=cilium
pod "cilium-jvc7v" deleted
root@devmaster:~# kubectl -n kube-system rollout status ds/cilium -w
daemon set "cilium" successfully rolled out
Reboot the node
root@devmaster:~# ssh root@172.17.17.102
root@devworker2:~# init 6
Validate that the node has been successfully migrated.
root@devmaster:~# cilium status --wait
kubectl get -o wide node $NODE
kubectl -n kube-system run --attach --rm --restart=Never verify-network \
--overrides='{"spec": {"nodeName": "'$NODE'", "tolerations": [{"operator": "Exists"}]}}' \
--image ghcr.io/nicolaka/netshoot:v0.8 -- /bin/bash -c 'ip -br addr && curl -s -k https://$KUBERNETES_SERVICE_HOST/healthz && echo'
root@devmaster:~# kubectl uncordon $NODE
root@devmaster:~# kubectl get -o wide node $NODE
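It can also help to confirm on the node itself that Cilium has written its CNI configuration; the filename matches the `write-cni-conf-when-ready` path set in the CiliumNodeConfig earlier:

```shell
# On the migrated node: list the active CNI configuration files;
# 05-cilium.conflist should now be among them
ls /etc/cni/net.d
```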
Once you are satisfied everything has been migrated successfully, select another unmigrated node in the cluster and repeat these steps.
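The per-node steps above can be collected into a small helper. This is a sketch, not an official migration script; it assumes the CiliumNodeConfig and label from the steps above, and the reboot remains a manual, out-of-band step:

```shell
# Sketch: repeat the per-node migration steps for one node at a time.
migrate_node() {
  local node="$1"
  kubectl cordon "$node"
  kubectl drain --ignore-daemonsets "$node"
  kubectl label node "$node" --overwrite "io.cilium.migration/cilium-default=true"
  # Restart the Cilium agent on the node so it writes 05-cilium.conflist
  kubectl -n kube-system delete pod --field-selector spec.nodeName="$node" -l k8s-app=cilium
  kubectl -n kube-system rollout status ds/cilium -w
  # Reboot the node out-of-band here (e.g. ssh root@<node> init 6), verify
  # with "cilium status --wait", then put it back into service:
  kubectl uncordon "$node"
}

# Example (hypothetical node name):
# migrate_node devworker1.homecluster.store
```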
Delete the Calico CNI
root@devmaster:~# kubectl delete crd $(kubectl get crd | grep calico | awk '{print $1}')
root@devmaster:~# kubectl delete -n kube-system deployment calico-kube-controllers
root@devmaster:~# kubectl delete -n kube-system daemonset calico-node
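Deleting the Kubernetes resources does not remove Calico's CNI configuration files from each node's disk. A hedged cleanup sketch follows; the exact filenames (such as `10-calico.conflist`) vary by Calico version, so check the directory contents before deleting:

```shell
# Run on each node. CNI_DIR is the standard CNI config directory;
# override it if your distribution uses a different path.
CNI_DIR="${CNI_DIR:-/etc/cni/net.d}"
# Remove Calico's CNI config so kubelet cannot fall back to it after a reboot
rm -f "$CNI_DIR"/*calico*
```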
Verify that the Calico CNI has been deleted
root@devmaster:~# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
cilium-cwrxr 1/1 Running 0 15m
cilium-dr7r5 1/1 Running 0 5m39s
cilium-jwpck 1/1 Running 0 13m
cilium-operator-5b7fb7b87d-f2v97 1/1 Running 0 6m4s
coredns-787d4945fb-65hzf 1/1 Running 0 6m4s
coredns-787d4945fb-lfkw4 1/1 Running 0 38m
etcd-devmaster.homecluster.store 1/1 Running 4 (46d ago) 72d
kube-apiserver-devmaster.homecluster.store 1/1 Running 4 72d
kube-controller-manager-devmaster.homecluster.store 1/1 Running 13 (2d19h ago) 72d
kube-proxy-4gtbg 1/1 Running 2 (46d ago) 72d
kube-proxy-dh9z4 1/1 Running 2 (46d ago) 72d
kube-proxy-mtjjl 1/1 Running 3 (46d ago) 72d
kube-scheduler-devmaster.homecluster.store 1/1 Running 13 (2d19h ago) 72d
Reboot all the nodes and check node status
root@devmaster:~# kubectl get nodes
NAME STATUS ROLES AGE VERSION
devmaster.homecluster.store Ready control-plane 72d v1.26.0
devworker1.homecluster.store Ready <none> 72d v1.26.0
devworker2.homecluster.store Ready <none> 72d v1.26.0
root@devmaster:~# kubectl get pods -n metallb-system
NAME READY STATUS RESTARTS AGE
controller-586bfc6b59-pcq87 1/1 Running 8 (5m45s ago) 65d
speaker-2zwg4 1/1 Running 3 (3m7s ago) 72d
speaker-8n84l 1/1 Running 12 (6m36s ago) 72d
speaker-zdxv6 1/1 Running 3 (5m45s ago) 72d
Setting up Hubble Observability
root@devmaster# cilium hubble enable
root@devmaster# cilium status
Install the Hubble Client
root@devmaster:/etc/cni/net.d# HUBBLE_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/hubble/master/stable.txt)
HUBBLE_ARCH=amd64
if [ "$(uname -m)" = "aarch64" ]; then HUBBLE_ARCH=arm64; fi
curl -L --fail --remote-name-all https://github.com/cilium/hubble/releases/download/$HUBBLE_VERSION/hubble-linux-${HUBBLE_ARCH}.tar.gz{,.sha256sum}
sha256sum --check hubble-linux-${HUBBLE_ARCH}.tar.gz.sha256sum
sudo tar xzvfC hubble-linux-${HUBBLE_ARCH}.tar.gz /usr/local/bin
rm hubble-linux-${HUBBLE_ARCH}.tar.gz{,.sha256sum}
hubble-linux-amd64.tar.gz: OK
hubble
Validate Hubble API Access
root@devmaster# cilium hubble port-forward&
Now you can validate access to the Hubble API via the installed CLI:
root@devmaster:~# hubble status
Healthcheck (via localhost:4245): Ok
Current/Max Flows: 12,285/12,285 (100.00%)
Flows/s: 150.13
Connected Nodes: 3/3
Query the flow API and look for flows
root@devmaster:~# hubble observe
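`hubble observe` supports filters to narrow the flow stream down; a few commonly useful ones (the namespace is an example, adjust to your workloads):

```shell
hubble observe --verdict DROPPED        # only flows that were dropped
hubble observe --namespace kube-system  # flows involving one namespace
hubble observe --protocol dns           # DNS traffic only
hubble observe --follow                 # stream flows live, like tail -f
```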
Enable the Hubble UI
root@devmaster:~# cilium hubble enable --ui
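With the UI enabled, the Cilium CLI can port-forward to it and open it in your browser (by default it forwards to localhost:12000), giving you a live service map of the flows seen above:

```shell
cilium hubble ui
```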