Simulating a Multi-Region CockroachDB cluster on Kubernetes with Minikube
Fabio Ghirardello
Posted on October 12, 2020
Minikube is a great platform to work on Kubernetes locally. It allows you to develop and test your apps on your own computer before you are ready to deploy them in a live cluster.
Following are instructions on how to create a 9-node CockroachDB cluster on Kubernetes, locally, using Minikube. This is very similar to the deployment we previously ran using Docker, so check that out, too.
Below is the high-level architecture. We create:

- 3 `Service` objects of type `NodePort`, one for each region, so we can simulate accessing a particular region.
- 1 `Service` object of type `ClusterIP`, to allow all Pods to communicate with each other.
- 9 `Pod` and 9 `PersistentVolumeClaim` objects:
| Pod name | region | zone |
|---|---|---|
| roach-seattle-1 | us-west2 | a |
| roach-seattle-2 | us-west2 | b |
| roach-seattle-3 | us-west2 | c |
| roach-newyork-1 | us-east4 | a |
| roach-newyork-2 | us-east4 | b |
| roach-newyork-3 | us-east4 | c |
| roach-london-1 | eu-west2 | a |
| roach-london-2 | eu-west2 | b |
| roach-london-3 | eu-west2 | c |
- 2 `Job` objects: one to init the cluster and one to run some SQL queries.
- 3 `NetworkChaos` objects, to simulate the latency between the regions.
Setup
Minikube resources
It's important to set up the Minikube VM with enough resources to run the cluster and the workload you'll be running on top of it. Everyone's environment is different, so this is for reference only.
On my laptop, I have allocated 8 CPUs and 24 GB of RAM to Minikube. Ensure you have a similar profile; the default 2 CPUs won't be sufficient to run the cluster flawlessly.
minikube config set cpus 8
minikube config set memory 24000
minikube config set driver hyperkit
minikube delete
Ensure the configuration is saved
$ minikube config view
- cpus: 8
- driver: hyperkit
- memory: 24000
All good, you can start Minikube with the new defaults
$ minikube start
😄  minikube v1.14.0 on Darwin 10.15.7
✨  Using the hyperkit driver based on user configuration
👍  Starting control plane node minikube in cluster minikube
🔥  Creating hyperkit VM (CPUs=8, Memory=24000MB, Disk=20000MB) ...
🐳  Preparing Kubernetes v1.19.2 on Docker 19.03.12 ...
🔎  Verifying Kubernetes components...
🌟  Enabled addons: storage-provisioner, default-storageclass
🏄  Done! kubectl is now configured to use "minikube" by default
Ensure Minikube and `kubectl` are configured properly
$ minikube status
minikube
type: Control Plane
host: Running
kubelet: Running
apiserver: Running
kubeconfig: Configured
$ kubectl cluster-info
Kubernetes master is running at https://192.168.64.3:8443
KubeDNS is running at https://192.168.64.3:8443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
From the above output, my Minikube cluster has address 192.168.64.3. This will be helpful later when we open the Admin UI and connect our SQL client to the CockroachDB cluster.
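If you need to look up this address again later, `minikube ip` prints it directly:

minikube ip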
Please note: the Kubernetes definition files are hosted on GitHub in my Gist repository. We will use this repo later when we apply them.
The first step is to create the NodePort Service objects, one for each region, plus a ClusterIP Service so that all nodes can communicate with each other.
Here is the Service object definition for region `us-west2`. Note the `nodePort` parameter, which we use to expose the port externally.
---
# us-west2
apiVersion: v1
kind: Service
metadata:
  name: us-west2
  labels:
    app: cockroachdb
spec:
  type: NodePort
  ports:
  # SQL client port
  - name: grpc
    port: 26257
    targetPort: 26257
    nodePort: 31257
  # Admin UI
  - name: http
    port: 8080
    targetPort: 8080
    nodePort: 31080
  selector:
    app: cockroachdb
    region: us-west2
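With this Service, the CockroachDB SQL port 26257 is exposed on port 31257 of the Minikube VM, and the Admin UI port 8080 on port 31080. Once the manifests are applied (we do that further down), you can confirm the mapping:

kubectl get service us-west2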
Below is the intra-node service. Note `clusterIP: None`: this makes it a headless Service, so DNS resolves directly to the individual Pod addresses, which is what allows node names like `roach-seattle-1.cockroachdb` to work.
---
# intra-node service
apiVersion: v1
kind: Service
metadata:
  name: cockroachdb
  labels:
    app: cockroachdb
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
    prometheus.io/scrape: "true"
    prometheus.io/path: "_status/vars"
    prometheus.io/port: "8080"
spec:
  ports:
  - port: 26257
    targetPort: 26257
    name: grpc
  - port: 8080
    targetPort: 8080
    name: http
  publishNotReadyAddresses: true
  clusterIP: None
  selector:
    app: cockroachdb
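As a quick check once the Pods are running, you can resolve a peer's headless-service name from inside any container. This is a sketch; it assumes `getent` is available in the cockroachdb/cockroach image:

kubectl exec -it roach-seattle-1 -- getent hosts roach-newyork-1.cockroachdb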
We then need the Pods and their PVC objects.
Below is the definition of Pod `roach-seattle-1` and its PVC object. Check the `hostname` and `subdomain` fields, and the `command` used to start up the object - specifically the `--locality` flag.
# roach-seattle-1
apiVersion: v1
kind: Pod
metadata:
  name: roach-seattle-1
  labels:
    app: cockroachdb
    region: us-west2
spec:
  hostname: roach-seattle-1
  subdomain: cockroachdb
  containers:
  - name: roach-seattle-1
    image: cockroachdb/cockroach:latest
    imagePullPolicy: IfNotPresent
    ports:
    - containerPort: 26257
      name: grpc
    - containerPort: 8080
      name: http
    livenessProbe:
      httpGet:
        path: "/health"
        port: http
      initialDelaySeconds: 30
      periodSeconds: 5
    readinessProbe:
      httpGet:
        path: "/health?ready=1"
        port: http
      initialDelaySeconds: 10
      periodSeconds: 5
      failureThreshold: 2
    volumeMounts:
    - name: datadir
      mountPath: /cockroach/cockroach-data
    env:
    - name: COCKROACH_CHANNEL
      value: kubernetes-insecure
    - name: GOMAXPROCS
      valueFrom:
        resourceFieldRef:
          resource: limits.cpu
          divisor: "1"
    - name: MEMORY_LIMIT_MIB
      valueFrom:
        resourceFieldRef:
          resource: limits.memory
          divisor: "1Mi"
    command:
      - "/bin/bash"
      - "-ecx"
      - exec
        /cockroach/cockroach
        start
        --logtostderr
        --insecure
        --advertise-host $(hostname -f)
        --http-addr 0.0.0.0
        --join roach-seattle-1.cockroachdb,roach-newyork-1.cockroachdb,roach-london-1.cockroachdb
        --cache $(expr $MEMORY_LIMIT_MIB / 4)MiB
        --max-sql-memory $(expr $MEMORY_LIMIT_MIB / 4)MiB
        --locality=region=us-west2,zone=a
  terminationGracePeriodSeconds: 60
  volumes:
  - name: datadir
    persistentVolumeClaim:
      claimName: roach-seattle-1-data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: roach-seattle-1-data
  labels:
    app: cockroachdb
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  storageClassName: standard
  resources:
    requests:
      storage: 1Gi
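The remaining eight Pod/PVC pairs follow exactly the same template; per the table above, only the object names, the `region` label, and the `--locality` flag change. As a sketch, for `roach-newyork-1` the differing lines are:

# roach-newyork-1: same template, only identity and locality differ
metadata:
  name: roach-newyork-1
  labels:
    app: cockroachdb
    region: us-east4
...
      --locality=region=us-east4,zone=a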
Finally, we have the Job objects.
The first Job object initializes the cluster:
---
apiVersion: batch/v1
kind: Job
metadata:
  name: cluster-init
  labels:
    app: cockroachdb
spec:
  template:
    spec:
      containers:
      - name: cluster-init
        image: cockroachdb/cockroach:latest
        imagePullPolicy: IfNotPresent
        command:
          - "/cockroach/cockroach"
          - "init"
          - "--insecure"
          - "--host=roach-seattle-1.cockroachdb"
      restartPolicy: OnFailure
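The Job simply wraps the standard init command; run by hand, once the Pods are up, it would look like this:

kubectl exec -it roach-seattle-1 -- /cockroach/cockroach init --insecure --host=roach-seattle-1.cockroachdb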
The second Job object runs the SQL statements to set the geo coordinates of the regions, so the Map view knows where to place them (provided you have an Enterprise license).
---
apiVersion: batch/v1
kind: Job
metadata:
  name: cluster-sql-init
  labels:
    app: cockroachdb
spec:
  template:
    spec:
      containers:
      - name: cluster-sql-init
        image: cockroachdb/cockroach:latest
        imagePullPolicy: IfNotPresent
        command:
          - "/cockroach/cockroach"
          - "sql"
          - "--insecure"
          - "--url"
          - "postgresql://roach-seattle-1.cockroachdb:26257/defaultdb?sslmode=disable"
          - "-e"
          - "UPSERT into system.locations VALUES ('region', 'us-east4', 37.478397, -76.453077), ('region', 'us-west2', 43.804133, -120.554201), ('region', 'eu-west2', 51.5073509, -0.1277583);"
      restartPolicy: OnFailure
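After this Job completes, you can confirm the coordinates landed in system.locations:

kubectl exec -it roach-seattle-1 -- /cockroach/cockroach sql --insecure -e "SELECT * FROM system.locations;"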
Now that we understand the required objects, apply the full YAML definition file to create the cluster:
kubectl apply -f https://gist.githubusercontent.com/fabiog1901/fc09e6fd98d0419b4528ca1c9553d478/raw/crdb-k8s-cluster.yaml
After a few minutes, the cluster should be up and running. Check that the Pods are ready and in the Running state:
$ kubectl get all --show-labels
NAME READY STATUS RESTARTS AGE LABELS
pod/cluster-init-665lj 0/1 Completed 0 118m controller-uid=ca1a2d5d-38db-48b6-834d-a5ffdbcb9ef8,job-name=cluster-init
pod/cluster-sql-init-7s69s 0/1 Completed 2 118m controller-uid=960f70fb-710d-4da5-89b5-b7af33cf913f,job-name=cluster-sql-init
pod/roach-london-1 1/1 Running 0 118m app=cockroachdb,region=eu-west2
pod/roach-london-2 1/1 Running 0 118m app=cockroachdb,region=eu-west2
pod/roach-london-3 1/1 Running 0 118m app=cockroachdb,region=eu-west2
pod/roach-newyork-1 1/1 Running 0 118m app=cockroachdb,region=us-east4
pod/roach-newyork-2 1/1 Running 0 118m app=cockroachdb,region=us-east4
pod/roach-newyork-3 1/1 Running 0 118m app=cockroachdb,region=us-east4
pod/roach-seattle-1 1/1 Running 0 118m app=cockroachdb,region=us-west2
pod/roach-seattle-2 1/1 Running 0 118m app=cockroachdb,region=us-west2
pod/roach-seattle-3 1/1 Running 0 118m app=cockroachdb,region=us-west2
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE LABELS
service/cockroachdb ClusterIP None <none> 26257/TCP,8080/TCP 118m app=cockroachdb
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 38d component=apiserver,provider=kubernetes
service/eu-west2 NodePort 10.103.129.120 <none> 26257:31259/TCP,8080:31280/TCP 118m app=cockroachdb
service/us-east4 NodePort 10.97.225.172 <none> 26257:31258/TCP,8080:31180/TCP 118m app=cockroachdb
service/us-west2 NodePort 10.109.153.233 <none> 26257:31257/TCP,8080:31080/TCP 118m app=cockroachdb
NAME COMPLETIONS DURATION AGE LABELS
job.batch/cluster-init 1/1 32s 118m app=cockroachdb
job.batch/cluster-sql-init 1/1 47s 118m app=cockroachdb
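If anything looks off, the Job logs are a good first stop. Note in the output above that the cluster-sql-init Pod restarted twice; that's expected, as the Job retries until the cluster has finished initializing.

kubectl logs job/cluster-init
kubectl logs job/cluster-sql-init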
Latency Simulation
Next, we simulate the latency between the regions. We set the inter-region latencies as follows:
- us-west2 <=> us-east4: 60ms
- us-east4 <=> eu-west2: 120ms
- us-west2 <=> eu-west2: 180ms
We will use a special object for this: a `NetworkChaos` object, which is provided by Chaos Mesh and needs to be installed first. To install Chaos Mesh locally on Minikube, check these instructions.
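As a rough sketch (versions and flags change, so defer to the Chaos Mesh documentation), the install at the time of writing looked something like this:

curl -sSL https://mirrors.chaos-mesh.org/v1.0.0/install.sh | bash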
Below is the definition of the NetworkChaos object that sets the latency between the `us-west2` and `us-east4` nodes.
---
apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: delay-uswest-useast
  labels:
    app: cockroachdb
spec:
  action: delay # chaos action
  mode: all
  selector:
    pods:
      default: # namespace of the target pods
        - roach-seattle-1
        - roach-seattle-2
        - roach-seattle-3
  delay:
    latency: "60ms"
  direction: to
  target:
    selector:
      pods:
        default: # namespace of the target pods
          - roach-newyork-1
          - roach-newyork-2
          - roach-newyork-3
    mode: all
Create the NetworkChaos objects
kubectl apply -f https://gist.githubusercontent.com/fabiog1901/fc09e6fd98d0419b4528ca1c9553d478/raw/chaos.yaml
Confirm the NetworkChaos objects are up
$ kubectl get networkchaos.chaos-mesh.org
NAME AGE
delay-useast-euwest 78s
delay-uswest-euwest 78s
delay-uswest-useast 78s
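Before moving on, you can spot-check the injected delay directly. This is a sketch; it assumes `ping` is available in the cockroach image - if it isn't, the SQL round-trip test further down serves the same purpose. Between Seattle and New York you should see round-trip times of roughly 60ms:

kubectl exec -it roach-seattle-1 -- ping -c 3 roach-newyork-1.cockroachdb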
At this point, retrieve from Minikube the URLs of the Services you created.
$ minikube service us-west2 --url
http://192.168.64.3:31257
http://192.168.64.3:31080
$ minikube service us-east4 --url
http://192.168.64.3:31258
http://192.168.64.3:31180
$ minikube service eu-west2 --url
http://192.168.64.3:31259
http://192.168.64.3:31280
As expected, each service returns 2 URLs: one for SQL client access and one for the Admin UI.
In your browser, open the Admin UI using the URL with port 31080, 31180 or 31280. In my case, this is http://192.168.64.3:31080. Check the Network Latency page and confirm you see latency between regions, but not within a region.
Very good, as expected!
Open a SQL shell. You can download the `cockroach` binary, which includes a built-in SQL client, or, thanks to CockroachDB's compliance with the PostgreSQL wire protocol, you can use the `psql` client.
# ----------------------------
# ports mapping:
# 31257: us-west2
# 31258: us-east4
# 31259: eu-west2
# ----------------------------
# use cockroach sql
cockroach sql --insecure --url "postgresql://192.168.64.3:31257/defaultdb?sslmode=disable"
# or use psql
psql -h 192.168.64.3 -p 31257 -U root defaultdb
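To quickly verify connectivity without opening an interactive shell, you can pass a query directly; for example, with psql:

psql -h 192.168.64.3 -p 31257 -U root -d defaultdb -c "SELECT version();"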
You will require an Enterprise license to unlock some of the features described below, like the Map view. You can request a Trial license or, alternatively, just skip the license registration - the deployment will still succeed.
Enter the license registration, if you have one
SET CLUSTER SETTING cluster.organization = 'ABC Corp';
SET CLUSTER SETTING enterprise.license = 'xxxx=yyyy-zzzz';
Confirm you are in the correct region as per port mappings:
SHOW LOCALITY;
locality
--------------------------
region=us-west2,zone=b
(1 row)
Time: 759µs
Verify you can see the latency between nodes. In this example, we exec into a node in region `eu-west2` and, from within that container, open a SQL connection to one of the Seattle nodes. From there, we issue a simple query and observe the latency.
Connect to the node in London
kubectl exec -it roach-london-1 -- bash
Once in the London container, connect to the CockroachDB cluster at the Seattle node.
cockroach sql --insecure --url "postgresql://roach-seattle-1.cockroachdb:26257/defaultdb?sslmode=disable"
At the SQL prompt, ask the Seattle node to report its locality and check the `Time` it takes to execute. The query is trivial, so the time essentially reflects the network round trip.
SHOW LOCALITY;
locality
--------------------------
region=us-west2,zone=a
(1 row)
Time: 181.064515ms
As expected, we get ~180ms latency, which is what we set up using the `NetworkChaos` objects. You can run the same exercise for the other nodes, too.
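For instance, still from the London container, the same test against a New York node should report roughly 120ms:

cockroach sql --insecure --url "postgresql://roach-newyork-1.cockroachdb:26257/defaultdb?sslmode=disable" -e "SHOW LOCALITY;"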
Congratulations! The cluster is ready for your development work!
Clean up
To delete the deployment, simply delete all the objects created from the definition files.
kubectl delete -f https://gist.githubusercontent.com/fabiog1901/fc09e6fd98d0419b4528ca1c9553d478/raw/crdb-k8s-cluster.yaml
kubectl delete -f https://gist.githubusercontent.com/fabiog1901/fc09e6fd98d0419b4528ca1c9553d478/raw/chaos.yaml
To uninstall the NetworkChaos tooling (Chaos Mesh), follow these instructions.