Google Kubernetes Engine, CircleCI and Traefik for a full-fledged GitOps platform in the cloud - Part 1
Romain Vernoux
Posted on January 15, 2021
As teams start new projects, they usually waste precious time deploying and configuring a CI/CD pipeline from scratch. At Zenika Labs, our goal is to deliver proofs of concept or minimum viable products as efficiently as possible, without compromising on quality.
Our teams usually work on short-lived feature branches (lasting from a few minutes to a few hours), with a strong focus on technical and functional exploration and quick iterations with the Product Owner. We expect our infrastructure to be able to deploy a new version of the product in a few minutes, but also to dynamically deploy an instance for each active feature branch on each Git push. These production-like instances, accessible from anywhere (web) and at any time, are targeted by automated end-to-end tests, used by the Product Owner to try new features, and sometimes shown to end users to validate or invalidate new concepts and ideas.
We work on a large range of technical stacks and on very diverse products, from static websites to complex event-driven microservices architectures. We need to deploy both stateless and stateful workloads, some very light such as Node.js backends, others more compute or data-intensive such as Kafka clusters. Build processes also vary from trivial to very complex in the case of large microservice architectures in a monorepo.
In any case, our job is not to build or maintain infrastructures, but rather to deliver software. This is why we want to reuse most of our tooling across projects and need language, architecture and size-agnostic services. Moreover, the price of most managed services for CI/CD is so low compared to a developer's daily cost that there is no actual reason for us not to use them extensively and focus our precious time on more useful work.
This guide describes how to set up, in probably less than an hour, the infrastructure supporting the development workflow we use every day to build, test and deploy our projects. For all the reasons listed above and after a lot of investigation, we settled on Google's managed Kubernetes (GKE) as well as CircleCI, Traefik and other Google Cloud services.
What is Kubernetes?
Kubernetes, also known as k8s, is an open-source system for automating deployment, scaling, and management of containerized applications. In the past years, Kubernetes has become the de facto industry standard to deploy containers on-premises or in the cloud.
We use a shared, autoscaling Kubernetes cluster as an all-purpose (and now quite standard) deployment target. Each of our projects has its own namespace, with resource quotas and closed network boundaries.
If you have never used Kubernetes before, this guide will probably feel a bit too hard to follow. You may start by reading a bit about Kubernetes first.
What is CircleCI?
CircleCI is a cloud-native continuous integration and continuous delivery (CI/CD) platform. It integrates with GitHub and Bitbucket and runs a configured pipeline on each commit. Think Jenkins multibranch pipelines on steroids, in the cloud, and fully managed for you.
We chose CircleCI as a managed, modern and reliable alternative to Jenkins and prefer it over TravisCI or GitHub Actions for its best-in-class performance and ability to configure and run workflows on large polyglot monorepos requiring advanced caching mechanisms.
If you have never used CircleCI before, welcome aboard and enjoy the free plan!
What is Traefik?
Traefik Proxy is a dynamic, modern, open-source Edge Router that automatically inspects your infrastructure to discover services and how to route traffic to them. Traefik is natively compliant with every major cluster technology, such as Kubernetes, Docker, Docker Swarm, AWS, Mesos, Marathon... and even bare metal! Used as an ingress controller in Kubernetes, it is probably a drop-in replacement for the one you already use (if any), and brings awesome features such as automated TLS certificate management via Let's Encrypt, middlewares, plugins...
Traefik is the cornerstone of our platform, allowing new instances to be deployed and made accessible over HTTPS without any human intervention.
If you have never used Traefik before, welcome aboard and enjoy the ride (you will)!
The following guide contains some data and commands gathered from the following resources:
Read them in detail if you want to dive deep into how things work.
Architecture and outline
Our pipeline uses GitHub for version control, CircleCI for CI/CD, a Google Kubernetes Engine cluster as the deployment platform, Traefik as ingress controller and load balancer, and Let's Encrypt for fully automated and free TLS certificate management. The corresponding architecture is pictured below:
Every time a developer pushes a new feature branch to GitHub, the platform builds it and deploys it to an isolated, short-lived environment (App 1, ..., App n in the picture above). This new, separate instance of the app is made accessible on the web with a dedicated URL and a TLS certificate, allowing the team and our users to test it in a production-like environment. In this setup, master is just another instance of the app, which can act as an integration environment.
The following sections will describe how to assemble step by step the different components pictured above in order to reproduce our pipeline on your own environments in about an hour.
Requirements
To follow along, you will need:
- a Google Cloud account, with full access rights to create and configure a Kubernetes cluster
- a GitHub account, with ability to create and configure projects
- a CircleCI account linked to your GitHub account, with ability to configure and run builds
- a domain that you own (e.g., mywebsite.com), managed by one of the providers compatible with Traefik
The following command-line tools must be installed and configured on your computer:
- gcloud SDK
- kubectl
- htpasswd (sudo apt-get install apache2-utils)
- envsubst (sudo apt-get install gettext)
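Before going any further, it can save time to confirm that the four tools are actually on your PATH. The snippet below is a minimal sketch of such a check; the missing_tools helper is a name of our own and not part of any of these tools.

```shell
#!/bin/sh
# Print the name of every given command that cannot be found on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Check the four tools required by this guide.
missing=$(missing_tools gcloud kubectl htpasswd envsubst)
if [ -n "$missing" ]; then
  # Left unquoted on purpose so that the names are printed on a single line.
  echo "Missing tools:" $missing
else
  echo "All required tools are installed."
fi
```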
Log in to gcloud (this command should open a page in your browser asking for access to your Google account):
gcloud auth login
Finally, download the descriptors and resources from our GitHub repository.
1. Kubernetes cluster
Deploy a GKE cluster instance on GCP
This can easily be done through the Cloud Console. The following instructions do not assume a particular configuration or size for your cluster, except for the HTTP load balancing add-on, which must be enabled after creation.
Authenticate to the cluster:
gcloud container clusters get-credentials [CLUSTER_NAME]
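If no default zone is configured in gcloud, the command above needs to be told where the cluster lives. Here is a possible sketch for a zonal cluster, followed by a quick sanity check. The connect_cluster helper, the cluster name and the zone are placeholders of ours; a regional cluster would take --region instead of --zone.

```shell
# Fetch credentials for a zonal GKE cluster, then list its nodes as a sanity check.
# Usage: connect_cluster CLUSTER_NAME ZONE
connect_cluster() {
  gcloud container clusters get-credentials "$1" --zone "$2" &&
    kubectl get nodes
}
```

For example: connect_cluster my-cluster europe-west1-b. If the nodes are listed, kubectl is talking to the right cluster.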
2. Traefik
Create a "traefik" namespace in the cluster
kubectl create namespace traefik
Import Traefik's Custom Resource Definitions (CRDs) in the cluster
Apply the CRD descriptor:
kubectl apply -f crd.yaml
Apply the Role-Based Access Control (RBAC) rules required by Traefik
Apply the RBAC descriptor:
kubectl apply -f rbac.yaml
This will create the RBAC rules, create a service account for Traefik and bind the rules to the service account.
Deploy Traefik in the cluster
Create a secret file containing a user / password hash pair. These will be the credentials used to access Traefik's dashboard.
htpasswd -bc [FILENAME] [USER] [PASSWORD]
Import the secret into your cluster's traefik namespace:
kubectl create secret generic traefik-auth --from-file [FILENAME] --namespace=traefik
In order for Traefik to generate wildcard TLS certificates using Let's Encrypt, it must fulfill a DNS challenge. Since our domain is registered with Google Domains and our DNS is handled by Google Cloud DNS, we use Traefik's Google Cloud provider to do so (other providers are listed here). This provider requires the key of a GCP Service Account with DNS write access to edit DNS records. This service account and its key can be generated through the Cloud Console.
Note that if you opt for another provider, you will probably need to adapt or remove the volume, volumeMount and environment variables parts of the traefik descriptor in order to pass the correct configuration to Traefik.
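For the Google Cloud provider, the service account and its key can also be created from the command line instead of the Cloud Console. The following is a sketch under a few assumptions of ours: the create_dns_service_account helper and the traefik-dns account name are arbitrary, and roles/dns.admin is a broad predefined role that you may want to narrow down.

```shell
# Create a service account with DNS write access and download a key for it.
# Usage: create_dns_service_account PROJECT_ID KEY_FILE
# The account name "traefik-dns" is an arbitrary choice for this sketch.
create_dns_service_account() {
  project="$1"
  key_file="$2"
  account="traefik-dns@${project}.iam.gserviceaccount.com"

  # 1. Create the service account itself.
  gcloud iam service-accounts create traefik-dns \
    --project "$project" \
    --display-name "Traefik DNS challenge" || return 1

  # 2. Allow it to edit Cloud DNS records in the project.
  gcloud projects add-iam-policy-binding "$project" \
    --member "serviceAccount:${account}" \
    --role roles/dns.admin || return 1

  # 3. Generate a JSON key, to be imported as a Kubernetes secret.
  gcloud iam service-accounts keys create "$key_file" \
    --iam-account "$account"
}
```

For example: create_dns_service_account my-gcp-project traefik-service-account.json. The resulting file is the key file expected by the next step.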
Import the key file into your cluster's traefik namespace as a secret named traefik-service-account:
kubectl create secret generic traefik-service-account --from-file=traefik-service-account.json=[FILENAME] --namespace=traefik
Fill the following placeholders in the traefik descriptor:
- DOMAIN: the domain you own (e.g., mywebsite.com)
- ACME_EMAIL_ADDRESS: the contact email address to use to generate the TLS certificates
- GCE_PROJECT: the name of the Google Cloud project
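The placeholders can be filled by hand in an editor, or with a small script. The sketch below uses sed and ASSUMES that the placeholders are written ${DOMAIN}, ${ACME_EMAIL_ADDRESS} and ${GCE_PROJECT} in your copy of the descriptor; check the file and adapt the patterns if it uses another notation. The fill_placeholders helper is a name of our own.

```shell
# Replace ${DOMAIN}, ${ACME_EMAIL_ADDRESS} and ${GCE_PROJECT} in a descriptor
# and print the result on standard output (the input file is left untouched).
# ASSUMPTION: placeholders are written ${NAME}; adjust the patterns otherwise.
# The values must not contain "|" or "&", which are special to sed.
# Usage: fill_placeholders FILE DOMAIN ACME_EMAIL_ADDRESS GCE_PROJECT
fill_placeholders() {
  sed \
    -e "s|[$]{DOMAIN}|$2|g" \
    -e "s|[$]{ACME_EMAIL_ADDRESS}|$3|g" \
    -e "s|[$]{GCE_PROJECT}|$4|g" \
    "$1"
}
```

For example: fill_placeholders traefik.yaml mywebsite.com me@mywebsite.com my-gcp-project > traefik.filled.yaml, then review the output and move it over traefik.yaml. With this notation, envsubst (listed in the requirements) would achieve the same result from exported environment variables.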
Then apply it:
kubectl apply -f traefik.yaml
This will:
- instantiate a Traefik instance using a Deployment
- expose this Traefik instance on a public IP using a Service of type LoadBalancer
- configure Traefik's entrypoints to listen to ports 80 (HTTP) and 443 (HTTPS)
- redirect all HTTP (port 80) traffic to the HTTPS entrypoint (port 443) using a RedirectScheme middleware
- expose Traefik's dashboard on the traefik subdomain (e.g., traefik.mywebsite.com) using an IngressRoute, protected with a BasicAuth middleware (using the secret created above)
- configure a Traefik certificate resolver to generate wildcard certificates on demand
- create and use the wildcard TLS certificate (e.g., *.mywebsite.com) required by the dashboard IngressRoute
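To make the dashboard part of this list more concrete, here is roughly what a BasicAuth middleware and an IngressRoute look like with Traefik's Kubernetes CRDs. This is an illustrative sketch, not the content of our traefik.yaml: the resource names, the websecure entrypoint name and the letsencrypt resolver name are assumptions, and recent Traefik releases use the traefik.io/v1alpha1 API group instead of traefik.containo.us/v1alpha1. The script only writes a file and does not touch the cluster.

```shell
# Write an illustrative manifest; nothing is applied to the cluster here.
cat > dashboard-sketch.yaml <<'EOF'
# BasicAuth middleware backed by the "traefik-auth" secret created earlier.
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: dashboard-auth          # hypothetical name
  namespace: traefik
spec:
  basicAuth:
    secret: traefik-auth
---
# Route https://traefik.mywebsite.com to Traefik's internal dashboard service.
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: dashboard               # hypothetical name
  namespace: traefik
spec:
  entryPoints:
    - websecure                 # assumed name of the port 443 entrypoint
  routes:
    - match: Host(`traefik.mywebsite.com`)
      kind: Rule
      middlewares:
        - name: dashboard-auth
      services:
        - name: api@internal
          kind: TraefikService
  tls:
    certResolver: letsencrypt   # assumed name of the certificate resolver
    domains:
      - main: mywebsite.com
        sans:
          - "*.mywebsite.com"
EOF
```

The tls.domains section is what asks the certificate resolver for a wildcard certificate rather than one certificate per host.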
Wait a bit, then get the public IP assigned by GKE to the Traefik Service:
kubectl -n traefik get services
The IP will eventually be displayed in the "EXTERNAL-IP" column, but it may take a few seconds.
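Rather than re-running the command by hand, the wait can be scripted. The sketch below polls the Service until an IP shows up; the wait_for_external_ip helper is ours, and the Service name passed to it (traefik in the example) is an assumption to check against the output of the command above.

```shell
# Poll until the LoadBalancer Service has an external IP, then print it.
# Usage: wait_for_external_ip NAMESPACE SERVICE [MAX_ATTEMPTS]
wait_for_external_ip() {
  attempts="${3:-30}"
  while [ "$attempts" -gt 0 ]; do
    # The jsonpath expression yields an empty string while no IP is assigned.
    ip=$(kubectl -n "$1" get service "$2" \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null)
    if [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 5
  done
  echo "no external IP assigned yet" >&2
  return 1
}
```

For example: EXTERNAL_IP=$(wait_for_external_ip traefik traefik) stores the address in a variable for the DNS step.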
Configure your DNS records manually to direct all traffic for your domain to this IP (this is an A record from *.mywebsite.com to the external IP).
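If, like ours, your DNS is hosted on Google Cloud DNS, the record can also be created from the command line rather than through the console. A possible sketch follows; the create_wildcard_record helper is ours, the managed zone name is whatever you called your zone in Cloud DNS, and the TTL of 300 seconds is an arbitrary choice.

```shell
# Create a wildcard A record in a Cloud DNS managed zone.
# Usage: create_wildcard_record DOMAIN MANAGED_ZONE EXTERNAL_IP
create_wildcard_record() {
  # Cloud DNS expects a fully qualified record name, hence the trailing dot.
  gcloud dns record-sets create "*.$1." \
    --zone "$2" \
    --type A \
    --ttl 300 \
    --rrdatas "$3"
}
```

For example: create_wildcard_record mywebsite.com my-zone 203.0.113.10, replacing the IP with the one displayed in the EXTERNAL-IP column.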
Traefik's dashboard should now be accessible on the traefik subdomain (e.g., traefik.mywebsite.com) and all HTTP traffic should be redirected to HTTPS with valid Let's Encrypt certificates.
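Both points can be checked from a terminal once the DNS record has propagated. The sketch below, built around a check_redirect helper of our own, prints the status code and redirect target that Traefik returns for a plain HTTP request.

```shell
# Print the HTTP status code and redirect target returned for a plain HTTP request.
# Usage: check_redirect HOSTNAME
check_redirect() {
  curl -s -o /dev/null -w '%{http_code} %{redirect_url}\n' "http://$1/"
}
```

For example, check_redirect traefik.mywebsite.com should print a 3xx status code followed by a URL starting with https://. Requesting that HTTPS URL without credentials should then be rejected by the BasicAuth middleware, and curl should not complain about the certificate.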
End of part 1!
Stay tuned for part 2 and the deployment of your first app on your brand new CI/CD platform!