Kubernetes Part 2: A Guide to Application Deployment, Autoscaling, and Rollout Strategies
Salam Shaik
Posted on November 19, 2024
Hi everyone,
This is the second installment of the Kubernetes series. You can find the previous article at the link below:
Introduction to Kubernetes and AWS EKS — Part 1
Tasks we are going to do in this article:
Create a Docker image of a web app and upload it to AWS ECR
Pull the Docker image from AWS ECR and deploy it in an EKS cluster
Scale the application based on traffic
Look at different deployment strategies
Let’s dive into the article
Creating a docker image and uploading to AWS ECR:
I am using an open-source HTML/CSS website as the application to deploy in this article
You can download the application using this link codewithsadee/portfolio: Fully responsive personal portfolio website
This is a web application. We’re gonna deploy it in our Kubernetes cluster
To deploy it, we need to build a Docker image of the app
So clone the project from GitHub with git clone
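The clone step might look like the sketch below. The repository URL is an assumption inferred from the project name codewithsadee/portfolio on GitHub, and clone_app is a hypothetical helper name, so adjust both if your setup differs.

```shell
# Assumed GitHub URL for the codewithsadee/portfolio project (not stated in the article).
REPO_URL="https://github.com/codewithsadee/portfolio.git"

# Local directory name derived from the URL: last path segment without the .git suffix.
APP_DIR="$(basename "$REPO_URL" .git)"

# Hypothetical helper: clones the project into ./$APP_DIR.
clone_app() {
  git clone "$REPO_URL" "$APP_DIR"
}
```

Run clone_app and then cd into the resulting folder before creating the Dockerfile.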
Create a Dockerfile with the below code at the root directory of the application
FROM nginx:alpine
COPY . /usr/share/nginx/html
EXPOSE 80
Here I am using the nginx:alpine image as the base. We will serve the web application using the Nginx server
The second line copies every file from the current folder to the specified path, here /usr/share/nginx/html
This is the default document root of the Nginx server, which serves the files inside that folder on port 80
The third line documents that the container listens on port 80. EXPOSE does not publish the port by itself
Let’s build the Docker image and upload it to AWS ECR
Open the ECR service from the AWS console search bar
Click on the Create Repository button
Provide a name for the repository, keep the remaining fields as they are, and click on Create
Once the repo is created, open it and click the View push commands button in the top right corner. Follow the instructions one by one
After running all the commands, the image you uploaded appears in the repository’s image list
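For reference, the push commands shown in the console generally follow the pattern sketched below. The account ID, region, repository name, and tag are placeholder assumptions, and the function names are hypothetical helpers, so treat the console's own instructions as the source of truth.

```shell
# Placeholder values (assumptions): replace with your own account, region, and repository.
AWS_ACCOUNT_ID="123456789012"
AWS_REGION="us-east-1"
REPO_NAME="sample"
IMAGE_TAG="latest"

# Registry host and full image URI in the format ECR uses.
ecr_registry() { echo "${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com"; }
ecr_image_uri() { echo "$(ecr_registry)/${REPO_NAME}:${IMAGE_TAG}"; }

# Hypothetical helper: authenticate, build, tag, and push in one go.
push_to_ecr() {
  # Log Docker in to the registry with a short-lived token from the AWS CLI.
  aws ecr get-login-password --region "$AWS_REGION" \
    | docker login --username AWS --password-stdin "$(ecr_registry)" || return 1
  # Build from the Dockerfile in the current directory, then tag and push.
  docker build -t "${REPO_NAME}:${IMAGE_TAG}" . || return 1
  docker tag "${REPO_NAME}:${IMAGE_TAG}" "$(ecr_image_uri)" || return 1
  docker push "$(ecr_image_uri)"
}
```

Calling push_to_ecr from the project root requires the AWS CLI and Docker to be installed and configured. The value printed by ecr_image_uri is the one that goes into the image field of the deployment later.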
Deploying docker image to EKS cluster from ECR:
Create an EKS cluster and a node group
You can find how to create an EKS cluster and a node group in my previous article, linked at the start of this article
The only change I am making here is the EC2 instance type of the node group, from t3.medium to t3.small
I am keeping the desired and minimum size at 1 node and the maximum size at 2
Point your kubectl at the EKS cluster we created using the following command
aws eks --region <region-code> update-kubeconfig --name <cluster-name>
To check whether kubectl is correctly configured, use the following command
kubectl get nodes
The command returns the list of nodes in the cluster.
Installing the Metrics Server for Auto Scaling:
Metrics Server is a lightweight Kubernetes add-on that provides resource usage metrics such as CPU and memory
Use the below command to install the Metrics Server in the cluster
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
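One way to confirm the add-on is working before moving on is sketched below. It assumes the manifest installed a Deployment named metrics-server in the kube-system namespace, which is what the upstream components.yaml does. The function name is a hypothetical helper.

```shell
# Hypothetical helper: waits for Metrics Server, then prints node resource usage.
check_metrics_server() {
  # Block until the metrics-server Deployment reports ready, or give up after 2 minutes.
  kubectl rollout status deployment/metrics-server -n kube-system --timeout=120s || return 1
  # If metrics are flowing, this prints CPU and memory usage per node.
  kubectl top nodes
}
```

If kubectl top nodes returns an error right after installation, waiting a minute and retrying is usually enough, since the first metrics take a short while to be collected.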
Once the Metrics Server is ready, let’s deploy the application
Create a deployment.yaml file with the following code. This code will deploy our application in the cluster
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample
spec:
  replicas: 1 # Start with one replica
  selector:
    matchLabels:
      app: sample
  template:
    metadata:
      labels:
        app: sample
    spec:
      containers:
        - name: my-app-container
          image: <account-id>.dkr.ecr.us-east-1.amazonaws.com/sample:latest
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: "100m" # Minimum CPU required for a pod
            limits:
              cpu: "500m" # Maximum CPU allocated for a pod
Run the following command to deploy the application
kubectl apply -f deployment.yaml
Let’s create a LoadBalancer service to expose the application outside of the cluster
Create another file named service.yaml. It creates a load balancer that exposes port 80 to the outside and forwards incoming traffic to port 80 of the pod
apiVersion: v1
kind: Service
metadata:
  name: sample-service
spec:
  type: LoadBalancer
  selector:
    app: sample
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
Deploy the service with kubectl apply -f service.yaml. Once it is deployed, use the below command to see the list of services
kubectl get services
The output lists each service with its type, cluster IP, external address, and ports
Copy the EXTERNAL-IP value of the service we deployed (on AWS this is the load balancer’s DNS name) and open it in the browser. It should display the web page of the application we deployed
If it’s not loading, wait until the load balancer comes online and is in a working state
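Instead of refreshing the browser, the wait can be scripted. The sketch below reads the load balancer's DNS name from the service status and polls it with curl. The helper names, the 30-attempt limit, and the 10-second interval are arbitrary choices for illustration, and it assumes the address is reported in the hostname field, as it is on AWS.

```shell
# Prints the load balancer DNS name of a Service (empty until AWS has provisioned it).
service_address() {
  kubectl get service "$1" -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
}

# Polls the service over HTTP until it answers, giving up after 30 attempts.
wait_for_service() {
  attempts=0
  while [ "$attempts" -lt 30 ]; do
    host="$(service_address "$1")"
    if [ -n "$host" ] && curl -fsS -o /dev/null "http://$host"; then
      echo "http://$host is serving traffic"
      return 0
    fi
    attempts=$(( attempts + 1 ))
    sleep 10
  done
  echo "gave up waiting for $1" >&2
  return 1
}
```

For the service in this article, the call would be wait_for_service sample-service.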
Scaling the deployment:
Horizontal Pod AutoScaler(HPA):
It is a Kubernetes feature that adjusts the number of pod replicas based on metrics like CPU usage
It reads the metrics from the Metrics Server and adjusts the pod replicas based on them
Create a file named scale.yaml with the following code
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: sample-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
This deploys the autoscaler with our deployment as the target, a minimum of 1 replica, and a maximum of 10. It adds replicas when the average CPU utilization across the pods, measured against the CPU request (100m here), goes above 50%
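To get a feel for how the autoscaler picks a replica count, here is a simplified sketch of the documented formula, desired = ceil(current replicas x current utilization / target utilization), clamped to the min and max. The function name is hypothetical, and the sketch ignores the tolerance band and stabilization behavior that the real controller applies.

```shell
# Simplified HPA calculation. Arguments: current replicas, current utilization (%),
# target utilization (%), min replicas, max replicas.
hpa_desired_replicas() {
  current=$1; utilization=$2; target=$3; min=$4; max=$5
  # Integer ceiling of (current * utilization / target).
  desired=$(( (current * utilization + target - 1) / target ))
  # Clamp the result to the configured bounds.
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

# One pod at 150% of its CPU request with a 50% target: prints 3.
hpa_desired_replicas 1 150 50 1 10
```

With the settings from scale.yaml, a single pod averaging 150% of its 100m request would be scaled to 3 replicas, and 3 idle pods at 20% would be reduced to 2.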
Deploy this HPA using the following command
kubectl apply -f scale.yaml
Wait for a few minutes till it is deployed. After that, run the following command to check the HPA
kubectl get hpa
The output shows the HPA with its target deployment, current and target CPU utilization, minimum and maximum pods, and current replica count
Let’s hit the endpoint with multiple requests and see whether auto-scaling works
I am using JMeter to create the requests
I am sending 10k requests in 10 seconds to increase the CPU utilization so that auto-scaling triggers and creates more replicas
Use the following command to check the CPU utilization percentage while the requests are running
kubectl get hpa
At the peak, you can see that utilization went above the threshold of 50%
Let’s see how many pods are running using the kubectl get pods command
You can see that 3 pods are running to manage the traffic. Wait for at least five minutes, because the HPA waits for a 5-minute stabilization window by default before scaling down
After 5 minutes I ran the kubectl get pods command again to check the pod count
That’s it. Like this, we can manage scaling the pod replicas up and down to handle the incoming traffic
Deployment Strategies:
Kubernetes supports different ways of rolling out updates
Some of the methods are
Rolling update (default):
- It replaces the pods gradually for a smooth transition
Recreate:
- It terminates all the currently running pods and then creates the new updated pods
Canary deployment:
- It deploys the new version to a subset of users. Some of the users are redirected to the new deployment while the others stay on the old one until everything is stable
Blue-green deployment:
- It runs two complete environments, blue for the old version and green for the new. Once everything is verified in the green environment, traffic is switched from blue to green
Rolling update and Recreate are built into the Deployment resource. Canary and blue-green are patterns you build on top of it, for example with multiple Deployments and Services
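As a small preview of the default rolling update, the sketch below does two things. rollout_bounds works out how many pods can exist during a rollout for given maxSurge and maxUnavailable percentages (both default to 25%, with maxSurge rounded up and maxUnavailable rounded down). roll_out_new_image triggers a rollout for the sample deployment and its my-app-container container from deployment.yaml. Both function names are hypothetical helpers, and the image tag you pass in is up to you.

```shell
# Pod-count bounds during a rolling update.
# Arguments: replicas, maxSurge (%), maxUnavailable (%).
rollout_bounds() {
  replicas=$1; surge_pct=$2; unavailable_pct=$3
  surge=$(( (replicas * surge_pct + 99) / 100 ))       # rounded up
  unavailable=$(( replicas * unavailable_pct / 100 ))  # rounded down
  echo "max_pods=$(( replicas + surge )) min_available=$(( replicas - unavailable ))"
}

# Hypothetical helper: switch the deployment to a new image and follow the rollout.
# $1 is the full image URI of the new version.
roll_out_new_image() {
  kubectl set image deployment/sample my-app-container="$1" &&
    kubectl rollout status deployment/sample
}

# With 10 replicas and the defaults: prints max_pods=13 min_available=8.
rollout_bounds 10 25 25
```

If the new version misbehaves, kubectl rollout undo deployment/sample returns the deployment to the previous revision.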
I will try to practice these deployments in the upcoming article. That’s it for this article
Please let me know if there are any mistakes or suggestions to improve. I am open to suggestions
Thanks and have a good day
Note: Before leaving, delete the cluster, node group, and load balancer you created to avoid unnecessary charges