☸️ FinOps EKS: 10 Tips to Reduce the Bill up to 90% on AWS Managed Kubernetes Clusters
Benoit COUETIL 💫
Posted on April 20, 2021
- Initial thoughts
- 1. Install the Cluster Autoscaler
- 2. Define Spot Instances nodes in a Launch Template
- 3. Automatically switch off outside working hours
- 4. Reduce the number of zones in the region
- 5. Use the EC2s with the best performance / price ratio
- 6. Combine environments in the same cluster
- 7. Use Horizontal Pod Autoscaler
- 8. Use Vertical Pod Autoscaler
- 9. Use a single load balancer with an Ingress Controller
- 10. Maintain about 5-6 nodes on average
- Wrapping up
- Further reading
Initial thoughts
Cloud computing makes it possible to rationalize infrastructure costs. That being said, when you are getting started, the bill can climb quickly. EKS, the managed Kubernetes service from AWS, is no exception. Here are some tips to help you reduce your costs by up to 90%, without lowering the level of service.
The first element to take into consideration is human: avoid manual infrastructure modifications. So, first of all, use automation tools such as Terraform; otherwise you may lower your AWS bill, but the time spent will cost far more in salary. With Terraform, installing all of the infrastructure associated with EKS, including the VPC, takes approximately 15 minutes. Each change advocated in this article, when it concerns the infrastructure, then only takes a few minutes to apply.
Most of these generic aspects can be applied, at least in philosophy, to other Cloud providers.
1. Install the Cluster Autoscaler
One of the most obvious ways to ensure that you only provision the necessary resources is to install the Cluster Autoscaler.
It is essentially a pod running in the cluster that monitors the requested resources and provisions or deletes nodes (compute VMs) as needed.
The configuration has to be adjusted to the context; here is an example below. Note that nodes hosting system pods or pods with local storage are not protected from scale-down, which makes downscaling really efficient:
```yaml
extraArgs:
  # scale-down-utilization-threshold: 0.5 # default
  scan-interval: 30s # default 10s
  # scale-down-delay-after-add: 10 # default
  scale-down-unneeded-time: 20m # default 10m
  scale-down-unready-time: 5m # default 20m
  skip-nodes-with-local-storage: false
  skip-nodes-with-system-pods: false
```
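For context, these flags usually live in the values file of the cluster-autoscaler Helm chart, next to the cluster auto-discovery settings. A minimal sketch, assuming the public chart's `autoDiscovery` and `awsRegion` keys (the cluster name and region below are placeholders):

```yaml
# values.yaml sketch for the cluster-autoscaler Helm chart (keys may vary by chart version)
autoDiscovery:
  clusterName: my-eks-cluster # placeholder: your EKS cluster name
awsRegion: eu-west-3          # placeholder: your cluster region

# ...plus the extraArgs block shown above
```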
2. Define Spot Instances nodes in a Launch Template
Spot Instances are EC2 instances (AWS VMs) sold at a discount of up to 90%: AWS is selling off its unused capacity. But they are preemptible: you run the risk of losing the EC2 in question if AWS falls short of on-demand resources of the same type. For several years now, the price has no longer been an auction to the highest bidder: it changes only slightly, following long-term demand.
A Kubernetes cluster is ideal for this kind of preemptible resource: it is able to handle failures and self-repair by provisioning other VMs for its nodes. If your applications respect cloud patterns such as the 12-Factor App methodology, you can use Spot Instances safely.
The risk of losing your EC2s is low but real: over 4 months of using T3 instances on eu-west-1 (Ireland), 2 days of Spot unavailability for this type were observed in one zone. How can this problem be mitigated? By creating a pool of EC2s in the form of a Launch Template: you list a set of acceptable instance types, which AWS orders by price, and ask for capacity to be spread over the 2 cheapest pools. If an inexpensive type is unavailable, AWS automatically provisions a slightly more expensive one.
Example of a zonal worker_groups_launch_template under Terraform:
```hcl
worker_groups_launch_template = [
  {
    name    = "spot-az-a"
    subnets = [module.vpc.private_subnets[0]] # only one subnet to simplify PV usage

    on_demand_base_capacity = "0"
    # on_demand_percentage_above_base_capacity = 0 # if not set, all new nodes will be spot instances
    override_instance_types  = ["t3a.xlarge", "t3.xlarge", "t2.xlarge", "m4.xlarge", "m5.xlarge", "m5a.xlarge"]
    spot_allocation_strategy = "lowest-price"
    spot_instance_pools      = 2 # "Number of Spot pools per availability zone to allocate capacity. EC2 Auto Scaling selects the cheapest Spot pools and evenly allocates Spot capacity across the number of Spot pools that you specify."

    asg_desired_capacity = "1"
    asg_min_size         = "0"
    asg_max_size         = "10"

    key_name           = var.cluster_name
    kubelet_extra_args = "--node-labels=lifecycle=spot"
  },
]
```
3. Automatically switch off outside working hours
The production environment is probably used 24/7, but the development environment is usually only used during business hours, about a third of the time. Yet by default, unless there is heavy activity during the day, the cost of the cluster is roughly the same around the clock, because RAM stays reserved even without activity.
To drastically reduce the number of nodes outside working hours, simply install kube-downscaler in the cluster. The principle is simple: during the configured time windows, it scales deployments and statefulsets down to 0 pods, except in certain configurable namespaces. The drastic reduction in the number of pods then causes the Cluster Autoscaler to automatically delete the unused nodes.
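A minimal sketch, assuming the annotations documented by kube-downscaler (the namespace name and the time window are placeholders): every deployment and statefulset in the annotated namespace is scaled to 0 outside the given uptime window.

```yaml
# Sketch: keep workloads in this namespace running only during business hours
apiVersion: v1
kind: Namespace
metadata:
  name: dev # placeholder: a development namespace
  annotations:
    downscaler/uptime: "Mon-Fri 07:30-19:30 Europe/Paris"
```

Individual workloads can opt out with a `downscaler/exclude: "true"` annotation, and a global window can also be set through the downscaler's DEFAULT_UPTIME setting.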
Another advantage: since the nodes now have a lifespan of only a few hours, the disk space reserved per node can usually be reduced from 100 GB to 20 GB, bringing a slight additional saving.
4. Reduce the number of zones in the region
Inside AWS, network traffic within an availability zone is free, but traffic between zones (between data centers) is charged.
A region has 3 zones on average. In practice, using all of them is a surplus of high availability that is rarely necessary: if 2 zones are unavailable at the same time, there is a good chance that the problem is larger and that the third zone is unreachable as well…
For structures of modest size, it is therefore possible to reduce to 2 zones in development, or even in production, depending on the required level of high availability, with substantial savings on network transfers.
5. Use the EC2s with the best performance / price ratio
Some information about EC2s (AWS VMs):
- T2, T3, M4, M5, etc.: the number denotes the generation. Generation 5 instances are generally more efficient in benchmarks
- Tn instances use AWS Nitro technology, which is supposed to provide up to 60% better performance at equal specifications, but in practice the benchmarks are not that convincing
- Tn instances work with CPU credits: by default, sustained CPU consumption over a long period increases the cost. It is not possible to burst beyond 100% of the allocated CPU; it is prolonged usage, not peak usage, that is governed by credits
Here is a price comparison from March 2021 for eu-west-3 (Paris), on instance types that can each host 58 pods, showing the Spot Instance price and the discount compared to the on-demand price:

4 vCPU / 32 GB RAM

| Instance type | Spot price | On-demand price | Discount |
| ------------- | ---------- | --------------- | -------- |
| r5a.xlarge    | $0.07/h    | $0.27/h         | -74%     |
| r5ad.xlarge   | $0.07/h    | $0.31/h         | -77%     |
| r5d.xlarge    | $0.07/h    | $0.34/h         | -79%     |
| r5.xlarge     | $0.09/h    | $0.30/h         | -70%     |

8 vCPU / 32 GB RAM

| Instance type | Spot price | On-demand price | Discount |
| ------------- | ---------- | --------------- | -------- |
| t3a.2xlarge   | $0.10/h    | $0.34/h         | -71%     |
| t3.2xlarge    | $0.11/h    | $0.38/h         | -71%     |
| m5a.2xlarge   | $0.13/h    | $0.40/h         | -67%     |
| t2.2xlarge    | $0.13/h    | $0.42/h         | -69%     |
| m5.2xlarge    | $0.13/h    | $0.45/h         | -71%     |
| m5ad.2xlarge  | $0.13/h    | $0.48/h         | -73%     |
| m5d.2xlarge   | $0.13/h    | $0.53/h         | -75%     |

8 vCPU / 16 GB RAM

| Instance type | Spot price | On-demand price | Discount |
| ------------- | ---------- | --------------- | -------- |
| c5.2xlarge    | $0.12/h    | $0.40/h         | -70%     |
| c5d.2xlarge   | $0.12/h    | $0.46/h         | -74%     |
Some resulting observations:
- All these Spot Instances cost roughly the same, except the r5 family, which is cheaper
- CPU is the variable that significantly raises the bill
- If the CPU requirement is low (which is often the case in the development phase), it is better to use r5 xlarge instances
Note that if the infrastructure is provisioned by Terraform, changing the instance type is painless: applying the change does not remove the EC2s already in place; only newly created instances use the new type.
6. Combine environments in the same cluster
If your software delivery is based on a flow involving multiple branches, such as Gitflow, several levels of environment will be necessary: possibly one environment per feature branch, one environment for the develop branch, one environment per release branch, and a production environment represented by the master branch.
If delivery is more mature and organized as trunk-based development, you need at least a production environment and a staging environment (pre-production / acceptance / iso-production).
In either scenario, it is possible to organize yourself to have only two clusters (dev & prod), or three (dev, staging & prod).
Grouping several environments in the same cluster shares the monitoring tools while separating the applications into namespaces. On the data side (databases, message brokers), it is better to keep them separated between environments. For instance, we can provision a managed database outside the cluster for the most advanced environments (staging / production), and integrate it into the Kubernetes cluster for the ephemeral environments of feature branches. Helm charts are now available for most databases, easy to install and quick to instantiate.
A little warning though: this is an obvious source of Cloud savings, but if the centralization induces a daily waste of time, it is ultimately a false good idea; human time is much more expensive than machine time nowadays.
7. Use Horizontal Pod Autoscaler
The Horizontal Pod Autoscaler (HPA) is a standard Kubernetes object that automatically manages the number of (identical) pods of an application according to actual activity (CPU, RAM or custom metrics), between n and m pods.
It is therefore more economical to define an HPA between 1 and 10 pods, knowing that peak activity requires 10 pods, than to systematically set the number of replicas to 10. This drastically reduces over-reservation across environments, especially outside production.
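As an illustration, here is a minimal sketch of an HPA targeting a hypothetical backend deployment on CPU utilization (use autoscaling/v2beta2 on older Kubernetes versions):

```yaml
# Sketch: scale the hypothetical "backend" deployment between 1 and 10 pods
# based on average CPU utilization
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: backend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: backend
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
```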
8. Use Vertical Pod Autoscaler
In some cases, horizontal pod scalability is not an option, particularly for database statefulsets, search engines, message brokers or caches. Why not simply give the pods a low reservation and let them consume resources up to a much higher limit? Because this creates uncontrolled competition between the pods of a same node, and above all it never triggers any rescheduling of pods.
Rather than defining CPU / RAM requests corresponding to peak loads, it makes more sense to use the Vertical Pod Autoscaler. There is then no need to reserve large resources for each pod of our database: the reservations are increased and decreased according to actual activity. This drastically reduces over-reservation across environments, especially outside production.
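A minimal sketch of a VerticalPodAutoscaler object, assuming the VPA components are installed in the cluster (the statefulset name and the bounds are placeholders):

```yaml
# Sketch: let the VPA adjust the requests of a hypothetical "postgresql" statefulset
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: postgresql
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: postgresql
  updatePolicy:
    updateMode: "Auto" # pods are evicted and recreated with updated requests
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:
          cpu: 100m
          memory: 256Mi
        maxAllowed:
          cpu: "2"
          memory: 4Gi
```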
Read about this in the excellent article Vertical Pod Autoscaling: The Definitive Guide.
9. Use a single load balancer with an Ingress Controller
When creating a Kubernetes Service, its type can be ClusterIP, NodePort, LoadBalancer, or ExternalName. If it is of the LoadBalancer type, a dedicated load balancer is provisioned to handle the load balancing.
To avoid this expensive (and luxurious) provisioning for every service, it is better to define services as ClusterIP and expose them through ingresses, managed by an ingress controller like Traefik, created by a former Zenika employee, or the one based on NGINX. A single load balancer is then provisioned for all ingresses, and therefore for all Kubernetes services.
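As a sketch (the hostname and service name are placeholders, and the ingressClassName depends on the controller actually installed), a ClusterIP service exposed through an Ingress looks like this:

```yaml
# Sketch: one shared load balancer (the ingress controller's) serves all applications
apiVersion: v1
kind: Service
metadata:
  name: backend # placeholder application service
spec:
  type: ClusterIP # no dedicated load balancer for this service
  selector:
    app: backend
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: backend
spec:
  ingressClassName: nginx # or traefik, depending on the installed controller
  rules:
    - host: backend.example.com # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: backend
                port:
                  number: 80
```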
10. Maintain about 5-6 nodes on average
The more nodes there are in a cluster, the higher the availability, but the more system resources are consumed (incompressible overhead or daemonsets), and the greater the chance that no single node has enough free resources to schedule the next pod, even though the free resources scattered across nodes would, added together, be sufficient. The CPU is also less shared, and CPU is more prone to consumption peaks than memory.
The fewer nodes there are in a cluster, the less you run into the previous problems. But availability is lower, and an under-utilized node then represents a large percentage of waste.
From experience, the middle ground between high availability and resource efficiency is around 5-6 nodes. So with 12 nodes on xlarge EC2s, opt for 2xlarge EC2s, which will let the Cluster Autoscaler settle at 6 nodes, maybe 5 if the resource distribution across the smaller nodes was unfavorable.
Wrapping up
We have detailed 10 ways to shrink the AWS bill, without compromising on resiliency and availability, thanks mostly to various autoscaling mechanisms.
Convinced to use AWS for your managed Kubernetes cluster? Deploy your clusters in minutes with How to deploy a cost-efficient AWS/EKS Kubernetes cluster using Terraform in 2023.
Further reading
- ☸️ Kubernetes: A Convenient Variable Substitution Mechanism for Kustomize (Benoit COUETIL 💫 for Zenika, Aug 4)
- ☸️ Why Managed Kubernetes is a Viable Solution for Even Modest but Actively Developed Applications (Benoit COUETIL 💫 for Zenika, Jun 5)
- ☸️ Kubernetes: From Your Docker-Compose File to a Cluster with Kompose (Benoit COUETIL 💫 for Zenika, Mar 9)
- ☸️ Kubernetes: A Pragmatic Kubectl Aliases Collection (Benoit COUETIL 💫 for Zenika, Jan 6)
- ☸️ Web Application on Kubernetes: A Tutorial to Observability with the Elastic Stack (Benoit COUETIL 💫 for Zenika, Nov 27 '23)
- ☸️ Kubernetes NGINX Ingress Controller: 10+ Complementary Configurations for Web Applications (Benoit COUETIL 💫 for Zenika, Oct 16 '23)
- ☸️ Kubernetes: Awesome Maintained Links You Will Keep Using Next Year (Benoit COUETIL 💫 for Zenika, Sep 4 '23)
- ☸️ Managed Kubernetes: Our Dev is on AWS, Our Prod is on OVHCloud (Benoit COUETIL 💫 for Zenika, Jul 1 '23)