By Rajesh Gheware

In today's rapidly evolving digital landscape, organizations are increasingly leveraging cloud technologies to drive innovation, scale operations, and reduce overheads. Amazon Elastic Kubernetes Service (EKS) on Amazon Web Services (AWS) has emerged as a pivotal solution for deploying, managing, and scaling containerized applications with Kubernetes. However, as a cloud environment grows, so does the complexity of managing its costs. Kubernetes autoscaling on AWS EKS offers a way to control cloud spend while keeping resources available for fluctuating workloads. This article covers practical autoscaling strategies, with examples targeting Kubernetes version 1.28, to help you minimize costs and maximize efficiency.

Understanding Kubernetes Autoscaling in AWS EKS

Kubernetes autoscaling on AWS EKS encompasses two key components: Horizontal Pod Autoscaler (HPA) and Cluster Autoscaler (CA). HPA adjusts the number of pods in a deployment or replica set based on observed CPU utilization or custom metrics. CA adjusts the number of nodes in an AWS EKS cluster: it adds nodes when pods cannot be scheduled for lack of capacity and removes nodes that stay underutilized.

Horizontal Pod Autoscaler (HPA)

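HPA reads resource metrics from the Metrics Server, or custom metrics from a custom metrics API, and changes the replica count of its target to keep the observed value near the target you set. CPU utilization is measured as a percentage of the CPU each pod requests, so the target pods must declare CPU requests. Here's a simple HPA example: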

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-application-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-application
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

This configuration automatically scales the my-application deployment between 1 and 10 replicas to maintain an average CPU utilization of 50%.
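
Because a utilization target is a percentage of the CPU each pod requests, the target Deployment needs CPU requests before this HPA can compute anything. The fragment below is a minimal sketch of what that might look like; the container name, image, and request sizes are illustrative assumptions, not values from a real application.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-application
spec:
  selector:
    matchLabels:
      app: my-application
  template:
    metadata:
      labels:
        app: my-application
    spec:
      containers:
      - name: app                                # hypothetical container name
        image: my-registry/my-application:1.0    # hypothetical image
        resources:
          requests:
            cpu: 250m        # HPA utilization is computed against this value
            memory: 256Mi
          limits:
            memory: 512Mi    # illustrative memory cap
```

With a 250m request and a 50% target, the HPA would start adding replicas once average usage per pod rises above roughly 125m.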

Cluster Autoscaler (CA)

Cluster Autoscaler adjusts the size of your AWS EKS cluster automatically. It increases the number of nodes during high load and decreases them during low usage, optimizing your cloud costs. Ensure your AWS EKS cluster IAM roles and policies are correctly configured for CA to function properly. Here’s how you can enable CA in your cluster:

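Allow the autoscaler to evict your application's pods when it drains a node during scale-down. Cluster Autoscaler reads this annotation from pods, so it must be set on the pod template rather than on the Deployment object itself. Changing the pod template triggers a rollout:

   kubectl patch deployment my-application --type merge -p '{"spec":{"template":{"metadata":{"annotations":{"cluster-autoscaler.kubernetes.io/safe-to-evict":"true"}}}}}'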

Deploy the Cluster Autoscaler:

   apiVersion: apps/v1
   kind: Deployment
   metadata:
     name: cluster-autoscaler
     namespace: kube-system
     labels:
       app: cluster-autoscaler
   spec:
     replicas: 1
     selector:
       matchLabels:
         app: cluster-autoscaler
     template:
       metadata:
         labels:
           app: cluster-autoscaler
       spec:
         serviceAccountName: cluster-autoscaler  # assumed to exist, with RBAC and the IAM role for CA attached
         containers:
         - image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.28.0
           name: cluster-autoscaler
           resources:
             limits:
               cpu: 100m
               memory: 300Mi
             requests:
               cpu: 100m
               memory: 300Mi
           command:
           - ./cluster-autoscaler
           - --v=4
           - --stderrthreshold=info
           - --cloud-provider=aws
           - --skip-nodes-with-local-storage=false
           - --expander=least-waste
           - --nodes=1:10:my-node-group

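Be sure to replace my-node-group with the name of the Auto Scaling group that backs your node group. For EKS managed node groups, this is the generated Auto Scaling group name rather than the node group name.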
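
As an alternative to listing each group with --nodes, Cluster Autoscaler on AWS can discover Auto Scaling groups by tag. The following is a sketch of the container arguments under that approach, assuming the groups carry the tags k8s.io/cluster-autoscaler/enabled and k8s.io/cluster-autoscaler/my-cluster, where my-cluster is a placeholder cluster name:

```yaml
# Replaces the command section of the Cluster Autoscaler container shown above.
command:
- ./cluster-autoscaler
- --v=4
- --stderrthreshold=info
- --cloud-provider=aws
- --skip-nodes-with-local-storage=false
- --expander=least-waste
# Discover every Auto Scaling group that carries both tags.
- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
# Keep similar node groups (for example, one per availability zone) at similar sizes.
- --balance-similar-node-groups
```

With auto-discovery, the minimum and maximum node counts are read from each Auto Scaling group's own settings, so there is one less place to keep in sync.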

Best Practices for Kubernetes Autoscaling

Monitor and Adjust Metrics: Regularly review the metrics used for autoscaling to ensure they accurately reflect your application's performance and user experience requirements.
Implement Resource Requests and Limits: Define resource requests and limits for your pods to ensure efficient autoscaling and prevent resource contention.
Use Multiple Metrics: When configuring HPA, consider using multiple metrics (CPU, memory, custom metrics) to make more informed scaling decisions.
Test Your Autoscaling: Regularly test autoscaling to ensure it reacts correctly under different load conditions.
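Tune Expanders and Scale-Down Settings: Cluster Autoscaler expanders such as least-waste, priority, and random decide which node group grows when several could fit the pending pods, and flags such as --scale-down-utilization-threshold and --scale-down-unneeded-time control how aggressively idle nodes are removed. Tune them to favor lower cost or less application disruption.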
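
To illustrate the multiple-metrics practice, here is a hedged variant of the earlier HPA that adds a memory target and a scale-down policy. The thresholds and the five-minute window are assumptions chosen for illustration; suitable values depend on your workload.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-application-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-application
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70   # illustrative threshold
  behavior:
    scaleDown:
      # Wait five minutes of consistently lower load before removing pods.
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 50            # remove at most half of the replicas...
        periodSeconds: 60    # ...per minute
```

When several metrics are listed, the HPA computes a desired replica count for each one and uses the largest, so adding a metric can only make scaling more conservative on the way down.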

Leveraging Karpenter for Efficient Autoscaling

Karpenter is an open-source autoscaling project designed for Kubernetes clusters, offering a more flexible and efficient alternative to the traditional Cluster Autoscaler. It works by rapidly launching right-sized instances based on application demands and efficiently packing pods onto nodes to reduce costs and improve performance. Integrating Karpenter with AWS EKS can further optimize your Kubernetes autoscaling strategy.

Key Features of Karpenter

Rapid Scaling: Karpenter reacts to unschedulable pods within seconds and launches instances directly through the EC2 API instead of resizing Auto Scaling groups, so new capacity typically becomes available faster than with node-group-based scaling.
Cost-Effective Resource Allocation: By optimizing the packing of pods onto nodes, Karpenter reduces waste and lowers costs by selecting the most cost-effective instances for the workload.
Flexible Scheduling: Unlike traditional autoscalers that rely on predefined node groups, Karpenter chooses instance types and sizes based on the requirements of the pods waiting to be scheduled, leading to better resource utilization.

Setting Up Karpenter on AWS EKS

To integrate Karpenter with your Kubernetes 1.28 cluster on AWS EKS, follow these steps:

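Install Karpenter on Your Cluster: First, install Karpenter in your cluster. This involves deploying the Karpenter controller and configuring the necessary IAM roles and policies for it to interact with AWS services. Use a Karpenter release that supports Kubernetes 1.28 and still serves the v1alpha5 Provisioner API used in the next step. The v0.31 line is assumed here, so check the Karpenter compatibility matrix before installing. Recent charts are published as an OCI artifact, so no helm repo add step is needed.

   helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
     --namespace karpenter --create-namespace \
     --version v0.31.0 \
     --set serviceAccount.create=true \
     --set settings.aws.clusterName=my-cluster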

Configure Provisioning: Next, define a Provisioner resource that tells Karpenter how to make decisions about node provisioning and scaling.

   apiVersion: karpenter.sh/v1alpha5
   kind: Provisioner
   metadata:
     name: default
   spec:
     requirements:
       - key: "kubernetes.io/arch"
         operator: In
         values: ["amd64"]
       - key: "topology.kubernetes.io/zone"
         operator: In
         values: ["us-west-2a", "us-west-2b", "us-west-2c"]
     ttlSecondsAfterEmpty: 300
     limits:
       resources:
         cpu: 1000
     provider:
       instanceProfile: KarpenterNodeInstanceProfile
       subnetSelector:
         kubernetes.io/cluster/my-cluster: "*"
       securityGroupSelector:
         kubernetes.io/cluster/my-cluster: "*"

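This configuration specifies the requirements for nodes that Karpenter will provision: the CPU architecture, the availability zones, and a limit of 1000 CPU cores in total across all nodes this Provisioner creates. It also sets ttlSecondsAfterEmpty to 300, so a node that has run no workload pods (DaemonSet pods aside) for five minutes is terminated, which avoids paying for idle capacity.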
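
One way to check that the Provisioner works is to deploy a workload that cannot fit on the existing nodes and watch Karpenter add capacity. The following is a minimal sketch, assuming a hypothetical Deployment named inflate that runs the pause image and requests one CPU per replica:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate   # hypothetical test workload
spec:
  replicas: 0     # start at zero; scale up manually to trigger provisioning
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      terminationGracePeriodSeconds: 0
      containers:
      - name: inflate
        image: registry.k8s.io/pause:3.9   # does nothing; only reserves resources
        resources:
          requests:
            cpu: "1"
```

Scaling it up, for example with kubectl scale deployment inflate --replicas=5, should leave pods pending until Karpenter launches nodes for them. Scaling back to 0 should let ttlSecondsAfterEmpty reclaim those nodes after they have been empty for 300 seconds.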

Best Practices for Using Karpenter

Define Clear Resource Requirements: To maximize the efficiency of Karpenter, clearly define the CPU and memory requirements for your pods. This ensures Karpenter can make informed decisions about the instances to provision.
Utilize Spot Instances: Take advantage of Karpenter's ability to provision spot instances for non-critical workloads to further reduce costs.
Monitor Your Workloads: Regularly monitor your workloads and Karpenter's decisions to ensure they align with your cost and performance objectives. Adjust your Provisioner specifications as needed.
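
The spot instance practice can be sketched as a second Provisioner that is restricted to spot capacity. The Provisioner name, the taint, and the CPU limit below are hypothetical choices for illustration; the provider block simply repeats the one from the default Provisioner.

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: spot-batch   # hypothetical name
spec:
  requirements:
    - key: "karpenter.sh/capacity-type"
      operator: In
      values: ["spot"]
  # Hypothetical taint so that only workloads that tolerate interruption land here.
  taints:
    - key: "workload-class"
      value: "interruptible"
      effect: NoSchedule
  ttlSecondsAfterEmpty: 300
  limits:
    resources:
      cpu: 200   # illustrative cap on total spot capacity
  provider:
    instanceProfile: KarpenterNodeInstanceProfile
    subnetSelector:
      kubernetes.io/cluster/my-cluster: "*"
    securityGroupSelector:
      kubernetes.io/cluster/my-cluster: "*"
```

Non-critical pods would opt in by adding a toleration for the workload-class taint, while everything else keeps running on nodes from the default Provisioner.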

By incorporating Karpenter into your AWS EKS autoscaling strategy, you can achieve more efficient, cost-effective, and responsive scaling of your Kubernetes applications. Karpenter's flexibility, speed, and cost-saving potential make it a valuable tool for optimizing your cloud infrastructure. As you continue to develop your Kubernetes expertise, exploring tools like Karpenter can help you maintain an edge in managing scalable, resilient, and cost-efficient cloud-native applications.

Conclusion

Optimizing cloud costs while ensuring application availability and performance is a critical concern for businesses leveraging Kubernetes on AWS EKS. Implementing Kubernetes autoscaling through HPA, together with CA or Karpenter, offers a robust strategy for dynamically adjusting resources in response to workload demands. By following the strategies outlined in this article and employing best practices, organizations can achieve cost-effective, efficient, and scalable cloud infrastructure.

As Kubernetes continues to evolve, staying updated on the latest features and practices is essential for maximizing the benefits of your cloud environment. Engage with the community, leverage the wealth of available resources, and continue to innovate within your Kubernetes deployments to stay ahead in the fast-paced world of cloud computing.


Optimize Your Cloud Costs: Strategies for Kubernetes Autoscaling
