AWS Cost Optimization: Periodic Deletion of ECR Container Images

siddhantkcode

Siddhant Khare

Posted on June 27, 2024

TL;DR

Automated periodic deletion of ECR container images is a straightforward and effective way to optimize AWS costs. By leveraging Lambda functions and Step Functions, you can implement custom policies that meet your specific needs, ensuring that only necessary images are retained.


Introduction

Managing AWS costs can be challenging, especially with the increasing use of Elastic Container Registry (ECR) for storing container images. I've found that one effective way to cut costs is by periodically deleting unnecessary ECR container images. In this guide, I'll walk you through the steps to set up an automated cleanup process using Go.

Why Optimize ECR Storage?

ECR is a great tool for storing Docker container images, but as your CI/CD pipelines push more images, storage costs can quickly add up. Without regular cleanup, these costs can become significant. By implementing a strategy to automatically delete old or unused images, you can save money and keep your storage lean.

Using ECR Lifecycle Policies

ECR lifecycle policies are a built-in way to manage image cleanup. They allow you to set rules for automatically deleting images based on criteria such as age or tag. However, lifecycle policies have limitations, especially when you need to combine multiple conditions.
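For reference, a single-condition rule is just a small JSON document. The following is a minimal sketch that builds one in Go using only the standard library. The Go type names, the 30-day threshold, and the description text are illustrative assumptions; the JSON field names follow the ECR lifecycle policy format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Selection describes which images a rule matches.
type Selection struct {
	TagStatus   string `json:"tagStatus"`
	CountType   string `json:"countType"`
	CountUnit   string `json:"countUnit,omitempty"`
	CountNumber int    `json:"countNumber"`
}

// Action is what happens to matched images; "expire" deletes them.
type Action struct {
	Type string `json:"type"`
}

// Rule is one entry in a lifecycle policy.
type Rule struct {
	RulePriority int       `json:"rulePriority"`
	Description  string    `json:"description,omitempty"`
	Selection    Selection `json:"selection"`
	Action       Action    `json:"action"`
}

// LifecyclePolicy is the top-level policy document.
type LifecyclePolicy struct {
	Rules []Rule `json:"rules"`
}

// expireOlderThan builds a one-rule policy that expires images
// pushed more than the given number of days ago.
func expireOlderThan(days int) LifecyclePolicy {
	return LifecyclePolicy{Rules: []Rule{{
		RulePriority: 1,
		Description:  fmt.Sprintf("Expire images older than %d days", days),
		Selection: Selection{
			TagStatus:   "any",
			CountType:   "sinceImagePushed",
			CountUnit:   "days",
			CountNumber: days,
		},
		Action: Action{Type: "expire"},
	}}}
}

func main() {
	out, err := json.MarshalIndent(expireOlderThan(30), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The printed JSON can be supplied as the policy text when attaching a lifecycle policy to a repository, for example with `aws ecr put-lifecycle-policy --lifecycle-policy-text`.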

Challenges with ECR Lifecycle Policies

While ECR lifecycle policies provide a good starting point, they have limitations:

  1. Single Condition Policies: ECR lifecycle policies are designed to handle single-condition rules easily. For example, you can delete images older than a specific number of days or keep only the most recent N images. However, they struggle when you need to combine multiple conditions, such as "delete images older than X days and not among the latest N images."

  2. AND Conditions: The inability to use AND conditions in lifecycle policies means you can't create complex rules directly. For example, if you want to delete images that are older than 30 days and not part of the latest 10 images, you can't do this with a single lifecycle policy. You need a more sophisticated solution to handle such cases.

  3. Granular Control: Lifecycle policies provide limited control over the exact criteria used for image deletion. If your requirements are specific, such as retaining images based on custom tags or metadata, lifecycle policies may not suffice.

  4. Global vs. Repository-Specific Rules: Defining rules that apply globally to all repositories can be challenging. Lifecycle policies need to be set up for each repository individually, which can become cumbersome in environments with many repositories.
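To make the AND limitation concrete, here is a minimal Go sketch that compares the two decision rules for a single image. The `Image` type, the function names, and the thresholds are hypothetical illustrations and not part of any AWS API. `expiredByEitherRule` models the effect of two separate lifecycle rules (one on age, one on count), assuming an image matched by either rule expires; `expiredByBothConditions` models the rule we actually want.

```go
package main

import "fmt"

// Image is a hypothetical, simplified view of one container image.
type Image struct {
	AgeDays int // days since the image was pushed
	Rank    int // 1 = most recently pushed image in the repository
}

const (
	maxAgeDays  = 30 // "older than X days"
	keepNewestN = 10 // "latest N images"
)

// expiredByEitherRule: the image goes if it is old OR outside the newest N.
func expiredByEitherRule(img Image) bool {
	return img.AgeDays > maxAgeDays || img.Rank > keepNewestN
}

// expiredByBothConditions: the image goes only if it is old AND outside the newest N.
func expiredByBothConditions(img Image) bool {
	return img.AgeDays > maxAgeDays && img.Rank > keepNewestN
}

func main() {
	samples := []Image{
		{AgeDays: 90, Rank: 25}, // old and outside the newest N
		{AgeDays: 90, Rank: 3},  // old but still among the newest N
		{AgeDays: 5, Rank: 25},  // recent but outside the newest N
		{AgeDays: 5, Rank: 3},   // recent and among the newest N
	}
	for _, img := range samples {
		fmt.Printf("age=%d rank=%d either=%t both=%t\n",
			img.AgeDays, img.Rank, expiredByEitherRule(img), expiredByBothConditions(img))
	}
}
```

The two functions disagree on the middle two samples: an old image that is still among the newest N, and a recent image in a busy repository. Those are exactly the images the AND rule is meant to protect.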

Custom Cleanup Solution

To overcome the limitations of lifecycle policies, we can use AWS Lambda functions and Step Functions to create a custom cleanup process. This approach offers more flexibility and control over which images get deleted.

Workflow Overview

Our custom solution involves the following steps:

  1. GetContainerRepositories Lambda Function: Retrieves a list of all ECR repositories in your AWS account.
  2. DeleteExpiredContainerImages-Map State: Processes each repository's image list.
  3. DeleteExpiredContainerImages Lambda Function: Evaluates and deletes images based on specified criteria.

Here's a visual representation of the workflow:

[Figure: SFN State Machine]
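In text form, the three steps could map onto a state machine definition like the one generated below. This is a sketch under stated assumptions rather than the definition used in production: the ARNs are placeholders, the `MaxConcurrency` value is arbitrary, and `ItemsPath` assumes the first Lambda returns its list under a `repositories` key. The Go helper `buildDefinition` is hypothetical; it only assembles the Amazon States Language JSON with the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Placeholder ARNs: replace with the ARNs of your own Lambda functions.
const (
	getRepositoriesArn = "arn:aws:lambda:REGION:ACCOUNT_ID:function:GetContainerRepositories"
	deleteExpiredArn   = "arn:aws:lambda:REGION:ACCOUNT_ID:function:DeleteExpiredContainerImages"
)

// obj is shorthand for a JSON object.
type obj = map[string]interface{}

// buildDefinition assembles the state machine: one Task that lists
// repositories, then a Map state that runs the delete Task per repository.
func buildDefinition() obj {
	return obj{
		"Comment": "Periodic cleanup of ECR container images",
		"StartAt": "GetContainerRepositories",
		"States": obj{
			"GetContainerRepositories": obj{
				"Type":     "Task",
				"Resource": getRepositoriesArn,
				"Next":     "DeleteExpiredContainerImages-Map",
			},
			"DeleteExpiredContainerImages-Map": obj{
				"Type":           "Map",
				"ItemsPath":      "$.repositories", // assumed output field of the first Lambda
				"MaxConcurrency": 5,                // arbitrary; limits parallel invocations
				"ItemProcessor": obj{
					"StartAt": "DeleteExpiredContainerImages",
					"States": obj{
						"DeleteExpiredContainerImages": obj{
							"Type":     "Task",
							"Resource": deleteExpiredArn,
							"End":      true,
						},
					},
				},
				"End": true,
			},
		},
	}
}

func main() {
	out, err := json.MarshalIndent(buildDefinition(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Each element of the array selected by `ItemsPath` becomes the input of one `DeleteExpiredContainerImages` invocation, so each element needs to carry both the repository name and its image list.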

Implementation Details

Let's dive into the implementation of each step using Go.

  1. GetContainerRepositories: This Lambda function fetches a list of all ECR repositories and returns their details as JSON.


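package main

import (
    "context"
    "time"

    "github.com/aws/aws-lambda-go/lambda"
    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/ecr"
)

// ImageDetail is the subset of ECR image metadata the cleanup step needs.
// time.Time marshals to RFC 3339, which the next Lambda can unmarshal directly.
type ImageDetail struct {
    ImageDigest   string    `json:"imageDigest"`
    ImagePushedAt time.Time `json:"imagePushedAt"`
}

// Repository pairs a repository name with its images (one Map state item).
type Repository struct {
    RepositoryName string        `json:"repositoryName"`
    Images         []ImageDetail `json:"images"`
}

type Response struct {
    Repositories []Repository `json:"repositories"`
}

// getRepositoryNames pages through every ECR repository in the account and region.
func getRepositoryNames(ctx context.Context, svc *ecr.ECR) ([]string, error) {
    var names []string
    err := svc.DescribeRepositoriesPagesWithContext(ctx, &ecr.DescribeRepositoriesInput{},
        func(page *ecr.DescribeRepositoriesOutput, lastPage bool) bool {
            for _, repo := range page.Repositories {
                names = append(names, aws.StringValue(repo.RepositoryName))
            }
            return !lastPage
        })
    return names, err
}

// getImages pages through all images in one repository.
func getImages(ctx context.Context, svc *ecr.ECR, repositoryName string) ([]ImageDetail, error) {
    images := []ImageDetail{} // non-nil so an empty repository serializes as [] rather than null
    input := &ecr.DescribeImagesInput{RepositoryName: aws.String(repositoryName)}
    err := svc.DescribeImagesPagesWithContext(ctx, input,
        func(page *ecr.DescribeImagesOutput, lastPage bool) bool {
            for _, image := range page.ImageDetails {
                images = append(images, ImageDetail{
                    ImageDigest:   aws.StringValue(image.ImageDigest),
                    ImagePushedAt: aws.TimeValue(image.ImagePushedAt),
                })
            }
            return !lastPage
        })
    return images, err
}

func handleRequest(ctx context.Context) (Response, error) {
    // session.Must panics if the AWS configuration cannot be loaded.
    svc := ecr.New(session.Must(session.NewSession()))

    names, err := getRepositoryNames(ctx, svc)
    if err != nil {
        return Response{}, err
    }

    resp := Response{Repositories: []Repository{}}
    for _, name := range names {
        images, err := getImages(ctx, svc, name)
        if err != nil {
            return Response{}, err
        }
        resp.Repositories = append(resp.Repositories, Repository{RepositoryName: name, Images: images})
    }
    return resp, nil
}

func main() {
    lambda.Start(handleRequest)
}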


  2. DeleteExpiredContainerImages-Map: This Map state iterates through each repository and invokes the DeleteExpiredContainerImages Lambda function.

  3. DeleteExpiredContainerImages: This Lambda function evaluates which images should be deleted based on criteria such as retaining the latest N images and those pushed within the last X days.



package main

import (
    "context"
    "fmt"
    "sort"
    "time"

    "github.com/aws/aws-lambda-go/lambda"
    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/ecr"
)

type ImageDetail struct {
    ImageDigest   string    `json:"imageDigest"`
    ImagePushedAt time.Time `json:"imagePushedAt"`
}

type Request struct {
    RepositoryName string        `json:"repositoryName"`
    Images         []ImageDetail `json:"images"`
}

const (
    retainImageCount           = 10  // always keep the newest N images
    retainSinceImagePushedDays = 30  // always keep images pushed within the last X days
    maxBatchDeleteSize         = 100 // BatchDeleteImage accepts at most 100 image IDs per call
)

// filterExpiredImages returns the images that are both outside the newest
// retainImageCount and older than retainSinceImagePushedDays.
func filterExpiredImages(images []ImageDetail) []ImageDetail {
    // Sort a copy newest-first so the caller's slice is left untouched.
    sorted := make([]ImageDetail, len(images))
    copy(sorted, images)
    sort.Slice(sorted, func(i, j int) bool {
        return sorted[i].ImagePushedAt.After(sorted[j].ImagePushedAt)
    })

    if len(sorted) <= retainImageCount {
        return nil
    }
    retainLimit := time.Now().AddDate(0, 0, -retainSinceImagePushedDays)

    var toDelete []ImageDetail
    // Skip the newest N; among the rest, delete only those past the age limit.
    for _, image := range sorted[retainImageCount:] {
        if image.ImagePushedAt.Before(retainLimit) {
            toDelete = append(toDelete, image)
        }
    }
    return toDelete
}

// deleteImages deletes the given digests in batches and reports partial failures.
func deleteImages(ctx context.Context, svc *ecr.ECR, repositoryName string, imageDigests []string) error {
    for start := 0; start < len(imageDigests); start += maxBatchDeleteSize {
        end := start + maxBatchDeleteSize
        if end > len(imageDigests) {
            end = len(imageDigests)
        }
        ids := make([]*ecr.ImageIdentifier, 0, end-start)
        for _, digest := range imageDigests[start:end] {
            ids = append(ids, &ecr.ImageIdentifier{ImageDigest: aws.String(digest)})
        }

        out, err := svc.BatchDeleteImageWithContext(ctx, &ecr.BatchDeleteImageInput{
            RepositoryName: aws.String(repositoryName),
            ImageIds:       ids,
        })
        if err != nil {
            return err
        }
        if len(out.Failures) > 0 {
            return fmt.Errorf("%d images could not be deleted from %s", len(out.Failures), repositoryName)
        }
    }
    return nil
}

func handleRequest(ctx context.Context, request Request) (string, error) {
    svc := ecr.New(session.Must(session.NewSession()))

    toDelete := filterExpiredImages(request.Images)
    if len(toDelete) == 0 {
        return "No images to delete", nil // BatchDeleteImage rejects an empty ID list
    }

    imageDigests := make([]string, 0, len(toDelete))
    for _, image := range toDelete {
        imageDigests = append(imageDigests, image.ImageDigest)
    }
    if err := deleteImages(ctx, svc, request.RepositoryName, imageDigests); err != nil {
        return "Failed to delete images", err
    }
    return "Successfully deleted images", nil
}

func main() {
    lambda.Start(handleRequest)
}




Periodic Triggers

To automate this process, schedule the Step Functions state machine using EventBridge rules. For instance, you can set it to run weekly on Friday nights.
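As a small illustration, the schedule for such a rule can be written as an EventBridge cron expression. The helper below is a hypothetical sketch that only formats the expression string; it does not create the rule. EventBridge rule schedules are evaluated in UTC, and picking 22:00 UTC as "Friday night" is an assumption you would adjust for your own time zone.

```go
package main

import "fmt"

// weeklyCron returns an EventBridge cron expression that fires once a week.
// EventBridge cron expressions have six fields:
// minutes, hours, day-of-month, month, day-of-week, year.
// When day-of-week is set, day-of-month must be "?".
func weeklyCron(dayOfWeek string, hourUTC, minute int) string {
	return fmt.Sprintf("cron(%d %d ? * %s *)", minute, hourUTC, dayOfWeek)
}

func main() {
	// Every Friday at 22:00 UTC.
	fmt.Println(weeklyCron("FRI", 22, 0)) // prints: cron(0 22 ? * FRI *)
}
```

The resulting string goes into the rule's schedule expression, with the state machine as the rule's target.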

Example Policies

The two tables below contrast a policy that ECR lifecycle policies can express with one that requires the custom solution.

Implementation Possible

Older than X days since push | Included in latest N images? | Action
Yes                          | No                           | Delete
Yes                          | Yes                          | Delete
No                           | No                           | Delete
No                           | Yes                          | Keep

Implementation Not Possible

Older than X days since push | Included in latest N images? | Action
Yes                          | No                           | Delete
Yes                          | Yes                          | Keep
No                           | No                           | Keep
No                           | Yes                          | Keep

Results

By implementing this periodic deletion strategy, you can significantly reduce your ECR storage costs. In my experience, this approach led to substantial savings, cutting unnecessary expenses and optimizing our AWS usage.

Thank you for reading, and happy optimizing!


For more tips and insights on security and log analysis, follow me on Twitter @Siddhant_K_code and stay updated with the latest & detailed tech content like this.
