Improving build times with Docker Layer Caching

moeedk

moeed-k

Posted on March 23, 2023

An important concept in container image building is layer-caching. Understanding it helps you optimize your build times and streamline your CI/CD workflows.

In this post, we'll be taking a look at three different aspects of layer-caching:

  1. Optimizing Dockerfiles to re-use the maximum number of layers.
  2. Layer-caching between two different images using --cache-from.
  3. Using BuildKit's inline cache.

Understanding Layers

Let's take a look at an example of a simple Dockerfile:

# pull base image
FROM ubuntu:latest

# install wget
RUN apt-get update && \
    apt-get -y install wget

Here, we can observe two layers. The first comes from the FROM instruction, which pulls the base image, and the second is created by the RUN instruction, which installs wget.
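
One way to see these layers for yourself is the docker history command, which lists the layers of a built image together with the instruction that created each one. The sketch below wraps it in a small helper; the function name show_layers and the tag layer-demo are made up for this example.

```shell
# Print the size and creating instruction of every layer in an image.
# show_layers is a hypothetical helper name, not part of Docker.
show_layers() {
    image="$1"
    docker history --no-trunc --format '{{.Size}}\t{{.CreatedBy}}' "$image"
}
```

After running docker build -t layer-demo . on the Dockerfile above, calling show_layers layer-demo should list the newest layer first, followed by the rows inherited from the base image.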

With layer caching, our aim is to build as few layers as possible in each build and to re-use previously built layers wherever we can. In essence, we're trying to reduce the number of steps that actually run in our build process.

Optimizing Dockerfiles to re-use layers

If we build the simple Dockerfile we took as an example above, it will take a few seconds for the base image to be pulled. If we change the second layer to:

# install wget and nginx
RUN apt-get update && \
    apt-get -y install wget && \
    apt-get -y install nginx

and then re-build, we will see that only the second layer is built again and the base image is not pulled again. This is because the first layer has been cached and is being re-used.

However, if we change our first layer, we invalidate every layer that comes after it.

# pull different image
FROM ubuntu:18.04

# install wget and nginx
RUN apt-get update && \
    apt-get -y install wget && \
    apt-get -y install nginx

This time, every layer after the first one is built again, because a cached layer is only valid on top of the exact parent layer it was originally built from.

Keeping this in mind, we should try to follow these two best practices:

  1. Keep commands that are unlikely to change at the start of the Dockerfile.
  2. Keep commands that change often as close to the end of the Dockerfile as possible.
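
As a rough illustration of these two rules, the snippet below writes out a Dockerfile in which the rarely changing package installation comes before the frequently changing content. The file name Dockerfile.ordered, the site/ directory, and the nginx document root used as the copy target are assumptions made for this example, not something from the Dockerfiles above.

```shell
# Write an example Dockerfile ordered from least to most frequently changing.
# Dockerfile.ordered and the site/ directory are hypothetical names.
cat > Dockerfile.ordered <<'EOF'
# Base image: changes rarely, so it comes first.
FROM ubuntu:22.04

# Packages: change occasionally, so they come before the site content.
RUN apt-get update && \
    apt-get -y install wget nginx

# Site content: changes often, so it is copied last.
# Editing files in site/ only invalidates this layer.
COPY site/ /var/www/html/
EOF
```

Building it with docker build -f Dockerfile.ordered -t my-image . and then editing a file under site/ should leave the FROM and RUN steps cached on the next build.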

Using layer-caching between Images

If an image has already been built and shares some of its layers with your own Dockerfile, you can use that image as part of the build cache with the help of the --cache-from flag.

For example:

IMG="my-image"

# Pull an existing image
docker pull ${IMG}:old-ver

docker build --cache-from ${IMG}:old-ver -t ${IMG}:new-ver .
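
In a CI pipeline the old image may not exist yet on the very first run, so it can help to make the pull optional. The helper below is a minimal sketch of that idea; the function name build_with_cache and its three arguments are invented for this example. Note also that when BuildKit is the active builder, the old image generally needs to have been built with inline cache metadata (see the BuildKit Inline Cache section) to be usable as a cache source.

```shell
# Build IMG:NEW, re-using layers from IMG:OLD when that image can be pulled.
# build_with_cache is a hypothetical helper name, not a Docker command.
build_with_cache() {
    img="$1"; old="$2"; new="$3"
    # The pull fails if the old tag does not exist yet; in that case we
    # log a message and let the build run without a warm cache.
    docker pull "${img}:${old}" || echo "no cached image found, building from scratch" >&2
    docker build --cache-from "${img}:${old}" -t "${img}:${new}" .
}
```

It would be called as build_with_cache my-image old-ver new-ver, matching the tags used above.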

BuildKit Inline Cache

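The problem with the above approach is that it requires us to pull the whole image from a remote registry before the build starts. BuildKit's inline cache avoids this. Building with BUILDKIT_INLINE_CACHE=1 embeds cache metadata in the image itself, so once that image has been pushed, later builds can name it in --cache-from without pulling it first. BuildKit reads the metadata from the registry and downloads only the layers that actually match, which avoids expensive pull operations.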

# Enable BuildKit (already the default builder since Docker Engine 23.0)
export DOCKER_BUILDKIT=1

# Build the image and embed cache metadata in it
docker build --build-arg BUILDKIT_INLINE_CACHE=1 .
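
To show how the inline cache is meant to be consumed, here is a sketch of the full round trip: one helper builds and pushes an image with inline cache metadata, and another builds against that remote image without an explicit pull. The function names and the registry.example.com reference are placeholders chosen for this example.

```shell
export DOCKER_BUILDKIT=1

# Build an image with inline cache metadata and push it to a registry.
# REF is a full image reference, e.g. registry.example.com/my-image:latest
# (a hypothetical registry used only for illustration).
publish_with_inline_cache() {
    ref="$1"
    docker build --build-arg BUILDKIT_INLINE_CACHE=1 -t "$ref" . &&
        docker push "$ref"
}

# On another machine or CI runner: build using the pushed image as a cache
# source. No docker pull is issued here; BuildKit fetches matching layers.
build_from_remote_cache() {
    ref="$1"
    docker build --cache-from "$ref" \
        --build-arg BUILDKIT_INLINE_CACHE=1 -t "$ref" .
}
```

Passing BUILDKIT_INLINE_CACHE=1 again in the second helper keeps the cache metadata in the newly built image, so it can serve as a cache source for the build after it.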
