Kyle Galbraith
Posted on June 28, 2022
As we went through in our recent post Fast Dockerfiles: theory and practice, it's important to write a Dockerfile that builds quickly. Once you have one, the next step is to actually build it in your CI environment like GitHub Actions, CircleCI, Travis CI, etc.
In this post, we will focus on how to build a Docker image as quickly as possible in GitHub Actions by leveraging layer caching. We will touch on what layer caching is, why it's important, and how we can leverage it in GitHub Actions to achieve faster builds.
Docker layer cache
A Docker layer is the output of running a step defined in your Dockerfile. It is built on top of the previous layer (its parent) and contains the filesystem changes your step defined: files added, modified, or deleted. A final Docker image is just a series of Docker layers laid one after another, plus some associated metadata, as the build moves from the top of your Dockerfile to the bottom.
The benefit of caching the layers that make up a final image is that, rather than building them again, you can reuse layers that have not changed from previous builds. Needing to do less work in a build makes the build faster.
For a deeper dive into how layers get stacked up based on the contents of your Dockerfile, see our fast Dockerfiles theory and practice post.
Building Docker images in GitHub Actions
If you have never built a Docker image via GitHub Actions, this section is for you. If you already know how to build images with Actions, feel free to jump to the next section, where we discuss caching layers in Actions.
Building a Docker image via GitHub Actions requires creating a new folder and file in your repository. You need to create a .github directory at the root of your repo, followed by a workflows directory inside it. Then you are going to add a YAML file called ci.yml.
mkdir .github
mkdir .github/workflows
touch .github/workflows/ci.yml
Inside the new YAML file, we'll define a job to build your Docker image.
name: Build Docker image
on:
  push: {}
jobs:
  build-with-docker:
    name: Build with Docker
    runs-on: ubuntu-20.04
    steps:
      - uses: actions/checkout@v3
      - uses: docker/setup-buildx-action@v3
      - uses: docker/build-push-action@v5
        with:
          context: .
This is a complete GitHub Actions workflow that consists of a job called Build with Docker that will build your image based on the Dockerfile defined at the root of your repository. Note that if your Dockerfile is in a subdirectory, you'll need to specify the path to the Dockerfile as an additional file field.
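As a minimal sketch of that case, the build step could look like the following. The docker/Dockerfile path is a hypothetical location chosen for illustration; file is the build-push-action input that points at the Dockerfile, while context continues to set the build context.

```yaml
# Sketch only: docker/Dockerfile is a hypothetical path, adjust it to your repository.
- uses: docker/build-push-action@v5
  with:
    context: .               # build context stays at the repository root
    file: docker/Dockerfile  # path to the Dockerfile to build
```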
The setup-buildx-action configures Docker Buildx to create a builder instance for running the image build. The following step, build-push-action, uses that instance to build your Docker image. The build-push-action supports all of the features provided by BuildKit out of the box. In our simple example, we are only specifying the Docker context, but more advanced features like SSH, secrets, and build args are supported.
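To give a rough idea of what those extra inputs look like, here is a sketch that passes a build argument and a secret to the build. The names APP_VERSION, api_token, and API_TOKEN are hypothetical and only serve as placeholders; build-args and secrets are inputs of build-push-action.

```yaml
# Sketch only: APP_VERSION, api_token, and API_TOKEN are hypothetical names.
- uses: docker/build-push-action@v5
  with:
    context: .
    # One KEY=value pair per line; each is available to a matching ARG in the Dockerfile.
    build-args: |
      APP_VERSION=1.2.3
    # Available to RUN --mount=type=secret,id=api_token steps during the build.
    secrets: |
      api_token=${{ secrets.API_TOKEN }}
```

Secrets passed this way are mounted only for the steps that request them, so they should not end up in the image layers the way a build argument would.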
If you commit the new ci.yml file, you should see a Docker build completed via GitHub Actions.
This is functional, and you can build images via GitHub Actions in addition to your local machine with this simple workflow. But if you run the build above more than once without making any code changes, you may notice a problem — build steps are recomputed every time, and every step in the Dockerfile needs to be re-run. With this basic workflow, we are not caching any Docker layers, so we must recompute each layer for every build.
Let's take a look at how we can add layer caching.
Docker layer caching in GitHub Actions
To cache the layers produced by a docker build in GitHub Actions, we need to add a few more arguments to our build-push-action step. We will add the cache-from and cache-to arguments.
name: Build Docker image
on:
  push: {}
jobs:
  build-with-docker:
    name: Build with Docker
    runs-on: ubuntu-20.04
    steps:
      - uses: actions/checkout@v3
      - uses: docker/setup-buildx-action@v3
      - uses: docker/build-push-action@v5
        with:
          context: .
          cache-from: type=gha
          cache-to: type=gha,mode=max
Below our context, there are now cache-from and cache-to; both are configured to use the cache type gha. This is an experimental cache exporter for GitHub Actions provided by buildx and BuildKit. It uses the GitHub Cache API to save and load the Docker layer cache blobs across builds.
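The two arguments play different roles, which the annotated step below tries to spell out. It is the same step as in the workflow above, with comments added; the description of mode reflects BuildKit's cache export modes as we understand them.

```yaml
- uses: docker/build-push-action@v5
  with:
    context: .
    # Import layers that earlier runs saved to the GitHub Actions cache.
    cache-from: type=gha
    # Export layers after the build so later runs can reuse them.
    # mode=max exports layers from all build stages, including intermediate ones;
    # the default, mode=min, exports only the layers of the resulting image.
    cache-to: type=gha,mode=max
```

For multi-stage Dockerfiles, mode=max generally gives more cache hits at the cost of storing more data in the cache.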
By adding remote cache loading and saving, you can reuse your Docker layers from previous builds; reused steps show up as CACHED in the build output. This is a nice improvement if you're building images in GitHub Actions, but it does come with limitations:
- The GitHub Cache API only supports a maximum size of 10 GB for the entire repository
- Loading and saving cache is network bound, meaning the loading and saving could negate any performance benefits of using the cached layers for simple image builds
- The cache is locked to GitHub Actions and can't be used in other systems or on local machines
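A related detail applies when one workflow builds more than one image. The gha cache type accepts a scope attribute, and builds that share the default scope can overwrite each other's cache. The sketch below assumes two hypothetical build contexts, ./api and ./worker, and gives each its own scope. All scopes still draw on the same repository cache allowance.

```yaml
# Sketch only: the ./api and ./worker directories and the scope names are hypothetical.
- uses: docker/build-push-action@v5
  with:
    context: ./api
    cache-from: type=gha,scope=api
    cache-to: type=gha,mode=max,scope=api
- uses: docker/build-push-action@v5
  with:
    context: ./worker
    cache-from: type=gha,scope=worker
    cache-to: type=gha,mode=max,scope=worker
```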
A managed solution
We built Depot to eliminate the limitations above, not only in GitHub Actions but in all CI providers.
We manage a fleet of remote builders, supporting x86 and Arm architectures, tied to a project you provision in Depot. These remote builders come with higher specs than traditional CI provider VMs, with 16 CPUs, 32 GB memory, and a persistent 50 GB NVMe cache disk that can be expanded up to 500 GB.
You don't need to think about saving and loading the Docker layer cache, as we persist it for you across builds automatically via a local SSD. As it's saved to a local disk, it's available instantly during builds, with no need to save or load cached layers from the network. It's even shared with anyone who has access to the project, so a developer who runs a build locally can just reuse the cached layers that CI computed.
If you are interested in trying out Depot in your GitHub Actions workflow, check out our GitHub Actions integration guide.