yayugu
Posted on February 6, 2020
Background
For modern, compact images, docker build on CircleCI is not slow. Sometimes, however, you need to build a relatively large image, which takes 5 to 10 minutes. In my case, I was containerizing a monolithic PHP service. This article explains how to reduce the build time.
Precondition
- I won't explain general best practices for building Docker images here. They are important, of course, so apply them first.
- This article contains some hacky know-how. Whether to use it is up to you.
- If you run your own Jenkins server, everything is cached properly and builds are faster. I didn't want to do that, so I used CircleCI.
1. Use Machine Executor
Let's start by using the Machine Executor. The Container Executor puts the Docker daemon on a separate machine, so COPYing the whole repository tends to be slow. You also can't use some hacks, such as mounting a local directory (see below).
version: 2.1
jobs:
  build-image:
    machine: true
    steps:
      - checkout
      - run:
          name: Build a container
          command: |
            docker login .......
            docker build --progress plain -t $IMAGE_NAME .
            docker push $IMAGE_NAME
2. Use BuildKit
BuildKit is a new build backend for Docker with many optimizations. The default Docker version on the Machine Executor is too old, so you must specify a newer image to use BuildKit. You also need to set the environment variable DOCKER_BUILDKIT=1 at execution time.
version: 2.1
jobs:
  build-image:
    machine:
      image: ubuntu-1604:201903-01
    steps:
      - checkout
      - run:
          name: Build a container
          command: |
            docker login .......
            DOCKER_BUILDKIT=1 docker build --progress plain -t $IMAGE_NAME .
            docker push $IMAGE_NAME
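Since BuildKit depends on the Docker version of the image, a quick version check can make the requirement explicit. The following is a minimal sketch, assuming that 18.09 is the minimum version that accepts DOCKER_BUILDKIT=1. The helper name supports_buildkit and the sample version strings are hypothetical and only serve as illustration.

```shell
# Minimum Docker version assumed to support DOCKER_BUILDKIT=1.
MIN_BUILDKIT_VERSION="18.09"

# supports_buildkit VERSION
# Hypothetical helper: succeeds when VERSION >= MIN_BUILDKIT_VERSION.
supports_buildkit() {
  # sort -V orders version strings; the first line is the lower version.
  lowest="$(printf '%s\n%s\n' "$MIN_BUILDKIT_VERSION" "$1" | sort -V | head -n 1)"
  [ "$lowest" = "$MIN_BUILDKIT_VERSION" ]
}

# Sample version strings, for illustration only.
for version in 17.09.0 18.09.3 19.03.5; do
  if supports_buildkit "$version"; then
    echo "$version: BuildKit available"
  else
    echo "$version: too old for BuildKit"
  fi
done
```

On a real executor, the version could be read with `docker version --format '{{.Server.Version}}'` and passed to the helper before the build step.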
3. Use Docker Layer Caching
Docker Layer Caching (DLC) persists the cache directories and mounts them when the Executor starts. DLC saves a lot of time when the base image (specified in the FROM of the Dockerfile) is large, because the download can be skipped. However, mounting takes 5 to 10 seconds when the Executor starts, so when the base image is small, it is faster not to use DLC.
version: 2.1
jobs:
  build-image:
    machine:
      image: ubuntu-1604:201903-01
      docker_layer_caching: true
    steps:
      - checkout
      - run:
          name: Build a container
          command: |
            docker login .......
            DOCKER_BUILDKIT=1 docker build --progress plain -t $IMAGE_NAME .
            docker push $IMAGE_NAME
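The DLC trade-off comes down to a simple comparison: DLC helps only when pulling the base image takes longer than mounting the cache. A rough sketch of that rule follows. The helper name dlc_pays_off is hypothetical, the default of 10 seconds is the upper bound of the mount time mentioned above, and the sample pull times are made up for illustration.

```shell
# dlc_pays_off PULL_SECONDS [MOUNT_SECONDS]
# Hypothetical helper: succeeds when the pull time that DLC would save
# is larger than the overhead of mounting the cache.
dlc_pays_off() {
  pull_seconds="$1"
  mount_seconds="${2:-10}"  # assumed overhead, upper bound from the article
  [ "$pull_seconds" -gt "$mount_seconds" ]
}

# Made-up pull times in seconds, for illustration only.
for pull in 3 60; do
  if dlc_pays_off "$pull"; then
    echo "pull takes ${pull}s: enable DLC"
  else
    echo "pull takes ${pull}s: skip DLC"
  fi
done
```

The pull time itself would have to be measured once on a job without DLC, for example by timing the docker pull of the base image.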
4. Do vendoring on the same job/executor
Vendoring here means something like npm install, bundle install, or composer install. There are several ways to do this on CircleCI, but each has its drawbacks:
- Run it as an upstream job in a Workflow and pass the result through a Workspace or Cache
  - An Executor starts for each Job, and each start takes 20 to 30 seconds.
- Do it in the Dockerfile
  - On CircleCI, Docker's native caching mechanism is difficult to use.
So an effective strategy is to mount the local directory into a Docker container, run vendoring there, and cache the result with CircleCI.
version: 2.1
jobs:
  build-image:
    machine:
      image: ubuntu-1604:201903-01
      docker_layer_caching: true
    steps:
      - checkout
      - restore_cache:
          keys:
            - vendoring-{{ checksum "composer.lock" }}
      - run:
          name: Vendoring
          command: |
            docker login .......
            docker run --rm -v /home/circleci/repo:/home/circleci/repo \
              ${BASE_IMAGE} \
              bash -c \
              "cd /home/circleci/repo && \
              php composer.phar config --global github-oauth.github.com ${GITHUB_TOKEN} && \
              php composer.phar install --prefer-dist --ignore-platform-reqs --no-scripts --no-dev"
      - save_cache:
          key: vendoring-{{ checksum "composer.lock" }}
          paths:
            - vendor
      - run:
          name: Build a container
          command: |
            DOCKER_BUILDKIT=1 docker build --progress plain -t $IMAGE_NAME .
            docker push $IMAGE_NAME
It's fast. But as the config shows, readability suffers. It is also possible to run multiple processes in parallel, but I don't recommend it because it gets too complicated.
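The cache key in this config is tied to the content of composer.lock, so the vendor directory is reused until the lockfile changes. The sketch below imitates that behavior locally with sha256sum. It is only an approximation: the exact encoding CircleCI uses for its checksum template may differ, and the helper name cache_key and the sample lockfile contents are hypothetical.

```shell
# cache_key FILE
# Hypothetical helper: builds a key from the file's SHA-256 hash,
# similar in spirit to vendoring-{{ checksum "composer.lock" }}.
cache_key() {
  # sha256sum prints "<hash>  <file>"; keep only the hash.
  printf 'vendoring-%s\n' "$(sha256sum "$1" | cut -d ' ' -f 1)"
}

workdir="$(mktemp -d)"

# Made-up lockfile content, for illustration only.
printf '{"packages": ["a/b 1.0"]}\n' > "$workdir/composer.lock"
key_before="$(cache_key "$workdir/composer.lock")"

# Unchanged lockfile: same key, so the cache would be restored.
key_same="$(cache_key "$workdir/composer.lock")"

# Changed lockfile: new key, so vendoring would run from scratch.
printf '{"packages": ["a/b 1.1"]}\n' > "$workdir/composer.lock"
key_after="$(cache_key "$workdir/composer.lock")"

echo "before: $key_before"
echo "same:   $key_same"
echo "after:  $key_after"
rm -rf "$workdir"
```

Because the key depends only on the lockfile, changes to application code alone leave the cached vendor directory valid.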
5. Use shallow checkout
This technique is not related to Docker, but another way to speed up the Job is to speed up git checkout. The standard CircleCI checkout clones the entire git repository. A shallow checkout reduces the time because it fetches only the state of the current commit. For some reason, Machine Executors tend to have a slower git checkout than Container Executors, so this technique is more useful on the Machine Executor.
(The command is written directly in .circleci/config.yml here, but using the Orb may be more maintainable if the Orb feature is available to you.)
version: 2.1
jobs:
  build-image:
    machine:
      image: ubuntu-1604:201903-01
      docker_layer_caching: true
    steps:
      - shallow-checkout
      - restore_cache:
          keys:
            - vendoring-{{ checksum "composer.lock" }}
      - run:
          name: Vendoring
          command: |
            docker login .......
            docker run --rm -v /home/circleci/repo:/home/circleci/repo \
              ${BASE_IMAGE} \
              bash -c \
              "cd /home/circleci/repo && \
              php composer.phar config --global github-oauth.github.com ${GITHUB_TOKEN} && \
              php composer.phar install --prefer-dist --ignore-platform-reqs --no-scripts --no-dev"
      - save_cache:
          key: vendoring-{{ checksum "composer.lock" }}
          paths:
            - vendor
      - run:
          name: Build a container
          command: |
            DOCKER_BUILDKIT=1 docker build --progress plain -t $IMAGE_NAME .
            docker push $IMAGE_NAME
commands:
  shallow-checkout:
    description: "from: https://circleci.com/orbs/registry/orb/datacamp/shallow-checkout"
    steps:
      - run:
          name: Shallow checkout
          command: |
            set -e
            # Workaround old docker images with incorrect $HOME
            # check https://github.com/docker/docker/issues/2968 for details
            if [ "${HOME}" = "/" ]
            then
              export HOME=$(getent passwd $(id -un) | cut -d: -f6)
            fi
            mkdir -p ~/.ssh
            # Please fill the value from your execution log of `checkout`
            echo 'github.com ssh-rsa XXXXXXXXXXXXXXXX
            ' >> ~/.ssh/known_hosts
            (umask 077; touch ~/.ssh/id_rsa)
            chmod 0600 ~/.ssh/id_rsa
            # Quote the variable so that line breaks in the key are preserved
            (echo "$CHECKOUT_KEY" > ~/.ssh/id_rsa)
            # use git+ssh instead of https
            git config --global url."ssh://git@github.com".insteadOf "https://github.com" || true
            git config --global gc.auto 0 || true
            mkdir -p $CIRCLE_WORKING_DIRECTORY
            cd $CIRCLE_WORKING_DIRECTORY
            # Clone only the latest commit of the tag or branch
            if [ -n "$CIRCLE_TAG" ]
            then
              git clone --depth=1 -b "$CIRCLE_TAG" "$CIRCLE_REPOSITORY_URL" .
            else
              git clone --depth=1 -b "$CIRCLE_BRANCH" "$CIRCLE_REPOSITORY_URL" .
            fi
            # Fetch the exact commit of this build in case the branch has moved on
            git fetch --depth=1 --force origin "$CIRCLE_SHA1" || echo "Git version >2.5 not installed"
            if [ -n "$CIRCLE_TAG" ]
            then
              git reset --hard "$CIRCLE_SHA1"
              git checkout -q "$CIRCLE_TAG"
            elif [ -n "$CIRCLE_BRANCH" ]
            then
              git reset --hard "$CIRCLE_SHA1"
              git checkout -q -B "$CIRCLE_BRANCH"
            fi
            git reset --hard "$CIRCLE_SHA1"
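To see what --depth=1 changes, the effect can be reproduced locally without CircleCI. The following sketch builds a throwaway repository with three commits and compares a full clone with a shallow one. The directory layout, file name, and commit identity are made up for the demonstration.

```shell
set -e
demo="$(mktemp -d)"

# Throwaway repository with three commits.
git init -q "$demo/origin"
for n in 1 2 3; do
  echo "change $n" > "$demo/origin/file.txt"
  git -C "$demo/origin" add file.txt
  git -C "$demo/origin" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "commit $n"
done

# file:// is used on purpose: git ignores --depth for plain local paths.
git clone -q "file://$demo/origin" "$demo/full"
git clone -q --depth=1 "file://$demo/origin" "$demo/shallow"

# Count the commits reachable from HEAD in each clone.
full_count="$(git -C "$demo/full" rev-list --count HEAD)"
shallow_count="$(git -C "$demo/shallow" rev-list --count HEAD)"
echo "full clone commits:    $full_count"
echo "shallow clone commits: $shallow_count"
rm -rf "$demo"
```

The shallow clone carries only the newest commit, which is why the saving grows with the age and size of the repository.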
Summary
In my environment, every commit pushed to a Pull Request is automatically deployed to the Dev environment on Kubernetes, so build time is critical. With these speedups, building a 500MB image takes around 90 seconds instead of 5+ minutes. So far, this works well for us. But if you optimize too much, you lose readability and maintainability, so keep a good balance!