Quick tips for faster container image builds on CircleCI

yayugu

Posted on February 6, 2020

Background

For modern, compact images, docker build on CircleCI is not slow. Sometimes, however, you have to build a relatively large image, and the build takes 5 to 10 minutes. In my case, I was containerizing a monolithic PHP service. This article explains how to shorten the build time.

Precondition

  • I won't explain the general best practices for creating Docker images here. They are important, of course, so please apply them first.
  • This article contains some hacky know-how. Whether to use it is up to you.
  • If you run your own Jenkins server, everything is cached properly and builds are faster. But I didn't want to do that, so I used CircleCI.

1. Use Machine Executor

Let's start by using the Machine Executor. With the Container Executor, the Docker daemon runs on a separate machine, so COPYing the whole repository tends to be slow. You also can't use some hacks, such as mounting a local directory (see below).

version: 2.1
jobs:
  build-image:
    machine: true
    steps:
      - checkout
      - run:
          name: Build a container
          command: |
            docker login .......
            docker build --progress plain -t $IMAGE_NAME .
            docker push $IMAGE_NAME

2. Use BuildKit

BuildKit is a new build backend for Docker with a lot of optimizations. The default Docker version on the Machine Executor is out of date, so you must specify a newer image to use BuildKit. You also need to set the environment variable DOCKER_BUILDKIT=1 at execution time.

version: 2.1
jobs:
  build-image:
    machine:
      image: ubuntu-1604:201903-01
    steps:
      - checkout
      - run:
          name: Build a container
          command: |
            docker login .......
            DOCKER_BUILDKIT=1 docker build --progress plain -t $IMAGE_NAME .
            docker push $IMAGE_NAME
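
Since BuildKit depends on the Docker version, it can help to check the version before relying on DOCKER_BUILDKIT=1. The following is a minimal sketch, not part of the original setup. It assumes Docker 18.09 as the minimum version (the first release that honors DOCKER_BUILDKIT=1) and a sort command that supports -V, as GNU coreutils on Ubuntu does. The helper name supports_buildkit is hypothetical.

```shell
# Hypothetical helper: succeeds when the given Docker version is 18.09 or newer.
# Assumption: 18.09 is the first release that honors DOCKER_BUILDKIT=1.
supports_buildkit() {
  minimum="18.09"
  # sort -V orders version strings; if the minimum sorts first (or ties),
  # the given version is new enough.
  lowest=$(printf '%s\n%s\n' "$minimum" "$1" | sort -V | head -n 1)
  [ "$lowest" = "$minimum" ]
}

# Possible usage on the executor (needs a running Docker daemon):
#   supports_buildkit "$(docker version --format '{{.Server.Version}}')" \
#     && export DOCKER_BUILDKIT=1
```

If the check fails, the build still works, but Docker falls back to the classic builder.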

3. Use Docker Layer Caching

Docker Layer Caching (DLC) persists the cache directories and mounts them when the Executor starts. DLC saves a lot of time when the base image (specified in the FROM of the Dockerfile) is large, because the download can be skipped. However, mounting takes 5 to 10 seconds each time the Executor starts, so when the base image is small, it is faster not to use DLC.

version: 2.1
jobs:
  build-image:
    machine:
      image: ubuntu-1604:201903-01
      docker_layer_caching: true
    steps:
      - checkout
      - run:
          name: Build a container
          command: |
            docker login .......
            DOCKER_BUILDKIT=1 docker build --progress plain -t $IMAGE_NAME .
            docker push $IMAGE_NAME

4. Do vendoring on the same job/executor

Vendoring means something like npm install, bundle install, or composer install. There are several ways to do this on CircleCI, but each one has its drawbacks:

  • Run it as an upstream job in a Workflow and pass the result along through a Workspace or Cache
    • An Executor starts for each Job, and each start can take 20 to 30 seconds.
  • Do it in the Dockerfile
    • On CircleCI, Docker's native caching mechanism is difficult to use.

So an effective strategy is to mount the local directory into a container, run the vendoring there, and cache the result with CircleCI.

version: 2.1
jobs:
  build-image:
    machine:
      image: ubuntu-1604:201903-01
      docker_layer_caching: true
    steps:
      - checkout
      - restore_cache:
          keys:
            - vendoring-{{ checksum "composer.lock" }}
      - run:
          name: Vendoring
          command: |
            docker login .......
            docker run --rm -v /home/circleci/repo:/home/circleci/repo \
                ${BASE_IMAGE} \
                bash -c \
                "cd /home/circleci/repo && \
                php composer.phar config --global github-oauth.github.com ${GITHUB_TOKEN} && \
                php composer.phar install --prefer-dist --ignore-platform-reqs --no-scripts --no-dev"
      - save_cache:
          key: vendoring-{{ checksum "composer.lock" }}
          paths:
            - vendor
      - run:
          name: Build a container
          command: |
            DOCKER_BUILDKIT=1 docker build --progress plain -t $IMAGE_NAME .
            docker push $IMAGE_NAME

It's fast. But if you look at the code, you'll see that it has lost readability. It is also possible to run multiple processes in parallel, but I don't recommend it because it gets too complicated.
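
The cache key in the config above, vendoring-{{ checksum "composer.lock" }}, is what makes the restored vendor directory safe to reuse: the key changes only when the lock file changes. The sketch below illustrates that idea locally. CircleCI computes its own hash, so the hex SHA-256 here is only an approximation, and cache_key is a hypothetical helper name. It assumes sha256sum is available, as it is on Ubuntu.

```shell
# Hypothetical helper: derive a cache key from the contents of a lock file.
cache_key() {
  # sha256sum prints "<hash>  <file>"; keep only the hash.
  printf 'vendoring-%s\n' "$(sha256sum "$1" | cut -d ' ' -f 1)"
}

# Simulate a dependency change in a throwaway lock file.
workdir=$(mktemp -d)
echo '{"packages": []}' > "$workdir/composer.lock"
before=$(cache_key "$workdir/composer.lock")

echo '{"packages": ["example/package"]}' > "$workdir/composer.lock"
after=$(cache_key "$workdir/composer.lock")

# Different contents give a different key, so the old cache is not restored.
if [ "$before" != "$after" ]; then
  echo "lock file changed: new cache key"
fi
```

As long as composer.lock is untouched, the key stays the same and the vendoring step finds vendor already populated.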

5. Use shallow checkout

This technique is not related to Docker, but another way to speed up the Job is to speed up git checkout. The standard CircleCI checkout clones the entire git repository. A shallow checkout reduces the time because it only pulls the state of the current commit. For some reason, Machine Executors tend to have a slower git checkout than Container Executors, so this technique is more useful on the Machine Executor.

(The command is written directly in .circleci/config.yml here, but using an Orb may be more maintainable if you can use the Orb feature.)

version: 2.1
jobs:
  build-image:
    machine:
      image: ubuntu-1604:201903-01
      docker_layer_caching: true
    steps:
      - shallow-checkout
      - restore_cache:
          keys:
            - vendoring-{{ checksum "composer.lock" }}
      - run:
          name: Vendoring
          command: |
            docker login .......
            docker run --rm -v /home/circleci/repo:/home/circleci/repo \
                ${BASE_IMAGE} \
                bash -c \
                "cd /home/circleci/repo && \
                php composer.phar config --global github-oauth.github.com ${GITHUB_TOKEN} && \
                php composer.phar install --prefer-dist --ignore-platform-reqs --no-scripts --no-dev"
      - save_cache:
          key: vendoring-{{ checksum "composer.lock" }}
          paths:
            - vendor
      - run:
          name: Build a container
          command: |
            DOCKER_BUILDKIT=1 docker build --progress plain -t $IMAGE_NAME .
            docker push $IMAGE_NAME
commands:
  shallow-checkout:
    description: "from: https://circleci.com/orbs/registry/orb/datacamp/shallow-checkout"
    steps:
      - run:
          name: Shallow checkout
          command: |
            set -e

            # Workaround old docker images with incorrect $HOME
            # check https://github.com/docker/docker/issues/2968 for details
            if [ "${HOME}" = "/" ]
            then
              export HOME=$(getent passwd $(id -un) | cut -d: -f6)
            fi

            mkdir -p ~/.ssh
            # Please fill the value from your execution log of `checkout`
            echo 'github.com ssh-rsa XXXXXXXXXXXXXXXX 
            ' >> ~/.ssh/known_hosts

            (umask 077; touch ~/.ssh/id_rsa)
            chmod 0600 ~/.ssh/id_rsa
            (echo "$CHECKOUT_KEY" > ~/.ssh/id_rsa)

            # use git+ssh instead of https
            git config --global url."ssh://git@github.com".insteadOf "https://github.com" || true
            git config --global gc.auto 0 || true

            mkdir -p $CIRCLE_WORKING_DIRECTORY
            cd $CIRCLE_WORKING_DIRECTORY

            if [ -n "$CIRCLE_TAG" ]
            then
              git clone --depth=1 -b "$CIRCLE_TAG" "$CIRCLE_REPOSITORY_URL" .
            else
              git clone --depth=1 -b "$CIRCLE_BRANCH" "$CIRCLE_REPOSITORY_URL" .
            fi
            git fetch --depth=1 --force origin "$CIRCLE_SHA1" || echo "Git version >2.5 not installed"

            if [ -n "$CIRCLE_TAG" ]
            then
              git reset --hard "$CIRCLE_SHA1"
              git checkout -q "$CIRCLE_TAG"
            elif [ -n "$CIRCLE_BRANCH" ]
            then
              git reset --hard "$CIRCLE_SHA1"
              git checkout -q -B "$CIRCLE_BRANCH"
            fi

            git reset --hard "$CIRCLE_SHA1"
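
To see what --depth=1 changes, the following sketch compares a full clone and a shallow clone of a throwaway local repository. It is only an illustration: the repository, directory names, and commit messages are made up, and it assumes git is installed.

```shell
# Build a throwaway repository with three empty commits.
demo=$(mktemp -d)
git init -q "$demo/origin"
for n in 1 2 3; do
  git -C "$demo/origin" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "commit $n"
done

# Use a file:// URL: git ignores --depth when cloning from a plain local path.
git clone -q "file://$demo/origin" "$demo/full"
git clone -q --depth=1 "file://$demo/origin" "$demo/shallow"

# Count the commits reachable from HEAD in each clone.
full_count=$(git -C "$demo/full" rev-list --count HEAD)
shallow_count=$(git -C "$demo/shallow" rev-list --count HEAD)
echo "full clone: $full_count commits, shallow clone: $shallow_count commit(s)"
```

The full clone reports all three commits, while the shallow clone contains only the latest one. On a repository with a long history, that difference is where the checkout time is saved.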

Summary

In my environment, when a commit is pushed to a Pull Request, it is automatically deployed to the Dev environment on Kubernetes, so the build time is critical. With these speedups, building a 500MB image takes only around 90 seconds, compared with 5+ minutes before. So far, this works well for us. But if you optimize too much, you lose readability and maintainability, so keep a good balance!
