Dockerfile Reference

First of all, this is a summary I made of the original Dockerfile reference on the Docker website. I should also admit that I am fairly new to Docker; I have been playing around with it for the last few days. The learning curve was pretty steep for me, and there were a lot of concepts to learn. However, the prospective benefits of using it were far greater. One of the key benefits I found was the ability to deploy applications on different types of platforms without worrying much about the differences between them. This also gave me the opportunity to share my code with friends, who can run the application with minimal effort.

If you are not familiar with Docker and its concepts, this might not be the article for you. This article summarizes the reference material needed for writing a Dockerfile. The official documentation can be found on the Docker website. This is, however, my "go-to" document for quick reference.

Purpose of Dockerfile

Docker uses a Dockerfile to build images automatically. It is a text file, usually without an extension, that contains all the commands a user could call on the command line to assemble an image. The docker build command builds an image from two things:

Dockerfile, and
context: the set of files at a specified location, which is processed recursively. While building the image, the entire context is sent to the Docker daemon. The context can be either:
- PATH: a directory in the local filesystem, including its subdirectories
- URL: a Git repository location, including the repository and its submodules

To use a file from the build context, the Dockerfile refers to it in an instruction (e.g. COPY). We can use a .dockerignore file to improve build performance by excluding some files or directories from the context.

Convention: the Dockerfile is usually named Dockerfile and located at the root of the context. However, the -f flag can be used with docker build to point to a Dockerfile anywhere in the filesystem. For example:

docker build -f /path/to/a/Dockerfile .

Each instruction in the Dockerfile is run independently and causes a new intermediate image to be created. Whenever possible, Docker reuses these intermediate images (the build cache) to accelerate the build process, which is indicated by the Using cache message in the console output.
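As a rough illustration of how instruction order interacts with the cache, the sketch below assumes a hypothetical Node.js project whose context contains package.json, package-lock.json, and server.js. The base image tag and the file names are assumptions made for the example, not part of the official reference.

```dockerfile
FROM node:20-alpine
WORKDIR /app
# Copy only the dependency manifests first, so this layer and the install
# layer below stay cached until the manifests themselves change.
COPY package.json package-lock.json ./
RUN npm ci
# Source changes invalidate the cache only from this instruction onward.
COPY . .
CMD ["node", "server.js"]
```

With this ordering, editing only server.js should leave the npm ci step reported as Using cache on the next build.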

Format of Instructions

Commands in a Dockerfile are written in an instruction-and-arguments format: each line has an instruction followed by its arguments. Instructions are NOT case-sensitive, but the convention is to write them in uppercase.

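A Dockerfile must begin with a FROM instruction, which may be preceded only by parser directives, comments, and globally scoped ARGs
Comment lines start with the # sign and are removed from the file before the instructions are executed
Leading whitespace before comments and instructions is ignored, but discouraged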

        # this is a comment-line
    RUN echo hello
RUN echo world

Parser directives take the form # directive=value and must be at the very top of the file, before any comment, blank line, or instruction, including FROM
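As a short illustration of a parser directive, the sketch below follows the escape example from the official reference. It assumes a Windows container host, and testfile.txt is a placeholder file in the context.

```dockerfile
# escape=`
# With the backtick as the escape character, the trailing backslashes in the
# Windows paths below are literal characters, not line continuations.
FROM microsoft/nanoserver
COPY testfile.txt c:\
RUN dir c:\
```

Without the directive, the backslash at the end of the COPY line would be read as a line continuation and the build would most likely fail.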

Environment Replacement

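Environment variables (declared with the ENV instruction) can be used in certain instructions as variables to be interpreted by the Dockerfile. They are written as $variable_name or ${variable_name}; the brace form is needed when the name is followed directly by other characters, as in ${foo}_bar. A backslash before the variable escapes it, so \$FOO is treated as the literal text $FOO. Environment variables are supported by: ADD, COPY, ENV, EXPOSE, FROM, LABEL, STOPSIGNAL, USER, VOLUME, WORKDIR, and ONBUILD (when combined with one of the other supported instructions). Example: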

FROM busybox
ENV FOO=/bar
# equivalent to: WORKDIR /bar
WORKDIR ${FOO}
# equivalent to: ADD . /bar
ADD . $FOO
# the escaped \$FOO is not expanded, so the source is the literal name $FOO
COPY \$FOO /quux
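The brace syntax also accepts a couple of bash-style modifiers. The sketch below is illustrative only; the variable names APP_HOME and APP_MODE are made up for the example.

```dockerfile
FROM busybox
ARG APP_HOME
# ${variable:-word}: use "word" when the variable is not set
ENV APP_HOME=${APP_HOME:-/opt/app}
# ${variable:+word}: use "word" only when the variable is set, otherwise empty
ENV APP_MODE=${APP_HOME:+configured}
WORKDIR ${APP_HOME}
```

Built without any --build-arg, this should leave APP_HOME as /opt/app and APP_MODE as configured.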

`.dockerignore` file

Before sending the context to the Docker daemon, the docker CLI looks for a .dockerignore file in the root of the context. If it exists, the CLI excludes the files and directories that match the patterns in it. The CLI interprets the file as a newline-separated list of patterns. Rules of the .dockerignore file:

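| Rule | Behavior |
| --- | --- |
| `# text` | Treated as a comment and ignored |
| `*/temp*` | Excludes files and directories whose names start with `temp` in any immediate subdirectory of the root |
| `*/*/temp*` | Excludes files and directories whose names start with `temp` in any subdirectory two levels below the root |
| `temp?` | Excludes files and directories in the root whose names are `temp` plus exactly one more character |
| `*.md` | Excludes Markdown files directly under the root of the context |
| `!README.md` | Exception: `README.md` stays in the context even though `*.md` would exclude it |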

Note: if Dockerfile and .dockerignore are listed in the .dockerignore file, they are NOT copied to the image but are still sent to the daemon, because it needs them to do its job.

`FROM` Instruction

This instruction initializes a new build stage and sets the base image for subsequent instructions. A valid Dockerfile must start with a FROM instruction.

ARG is the only instruction that may precede FROM
FROM can appear multiple times in one Dockerfile to create multiple images or to use one build stage as a dependency for another. Each FROM instruction clears any state created by previous instructions
The optional name given with AS <name> can be used in subsequent FROM or COPY --from=<name> instructions to refer to the image built in this stage
The tag and digest values are optional. If both are omitted, the builder assumes the latest tag.
Syntax:

FROM [--platform=<platform>] <image> [AS <name>]
or
FROM [--platform=<platform>] <image>[:<tag>] [AS <name>]
or
FROM [--platform=<platform>] <image>[@<digest>] [AS <name>]
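A minimal multi-stage sketch of how AS <name> and COPY --from=<name> fit together. The Go toolchain image, the stage name builder, and the file main.go are assumptions chosen for illustration.

```dockerfile
# Stage 1: compile the program; "builder" is the stage name set with AS
FROM golang:1.22 AS builder
WORKDIR /src
COPY main.go .
# Disable cgo so the binary is statically linked and runs on a minimal base
RUN CGO_ENABLED=0 go build -o /out/app main.go

# Stage 2: a fresh stage; nothing from "builder" is carried over implicitly
FROM alpine:3.19
# Pull in only the compiled binary from the earlier stage
COPY --from=builder /out/app /usr/local/bin/app
CMD ["app"]
```

Only the last stage ends up in the resulting image, so the compiler and source files are left behind.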

`ARG` and `FROM` working together

FROM supports variables that are declared by any ARG instructions that occur before the first FROM. Example:

ARG CODE_VERSION=latest
FROM base:${CODE_VERSION}
CMD /code/run-app

FROM extras:${CODE_VERSION}
CMD /code/run-extras

An ARG declared before FROM is outside of any build stage, so it is not available to instructions after the FROM. To use its default value inside a build stage, repeat the ARG instruction with the variable name, but without a value, after the FROM. Example:

ARG VERSION=latest
FROM busybox:$VERSION
ARG VERSION
RUN echo $VERSION > image_version

`RUN` instruction

This instruction runs the commands in a new layer on top of the current image and commits the results. It has two forms:

RUN <command> (shell form):
- The command is run in a shell, which by default is /bin/sh -c on Linux or cmd /S /C on Windows
- A backslash \ can be used to continue a single RUN instruction onto the next line
- The default shell can be changed using the SHELL instruction
RUN ["executable", "param1", "param2"] (exec form):
- The exec form is parsed as a JSON array, so double quotes (not single quotes) must be used around each element
- To run a command with a different shell, pass that shell as the first element: RUN ["/bin/bash", "-c", "echo hello"]
- The exec form does not invoke a shell, so variable substitution doesn't happen by default. To get it, run a shell explicitly: RUN [ "sh", "-c", "echo $HOME" ]
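The difference between the two forms can be sketched as follows. The base image tag and the curl package are assumptions picked for the example.

```dockerfile
FROM ubuntu:22.04
# Shell form: executed as /bin/sh -c "...", so && and line continuations work
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*
# Exec form: no shell is involved, so $HOME is printed literally
RUN ["echo", "$HOME"]
# Exec form invoking a shell explicitly, so $HOME is expanded by sh
RUN ["sh", "-c", "echo $HOME"]
```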

`CMD` instruction

The main purpose of this instruction is to provide defaults for an executing container. These defaults can include an executable, or they can omit it, in which case you must also specify an ENTRYPOINT instruction. There can be only one CMD instruction in a Dockerfile; if there are several, only the last one takes effect. This instruction has three forms:

CMD command param1 param2 (shell form): The command is executed through /bin/sh -c.
CMD ["executable","param1","param2"] (exec form): Similar to the exec form of RUN. This is the preferred form for CMD.
CMD ["param1","param2"] (default parameters to ENTRYPOINT): If you would like your container to run the same executable every time, consider using ENTRYPOINT in combination with CMD.

Note: RUN vs CMD. RUN actually runs a command and commits the result; CMD does not execute anything at build time, but specifies the intended command for the image.
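The RUN versus CMD distinction can be seen in a small sketch like the one below; the file name /build-marker is made up for the example.

```dockerfile
FROM ubuntu:22.04
# Build time: RUN executes now and its result is committed into the image
RUN echo "built" > /build-marker
# Run time: CMD only records the default command for containers
CMD ["cat", "/build-marker"]
```

Running docker run <image> would execute the default command and print the marker, while docker run <image> ls / would replace the CMD with ls / for that container.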

`LABEL` instruction

This instruction adds metadata to an image. The instruction adds a key-value pair to the image. Multiple labels can be added under one single instruction. Example:

LABEL multi.label1="value1" multi.label2="value2" other="value3"

Labels in parent (base) images are inherited by the images built on top of them. If a label is set more than once with different values, the most recently applied value overrides all previous values.

`EXPOSE` instruction

This instruction informs Docker that the container listens on the specified network port at runtime.
TCP or UDP can be specified; TCP is the default.
It doesn't actually publish the port; it works as documentation between the person who builds the image and the person who runs the container
Use the -p flag on docker run to publish and map one or more ports: docker run -p 80:80/tcp -p 80:80/udp ...
Use the -P flag to publish all exposed ports and map them to high-order ports
Example:

EXPOSE 80/tcp
EXPOSE 80/udp

The docker network command supports creating networks for communication among containers without the need to expose or publish specific ports

`ENV` instruction

This instruction sets an environment variable that will be available to all subsequent instructions in the build stage.
Example:

ENV MY_NAME="John Doe"
ENV MY_DOG=Rex\ The\ Dog
ENV MY_CAT=fluffy

Allows multiple key-value variables to be set at one time:

ENV MY_NAME="John Doe" MY_DOG=Rex\ The\ Dog \
    MY_CAT=fluffy

Environment variables set using this instruction persist when a container is run from the resulting image. Check the values using docker inspect; change them using docker run --env <key>=<value>
Environment variable persistence can lead to unexpected side effects. For example, ENV DEBIAN_FRONTEND=noninteractive changes the behavior of apt-get in every container started from the image
If an environment variable is needed only during the build, use either of the following methods:
- set the value for a single command: RUN DEBIAN_FRONTEND=noninteractive apt-get update && apt-get install -y ...
- use ARG, which is not persisted in the final image:
```
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y ...
```
Alternative syntax (doesn't allow setting multiple variables in one line):

ENV MY_VAR my-value

`ADD` instruction

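This instruction copies new files, directories, or remote file URLs from the source and adds them to the filesystem of the image at the destination path.
It has two forms (the second is required for paths containing whitespace):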
1. ADD [--chown=<user>:<group>] <src>... <dest>
2. ADD [--chown=<user>:<group>] ["<src>",... "<dest>"]

Note: --chown is supported only for Linux containers.

This instruction takes local files, directories, URLs, and tar files as the source
Multiple sources may be specified, but if they are files or directories, their paths are interpreted as relative to the root of the build context
Sources may contain wildcards
The destination is an absolute path, or a path relative to WORKDIR, into which the sources will be copied

ADD test.txt relativeDir/       # copies test.txt to <WORKDIR>/relativeDir/
ADD test.txt /absoluteDir/      # copies test.txt to /absoluteDir/

All new files and directories are created with a UID and GID of 0, unless --chown is given

ADD --chown=55:mygroup files* /somedir/
ADD --chown=bin files* /somedir/
ADD --chown=1 files* /somedir/
ADD --chown=10:11 files* /somedir/

The source path must be inside the build context; you cannot ADD ../something /something, because the first step of a docker build is to send the context directory (and subdirectories) to the Docker daemon.
If the source is a directory, the entire contents of the directory are copied, including filesystem metadata (the directory itself is not copied, just its contents).
If the source is a local tar archive in a recognized compression format (identity, gzip, bzip2, or xz), it is unpacked as a directory with tar -x behavior. Archives fetched from a remote URL are not unpacked.
If multiple sources are specified, the destination must be a directory and must end with a forward slash /
If the destination does not end with a trailing slash, it is considered a regular file and the contents of the source are written to it.
If the destination doesn't exist, it is created along with all missing directories in its path
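A few of the rules above, put together in one sketch. All file names and the example.com URL are hypothetical placeholders.

```dockerfile
FROM busybox
# Local tar archive: unpacked into /opt/app/ (tar -x behaviour)
ADD app.tar.gz /opt/app/
# Several sources, so the destination is a directory ending with /
ADD config.yml notes.txt /etc/app/
# Remote URL: downloaded as-is and NOT unpacked, even if it is an archive
ADD https://example.com/archive.tar.gz /tmp/
```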

`COPY` instruction

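This is similar to the ADD instruction, but it takes only local files and directories from the build context as the source and copies them into the image.
ADD vs COPY: COPY only copies local files, while ADD can additionally fetch remote URLs and unpack local tar archives.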
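Since this section has no example of its own, here is a minimal sketch. The file names settings.ini and src/*.py and the numeric owner 1000:1000 are assumptions for illustration.

```dockerfile
FROM alpine:3.19
WORKDIR /app
# Relative destination: resolved against WORKDIR, i.e. /app/config/
COPY settings.ini config/
# Wildcards and --chown work the same way as for ADD (Linux containers only)
COPY --chown=1000:1000 src/*.py /app/src/
```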

`ENTRYPOINT` instruction

This allows us to configure a container that will run as an executable.
This has two forms:

1. `ENTRYPOINT ["executable", "param1", "param2"]` (exec form): command line arguments to `docker run <image>` are appended after all elements of the exec-form `ENTRYPOINT` and override all elements specified using `CMD`.
2. `ENTRYPOINT command param1 param2` (shell form): prevents any `CMD` or `run` command line arguments from being used. The `ENTRYPOINT` is started as a subcommand of `/bin/sh -c`, which doesn't pass signals. This means that the executable will not be the container's `PID 1` and will not receive Unix signals, such as the `SIGTERM` sent by `docker stop`.

Only the last ENTRYPOINT instruction in the Dockerfile will have an effect
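The signal-handling caveat of the shell form can be worked around with exec, as in this sketch adapted from the official reference (the choice of top as the long-running process is just for demonstration):

```dockerfile
FROM ubuntu
# Plain shell form: /bin/sh -c is PID 1 and top runs as its child, so a
# "docker stop" SIGTERM does not reach top:
#   ENTRYPOINT top -b
# Starting the command with exec replaces the shell, making top PID 1:
ENTRYPOINT exec top -b
```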

How `CMD` and `ENTRYPOINT` interact

Both of these instructions define what command gets executed when running a container. The rules regarding interaction of these two are:

A Dockerfile should specify at least one of CMD or ENTRYPOINT
ENTRYPOINT should be defined when using the container as an executable
CMD should be used to define default arguments for an ENTRYPOINT or to execute an ad-hoc command in a container
CMD will be overridden when the container is run with alternative arguments

The table below shows which command is executed for each combination of CMD and ENTRYPOINT:

| Combinations | No `ENTRYPOINT` | `ENTRYPOINT exec_entry p1_entry` | `ENTRYPOINT ["exec_entry", "p1_entry"]` |
| --- | --- | --- | --- |
| No `CMD` | error, not allowed | `/bin/sh -c exec_entry p1_entry` | `exec_entry p1_entry` |
| `CMD ["exec_cmd", "p1_cmd"]` | `exec_cmd p1_cmd` | `/bin/sh -c exec_entry p1_entry` | `exec_entry p1_entry exec_cmd p1_cmd` |
| `CMD ["p1_cmd", "p2_cmd"]` | `p1_cmd p2_cmd` | `/bin/sh -c exec_entry p1_entry` | `exec_entry p1_entry p1_cmd p2_cmd` |
| `CMD exec_cmd p1_cmd` | `/bin/sh -c exec_cmd p1_cmd` | `/bin/sh -c exec_entry p1_entry` | `exec_entry p1_entry /bin/sh -c exec_cmd p1_cmd` |

Note: If CMD is defined in base image, setting ENTRYPOINT will reset CMD to empty. So, CMD must be redefined in current image to have a value
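The exec-form combination from the table can be sketched with the example used in the official reference, where top is the fixed executable and CMD supplies a default argument:

```dockerfile
FROM ubuntu
# Fixed executable
ENTRYPOINT ["top", "-b"]
# Default argument, replaced by anything passed after the image name
CMD ["-c"]
```

Here docker run <image> would run top -b -c, while docker run <image> -H would run top -b -H, because the run arguments replace CMD but are still appended to ENTRYPOINT.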

Resource: Docker CMD Vs Entrypoint Commands: What's The Difference?

`VOLUME` instruction

This creates a mount point with the specified name and marks it as holding externally mounted volumes from the native host or other containers.

The value of this instruction can be:
1. JSON array: VOLUME ["/var/log/"]
2. Plain string with one or more arguments: VOLUME /var/log /var/db

On Windows-based containers, the destination of a volume must be a non-existing or empty directory, on a drive other than C:
If any build steps change the data within the volume after it has been declared, those changes are discarded
The host directory is declared at container run time, because it is, by nature, host-dependent and a given host directory can't be guaranteed to be available on all hosts. To preserve image portability, we can't mount a host directory from within the Dockerfile; we must specify the mount point when we create or run the container.
Resource: Use Volumes
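A small sketch, following the example in the official reference, of declaring a volume after populating its directory (the /myvol path and greeting file are arbitrary names):

```dockerfile
FROM ubuntu
RUN mkdir /myvol
# Written before the VOLUME declaration, so it is part of the image
RUN echo "hello world" > /myvol/greeting
VOLUME /myvol
```

When a container is started from this image, docker run creates a new mount point at /myvol and copies the greeting file into the newly created volume.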

`USER` instruction

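This instruction sets the user name (or UID) and optionally the user group (or GID) to use when running the image and for any RUN, CMD, and ENTRYPOINT instructions that follow it in the Dockerfile.

Instruction format:
1. USER <user>[:<group>]
2. USER <UID>[:<GID>]

When a group or GID is specified, the user has only that group membership; any other configured group memberships are ignored
If the user doesn't have a primary group, the image (or the next instructions) runs with the root group
On Windows, the user must be created first with the net user command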

FROM microsoft/windowsservercore
# Create Windows user in the container
RUN net user /add patrick
# Set it for subsequent commands
USER patrick
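For Linux containers, a comparable sketch might look like the following. It assumes an Alpine base image, where addgroup and adduser come from BusyBox, and the user and group name app is made up.

```dockerfile
FROM alpine:3.19
# Create an unprivileged system group and user (BusyBox addgroup/adduser)
RUN addgroup -S app && adduser -S -G app app
# Subsequent RUN, CMD and ENTRYPOINT instructions run as app:app
USER app:app
RUN whoami
CMD ["id"]
```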

`WORKDIR` instruction

This sets the working directory for any RUN, CMD, ENTRYPOINT, COPY, and ADD instructions that follow it in the Dockerfile. If the directory doesn't exist, it will be created even if it's not used in any subsequent Dockerfile instruction.

It can be declared multiple times. If a relative path is provided, it is relative to the path of the previous WORKDIR instruction
It can resolve environment variables previously set with ENV
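Both behaviours can be seen in a short sketch based on the examples in the official reference; the DIRPATH variable and the logs directory are illustrative names.

```dockerfile
FROM busybox
ENV DIRPATH=/path
WORKDIR /a
WORKDIR b
WORKDIR c
# Relative paths stack on the previous WORKDIR: this prints /a/b/c
RUN pwd
# Environment variables set with ENV are resolved: this becomes /path/logs
WORKDIR $DIRPATH/logs
```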

`ARG` instruction

This instruction defines a variable that users can pass at build time to the builder, using the docker build command with the --build-arg <var>=<value> flag. A Dockerfile may include one or more ARG instructions.

FROM busybox
# user1 has a default value
ARG user1=someuser
# buildno has no default value
ARG buildno

The ARG variable is available from the line on which it is declared in the Dockerfile, not from where it is used on the command line or elsewhere. An ARG goes out of scope at the end of the build stage where it is defined.
Useful interaction between ARG and ENV:

FROM ubuntu
ARG CONT_IMG_VER
ENV CONT_IMG_VER=${CONT_IMG_VER:-v1.0.0}
RUN echo $CONT_IMG_VER

Predefined ARGs: HTTP_PROXY, http_proxy, HTTPS_PROXY, https_proxy, FTP_PROXY, ftp_proxy, NO_PROXY, and no_proxy
Automatic Platform ARGs (only available with the BuildKit backend): These are defined in the global scope, so they are not automatically available inside a build stage. To use one inside a stage, add ARG <arg_name> (without a value) to that stage.
- TARGETPLATFORM: platform of the build result
- TARGETOS: OS component of TARGETPLATFORM
- TARGETARCH: architecture component of TARGETPLATFORM
- TARGETVARIANT: variant component of TARGETPLATFORM
- BUILDPLATFORM: platform of the node performing build
- BUILDOS: OS component of BUILDPLATFORM
- BUILDARCH: architecture component of BUILDPLATFORM
- BUILDVARIANT: variant component of BUILDPLATFORM
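Exposing one of these automatic ARGs inside a stage can be sketched as below, following the example in the official reference (BuildKit is assumed to be enabled):

```dockerfile
FROM alpine
# Re-declare the automatic ARG without a value to bring it into this stage
ARG TARGETPLATFORM
RUN echo "I'm building for $TARGETPLATFORM"
```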

`ONBUILD` instruction

This adds a trigger instruction to the image, which would be executed at a later time, when the image is used as the base for another build. Any build instruction can be registered as a trigger. Format of the instruction: ONBUILD <instruction>. Example:

ONBUILD ADD . /app/src
ONBUILD RUN /usr/local/bin/python-build --dir /app/src

The way it works:

When it encounters an ONBUILD instruction, the builder adds a trigger to the metadata of the image being built
At the end of the build, a list of all triggers is stored in the image manifest under the key OnBuild; it can be inspected with the docker inspect command.
Later, the image may be used as the base for another build using the FROM instruction. As part of processing the FROM, the downstream builder looks for ONBUILD triggers and executes them in the same order they were registered.
Triggers are cleared from the final image after being executed (they are not inherited by "grand-children" builds)

Note: Chaining ONBUILD with ONBUILD is not allowed: ONBUILD ONBUILD

Note: The ONBUILD instruction may not trigger FROM or MAINTAINER instructions
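From the downstream side, the mechanism might look like this sketch. The image name my-python-base and the file main.py are hypothetical; the base image is assumed to have been built from a Dockerfile containing the two ONBUILD lines from the example earlier in this section.

```dockerfile
# Downstream Dockerfile built on the hypothetical my-python-base image
FROM my-python-base
# While processing FROM, the builder runs the registered triggers here, as if
#   ADD . /app/src
#   RUN /usr/local/bin/python-build --dir /app/src
# had been written at this point.
CMD ["python", "/app/src/main.py"]
```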

`STOPSIGNAL` instruction

This sets the system call signal that will be sent to the container to make it exit. It can be a valid unsigned number that matches a position in the kernel's syscall table (e.g. 9) or a signal name in the format SIGNAME (e.g. SIGKILL).
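A minimal sketch of the instruction in use. The nginx base image is chosen only because nginx treats SIGQUIT as a request for graceful shutdown.

```dockerfile
FROM nginx
# Ask "docker stop" to send SIGQUIT instead of the default SIGTERM
STOPSIGNAL SIGQUIT
```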

`HEALTHCHECK` instruction

This tells Docker how to test a container to check that it is still working. It can detect cases such as a web server that is stuck in an infinite loop and unable to handle new connections, even though the server process is still running. When a container has a healthcheck specified, it has a health status in addition to its normal status. There can be only one HEALTHCHECK instruction in a Dockerfile. If multiple are listed, only the last one takes effect. The health statuses can be:

starting: the initial status, while the container is starting up
healthy: set whenever a health check passes
unhealthy: set after a certain number of consecutive failures

The options that can appear before CMD are:

--interval=DURATION (default 30s) [how often the check is run]
--timeout=DURATION (default 30s) [how long to wait before a single check is considered failed]
--start-period=DURATION (default 0s) [initialization time for containers that need time to bootstrap]
--retries=N (default 3) [number of consecutive failures needed to consider the container unhealthy]

The command after the CMD keyword can be either a shell command (e.g. HEALTHCHECK CMD /bin/check-running) or an exec array (as with other instructions such as ENTRYPOINT). Its exit status indicates the health of the container: 0 for healthy and 1 for unhealthy. Example:

HEALTHCHECK --interval=5m --timeout=3s \
  CMD curl -f http://localhost/ || exit 1

To help debug failing probes, any output the command writes to stdout or stderr is stored in the health status and can be queried with the docker inspect command. When the health status of a container changes, a health_status event is generated with the new status.
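For completeness, here is a sketch of the exec-array form with every option spelled out. It assumes that curl is available in the image and that a web server answers on port 80 inside the container; the durations are arbitrary.

```dockerfile
FROM nginx
# Exec form of the check command; sh -c is used so that "|| exit 1" can map
# any curl failure to the "unhealthy" exit status
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD ["sh", "-c", "curl -f http://localhost/ || exit 1"]
# To disable a healthcheck inherited from the base image instead, use:
#   HEALTHCHECK NONE
```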

`SHELL` instruction

This instruction allows the default shell used for the shell form of commands to be overridden. The default shell on Linux is ["/bin/sh", "-c"] and on Windows is ["cmd", "/S", "/C"]. This instruction is particularly useful on Windows, where a user can choose between cmd, powershell, and sometimes sh. It can appear multiple times; each SHELL instruction overrides all previous ones and affects all subsequent instructions. Example:

SHELL ["/bin/sh", "-c"]
SHELL ["cmd", "/S", "/C"]
SHELL ["powershell", "-command"]
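How successive SHELL instructions affect the shell form of RUN can be sketched as follows, adapted from the example in the official reference (a Windows container host is assumed):

```dockerfile
FROM microsoft/windowsservercore

# Executed as: cmd /S /C echo default
RUN echo default

# Executed as: powershell -command Write-Host hello
SHELL ["powershell", "-command"]
RUN Write-Host hello

# Executed as: cmd /S /C echo hello
SHELL ["cmd", "/S", "/C"]
RUN echo hello
```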

External Implementation and BuildKit

Starting from version 18.09, Docker supports a new backend for building images, which is provided by the moby/buildkit project. BuildKit comes with some additional features. Its documentation can be found in the BuildKit repository listed under Reference.

Reference

Official Dockerfile reference - https://docs.docker.com/engine/reference/builder/
BuildKit - https://github.com/moby/buildkit


Dockerfile reference summary

Md. Hussainul Islam Sajib
