Docker Compose volumes, volume-only projects and init containers

Introduction

We had a question on the official Docker forum about a feature that was supported in Docker Compose v1 but not in v2. This made me wonder about two things: "Why would I need this?" and "Would I be able to find a workaround for Docker Compose v2?". I did find one, so let's see how you can write a compose file for Compose v1 that creates only a volume without actually running services, and how you can do the same with Compose v2.

We will also learn some related tricks about volumes, possible issues and how we can initialize volumes with data.

Should we even create a volume-only Compose project?
Volume-only Compose project with Compose v1
Volume-only Compose project with Compose v2
What's the problem with permissions?
Create a Dockerfile for the examples
Mounting a volume at a location which does not exist in the container
Mounting the volume over an existing folder
Multiple containers setting different owners
Volume init containers
Referring to external volumes in a Compose file
Data saved in an image
Conclusion

Should we even create a volume-only Compose project?

My first thought was "no". Why would I need that? I want to run a container with a volume, so somewhere I will need to create the containers. Why would I do it in another compose project? I still don't think I would do that, since I could just write a shell script to create an external volume which I can use in any project.

#!/usr/bin/env bash

docker volume create common_data

Or actually, I can just use a new volume name and Docker will automatically create it for me. At least if I want to create a default local volume without parameters. It wouldn't work with an NFS volume for example.
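
For a volume that does need parameters, such as the NFS case mentioned above, the shell script could look roughly like the following sketch. The function name create_nfs_volume, the server address and the export path are only assumptions for illustration; the options follow the pattern of the local volume driver.

```shell
#!/usr/bin/env bash

# Create a named volume backed by an NFS export, using the "local" driver.
# All three arguments are placeholders for this sketch.
create_nfs_volume() {
  local name="$1"        # volume name, e.g. common_data
  local server="$2"      # NFS server address (hypothetical)
  local export_path="$3" # exported directory on the server (hypothetical)

  docker volume create \
    --driver local \
    --opt type=nfs \
    --opt "o=addr=${server},rw" \
    --opt "device=:${export_path}" \
    "$name"
}

# Example call (the values are made up):
# create_nfs_volume common_data 192.168.1.10 /exports/common_data
```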

Depending on your needs you can add more parameters, but the point is that a simple shell script is usually enough. The problem is it requires bash or at least any Linux/macOS shell. On Windows you would need a Batch script or a PowerShell script, so doing it with a platform-independent solution could actually be a good idea.

It still doesn't explain why we would do it in the first place, only why we would do it with Docker Compose.

I called the volume "common_data". I can't think of an obvious example now, but let's say you need a common data store for multiple projects. Normally, in the case of local volumes, I would create a folder on the host, set proper permissions and bind mount the folder into multiple containers. That is easy on Linux, but again, Docker is supported on macOS and Windows too. On Windows, since it is not a Unix-like operating system, mounting a folder from the host is a little bit more tricky. Sometimes you just want to place your files on a Linux filesystem for performance reasons when you are using Linux containers. Creating a named local volume will do exactly that, although you need to copy the data to the volume. You could have other reasons to create a volume-only Docker Compose project, so let's see some examples of how you would do it in the following chapters. Instead of naming the volume "common_data", I will use the "volume_only_" prefix, but the concept is the same.

Volume-only Compose project with Compose v1

Compose v1 is not updated anymore and everyone should use Compose v2, but let's see how volume-only compose projects worked with Compose v1. All you had to do is create a file called "docker-compose.yml" containing only the volume definition and run docker-compose up -d. Of course, -d is optional and I will usually not use it, since I won't have many services that have to run in the background. On the other hand, I will set the project name and the compose file path, so I can have a single folder for all of my examples.

Let's create "compose.01.yml" with the following content:

volumes:
  data:

Then start the project:

docker-compose --project-name volume_only_01 -f compose.01.yml up

Note: In current Docker Desktop (v4.22.1) on macOS I used docker-compose-v1, but the default is v2. On Linux, docker-compose usually means you are using Compose v1. If you are not sure, check the compose version: docker-compose version.

The output is:

Creating volume "volume_only_01_data" with default driver
Attaching to

Now you can check the volume:

docker volume ls --filter name=volume_only_01_data

Volume-only Compose project with Compose v2

Docker Compose v2 will not create anything if the compose file has no service definition, so you need a fake service which does nothing. You don't need any real image either, so let's build a fake empty image too. Save the following file as "compose.02.yml":

volumes:
  data:

services:
  fake:
    network_mode: none
    build:
      dockerfile_inline: "FROM scratch"
    command: fake
    volumes:
      - data:/fake

This single yaml file will do the trick. First we define the volume, then create the services section, which can't be empty. Using

    network_mode: none

we can make sure docker will not create a network bridge for our fake service. Then we build an empty image:

    build:
      dockerfile_inline: "FROM scratch"

Since we used "FROM scratch", Docker will not pull any image, but instantly create an alias for "scratch" which would not be a valid image name otherwise, only in a Dockerfile. Obviously this image would never work, but we still need to define a command, otherwise Compose would fail to create the project because there is no command definition in the image or in the Compose file. Since it will never run, the command can be anything. I used "fake".

    command: fake

We also need to define the mount point of the volume, even if the container never runs, because volumes will be created only if at least one service refers to it.

    volumes:
      - data:/fake

Again, it doesn't matter what the mount point is, as long as it is not an existing, non-empty folder in the image, so I used "/fake".

Now you need to create the project, but the usual compose command would not work, because docker compose up runs containers by default, and we only have a fake container that could never run. Fortunately we can use the --no-start option.

docker compose --project-name volume_only_02 -f compose.02.yml up --no-start
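
If you don't want to keep the never-started fake container around, the whole step could be wrapped in a small helper like the sketch below. The function name create_volume_only_project is my own invention. It relies on the fact that "docker compose down" removes the containers of the project but keeps named volumes unless you pass --volumes.

```shell
#!/usr/bin/env bash

# Create the volumes of a volume-only project, remove the fake container
# again, then list the volumes that belong to the project.
create_volume_only_project() {
  local project="$1"      # compose project name
  local compose_file="$2" # path to the compose file

  # Creates the volume and the fake container without starting it.
  docker compose --project-name "$project" -f "$compose_file" up --no-start || return 1

  # Removes the fake container; named volumes stay because --volumes is not used.
  docker compose --project-name "$project" -f "$compose_file" down || return 1

  # Volume names are prefixed with the project name.
  docker volume ls --filter "name=${project}_"
}

# Example call:
# create_volume_only_project volume_only_02 compose.02.yml
```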

What's the problem with permissions?

One of the most important features of a volume is that Docker can automatically set the permissions on it to match the original folder's permissions if the mount point in the container is an existing directory and the volume is still empty.

If the mount point not only exists but is also a non-empty directory, Docker will copy its files to the volume before mounting it over the existing folder. If you don't have a service and the volume has just been created, or you have a fake service that mounts the volume at a new location rather than over an existing directory, nothing will be changed on the volume.

So if you want a common volume to be used by multiple containers in different compose projects, this is something you need to know, because it can lead to the following situation.

Depending on where you mount the volume in different containers, each container can override the permissions on the same volume until one of the containers writes data to the volume, so it is not empty anymore. The problem is that if the process in the container is not running as root, and another container changes the owner and makes the volume writable only by the owner before that process can start writing to it, then the process will not be able to write to it. Unless, of course, the process is running as the same user.

In the case of multiple containers in a single compose project, you could define one container as a dependency of another, but that wouldn't help either, since all containers are created before the first container starts. Even if it could help, that's not an option now, since we want a volume-only compose project and we mount the volume in multiple independent compose projects.
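
Since everything described above depends on whether the volume is still empty, it can be handy to check that from a script. The following is only a sketch: the function name volume_is_empty and the mount point /mnt/check are my assumptions, and it expects that /mnt/check does not exist in the bash:5.2 image, so mounting there should neither copy files nor change the owner.

```shell
#!/usr/bin/env bash

# Succeeds (exit code 0) if the volume contains no files, fails with 1 if it
# has content, and fails with 2 if the helper container could not be run.
# Note: Docker creates the volume if it does not exist yet.
volume_is_empty() {
  local volume="$1"
  local listing

  # Mount read-only at a path assumed to be missing from the image and
  # list everything, including hidden files.
  listing="$(docker run --rm -v "${volume}:/mnt/check:ro" bash:5.2 ls -A /mnt/check)" || return 2

  [ -z "$listing" ]
}

# Example call:
# volume_is_empty volume_only_02_data && echo "still empty"
```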

Create a Dockerfile for the examples

For easier testing, I created a Dockerfile to build an image that helps us test mounting different volumes and see how changing ownership affects whether our container works or not.

Create a file called "Dockerfile" with the below content:

FROM bash:5.2

# /mnt/data_auto_created will be automatically created when we use it as a mount point

RUN mkdir /mnt/data_empty_1 \
 && chown 1:1 /mnt/data_empty_1

RUN mkdir /mnt/data_empty_2 \
 && chown 2:2 /mnt/data_empty_2

RUN mkdir /mnt/data_non_empty \
 && chown 3:3 /mnt/data_non_empty \
 && echo "Volume test" > /mnt/data_non_empty/index.html \
 && chown 4:4 /mnt/data_non_empty/index.html

As the comment indicates, we will also test what happens when we mount a volume at a location which doesn't exist in the container.

Build the image:

docker build . -t localhost/volume_only

Mounting a volume at a location which does not exist in the container

Let's mount a volume at /mnt/data_auto_created in a test container. That folder does not exist in the container, so even if I change the ownership of the volume, Docker will not automatically "fix" it later, which is important for the rest of the examples.

docker run --rm -v volume_only_02_data:/mnt/data_auto_created bash:5.2 ls -la /mnt/data_auto_created

Output:

total 8
drwxr-xr-x    2 root     root          4096 Sep 17 08:43 .
drwxr-xr-x    1 root     root          4096 Sep 17 11:07 ..

Change the owner to UID 1 and GID 1:

docker run --rm -v volume_only_02_data:/mnt/data_auto_created bash:5.2 chown 1:1 /mnt/data_auto_created

Check the ownership again:

docker run --rm -v volume_only_02_data:/mnt/data_auto_created bash:5.2 ls -la /mnt/data_auto_created

Output:

total 8
drwxr-xr-x    2 bin      bin           4096 Sep 17 08:43 .
drwxr-xr-x    1 root     root          4096 Sep 17 11:14 ..
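
If you only care about the numeric owner, reading the "ls" output is more than you need. A small helper like the one below could print just the UID and GID of the volume root. The function name volume_owner is an assumption of mine, and it relies on the "stat" command available in the bash:5.2 image supporting the -c format option.

```shell
#!/usr/bin/env bash

# Print the numeric owner of the volume root as "UID:GID".
# The volume is mounted at a path that does not exist in the image,
# so mounting it should not change the ownership.
volume_owner() {
  local volume="$1"

  docker run --rm -v "${volume}:/mnt/data_auto_created" bash:5.2 \
    stat -c '%u:%g' /mnt/data_auto_created
}

# Example call:
# volume_owner volume_only_02_data
```

After the chown command above, I would expect this to report 1:1 for the volume.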

Mounting the volume over an existing folder

If I mount the volume over an existing folder, watch what happens. The "/opt" folder is empty in the bash container and owned by root, so let's use that for testing:

docker run --rm -v volume_only_02_data:/opt bash:5.2 ls -la /opt

Output:

total 8
drwxr-xr-x    2 root     root          4096 Aug  7 13:12 .
drwxr-xr-x    1 root     root          4096 Sep 17 11:20 ..

The ownership was changed again only because I mounted the empty volume over an existing folder.

Multiple containers setting different owners

In the next example we will see what happens when a container changes the owner before another container can write data to the volume. Save the following code in "compose.03.yml":

services:

  empty_1:
    image: localhost/volume_only
    user: 1:1
    command:
      - bash
      - -c
      - |
        set -eu -o pipefail

        echo "Running as $(whoami):$(getent group $(id -g) | cut -d: -f1) ($(id -u):$(id -g))"

        echo "Checking owner of /mnt/data_empty_1: "
        ls -ld /mnt/data_empty_1
        ls -ldn /mnt/data_empty_1

        echo "ready" > /mnt/data_empty_1/status
        echo "Status is successfully written to the filesystem"

  empty_2:
    image: localhost/volume_only
    depends_on:
      - empty_1
    user: 2:2
    command:
      - bash
      - -c
      - |
        echo "Running as $(whoami):$(getent group $(id -g) | cut -d: -f1) ($(id -u):$(id -g))"
        echo "Checking owner of /mnt/data_empty_2: "
        ls -ld /mnt/data_empty_2
        ls -ldn /mnt/data_empty_2

The above services will show how the containers would run without volumes.

docker compose --project-name volume_only_03 -f compose.03.yml up

Output:

[+] Running 3/3
 ✔ Network volume_only_03_default      Created                                                           0.1s
 ✔ Container volume_only_03-empty_1-1  Created                                                           0.0s
 ✔ Container volume_only_03-empty_2-1  Created                                                           0.0s
Attaching to volume_only_03-empty_1-1, volume_only_03-empty_2-1
volume_only_03-empty_1-1  | Running as bin:bin (1:1)
volume_only_03-empty_1-1  | Checking owner of /mnt/data_empty_1:
volume_only_03-empty_1-1  | drwxr-xr-x    2 bin      bin           4096 Sep 17 11:05 /mnt/data_empty_1
volume_only_03-empty_1-1  | drwxr-xr-x    2 1        1             4096 Sep 17 11:05 /mnt/data_empty_1
volume_only_03-empty_1-1  | Status is successfully written to the filesystem
volume_only_03-empty_1-1 exited with code 0
volume_only_03-empty_2-1  | Running as daemon:daemon (2:2)
volume_only_03-empty_2-1  | Checking owner of /mnt/data_empty_2:
volume_only_03-empty_2-1  | drwxr-xr-x    2 daemon   daemon        4096 Sep 17 11:05 /mnt/data_empty_2
volume_only_03-empty_2-1  | drwxr-xr-x    2 2        2             4096 Sep 17 11:05 /mnt/data_empty_2
volume_only_03-empty_2-1 exited with code 0

As you can see, each container's data folder is owned by the user that runs the process in the container. Now let's use a common volume in the file called "compose.04.yml":

volumes:
  data:

services:

  empty_1:
    image: localhost/volume_only
    user: 1:1
    volumes:
      - data:/mnt/data_empty_1
    command:
      - bash
      - -c
      - |
        set -eu -o pipefail

        echo "Running as $(whoami):$(getent group $(id -g) | cut -d: -f1) ($(id -u):$(id -g))"

        echo "Checking owner of /mnt/data_empty_1: "
        ls -ld /mnt/data_empty_1
        ls -ldn /mnt/data_empty_1

        echo "ready" > /mnt/data_empty_1/status
        echo "Status is successfully written to the filesystem"

  empty_2:
    image: localhost/volume_only
    depends_on:
      - empty_1
    user: 2:2
    volumes:
      - data:/mnt/data_empty_2
    command:
      - bash
      - -c
      - |
        echo "Running as $(whoami):$(getent group $(id -g) | cut -d: -f1) ($(id -u):$(id -g))"
        echo "Checking owner of /mnt/data_empty_2: "
        ls -ld /mnt/data_empty_2
        ls -ldn /mnt/data_empty_2

Run compose:

docker compose --project-name volume_only_04 -f compose.04.yml up

The output:

Attaching to volume_only_04-empty_1-1, volume_only_04-empty_2-1
volume_only_04-empty_1-1  | Running as bin:bin (1:1)
volume_only_04-empty_1-1  | Checking owner of /mnt/data_empty_1:
volume_only_04-empty_1-1  | drwxr-xr-x    2 daemon   daemon        4096 Sep 17 11:05 /mnt/data_empty_1
volume_only_04-empty_1-1  | drwxr-xr-x    2 2        2             4096 Sep 17 11:05 /mnt/data_empty_1
volume_only_04-empty_1-1  | bash: line 9: /mnt/data_empty_1/status: Permission denied
volume_only_04-empty_1-1 exited with code 1
volume_only_04-empty_2-1  | Running as daemon:daemon (2:2)
volume_only_04-empty_2-1  | Checking owner of /mnt/data_empty_2:
volume_only_04-empty_2-1  | drwxr-xr-x    2 daemon   daemon        4096 Sep 17 11:05 /mnt/data_empty_2
volume_only_04-empty_2-1  | drwxr-xr-x    2 2        2             4096 Sep 17 11:05 /mnt/data_empty_2
volume_only_04-empty_2-1 exited with code 0

So even though the second container started later, all containers were created before starting the first.

Because of that, mounting the volume in the second container changed the owner before the shell script could even start in the first container, so the first container didn't have permission to write its status to the filesystem. Even if you run two separate compose projects to avoid creating all the containers too early, the initialization can take time, so the owner could still be changed before the first write to the volume.

Volume init containers

The only way to prevent containers from changing a volume just by mounting it is to write data to the volume before any other container mounts it, using a single container that initializes the volume. Let's use the following compose file named "compose.05.yml".

volumes:
  data_empty_1:
  data_empty_2:
  data_non_empty:

services:

  init_1:
    image: localhost/volume_only
    volumes:
      - data_empty_1:/mnt/data_empty_1
    command:
      - bash
      - -c
      - |
        touch /mnt/data_empty_1/.placeholder

  init_2:
    image: localhost/volume_only
    volumes:
      - data_empty_2:/mnt/data_empty_2
    command:
      - bash
      - -c
      - |
        mkdir /mnt/data_empty_2/files

  init_3:
    image: localhost/volume_only
    volumes:
      - data_non_empty:/mnt/data_non_empty

The above example demonstrates three ways to initialize a volume with content before other containers use it. Once the volumes are not empty, no container can change them just by mounting them, although the processes in the containers can still write data to the volumes after the containers have started.

In the first container we create a placeholder file, which other containers can remove later. That is easy to handle if you are the author of the image. If not, you may need a new entrypoint that removes the file when necessary. In most cases it will not matter, though, unless the application reads every file on the volume without filtering for specific file types or folders.

If the placeholder file is not an option, but you can configure the application to use a custom folder, you can create a new folder on the volume. This is what the second init container does.

The third init container mounts the volume over a non-empty folder of the image, so Docker copies the existing content and we don't have to create anything on the volume ourselves.

Let's start the project and create the volumes:

docker compose --project-name volume_only_05 -f compose.05.yml up

Now all you have to do is make sure that the volume init project is started before any other container that uses these volumes.
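
One way to enforce that order is a small wrapper script. The sketch below is only an idea: the function name start_with_init and the names "my_app" and "compose.app.yml" in the example call are hypothetical. It uses the fact that "docker compose up" in the foreground returns when all containers of the project have exited. It does not check the exit codes of the individual init containers.

```shell
#!/usr/bin/env bash

# Run the volume init project first, then start the project that uses the volumes.
start_with_init() {
  local init_project="$1" init_file="$2"
  local app_project="$3" app_file="$4"

  # Runs in the foreground and returns once every init container has exited.
  docker compose --project-name "$init_project" -f "$init_file" up || return 1

  # Only then start the project that mounts the initialized volumes.
  docker compose --project-name "$app_project" -f "$app_file" up -d
}

# Example call (the application project is made up):
# start_with_init volume_only_05 compose.05.yml my_app compose.app.yml
```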

Referring to external volumes in a Compose file

Now let's create a compose project that can use these volumes. For the sake of simplicity (if it can be called simple...) I will use one project for multiple containers again. Save the following yaml as "compose.06.yml".

volumes:
  data_1:
    external: true
    name: volume_only_05_data_empty_1
  data_2:
    external: true
    name: volume_only_05_data_empty_2
  data_3:
    external: true
    name: volume_only_05_data_non_empty

x-common-params: &common-params
  image: httpd:2.4
  ports:
    - 80

services:

  web_1:
    <<: *common-params
    volumes:
      - data_1:/usr/local/apache2/htdocs

  web_2:
    <<: *common-params
    volumes:
      - data_2:/usr/local/apache2/htdocs

  web_3:
    <<: *common-params
    volumes:
      - data_3:/usr/local/apache2/htdocs

We needed to define the volumes as external, so we can use the volumes created by another project. I also used a special yaml syntax, an anchor ("&common-params") together with the merge key ("<<"), for defining common parameters. It is just a bonus, because it is useful.

x-common-params: &common-params
  image: httpd:2.4
  ports:
    - 80

I defined a single port instead of port mapping so Docker will automatically forward free ports from the host to the containers. This way I don't have to know which port you are already using. I can refer to this yaml block from the services, and it will be merged with the rest of the parameters.

  web_1:
    <<: *common-params
    volumes:
      - data_1:/usr/local/apache2/htdocs

Start the project:

docker compose --project-name volume_only_06 -f compose.06.yml up -d

Yes, this is the only example where I needed to run the containers in detached mode since they will run webservers.

As the host ports are automatically generated, the following bash script will use curl to access the websites and show the contents of the volumes. Let's name the script file as "curl-helper.sh":

#!/usr/bin/env bash

# List the services of the project.
services="$(docker compose --project-name volume_only_06 -f compose.06.yml ps --services)"

for service in $services; do
  container_id="$(docker compose --project-name volume_only_06 -f compose.06.yml ps --quiet "$service")"
  # Host port that Docker mapped to container port 80/tcp.
  port="$(docker container inspect "$container_id" --format '{{ (index (index .NetworkSettings.Ports "80/tcp") 0).HostPort }}')"

  echo -e "\n\n$service\n\n"
  curl "localhost:$port"
done

Make it executable:

chmod +x curl-helper.sh

And run it:

./curl-helper.sh

The output:

web_1


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<html>
 <head>
  <title>Index of /</title>
 </head>
 <body>
<h1>Index of /</h1>
<ul><li><a href=".placeholder"> .placeholder</a></li>
</ul>
</body></html>


web_2


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<html>
 <head>
  <title>Index of /</title>
 </head>
 <body>
<h1>Index of /</h1>
<ul><li><a href="files/"> files/</a></li>
</ul>
</body></html>


web_3


Volume test

So as you can see, the first container has the placeholder file, the second has the custom folder and the third has an actual website. Of course, volumes are usually not for websites but for data, but it was an easy way to generate files that we can check from the command line.
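
As a side note, the host port could probably be looked up without inspecting the container, because "docker compose port" prints the public address of a service port, for example "0.0.0.0:32768". The following sketch shows that variant. The function names port_from_address and service_host_port are my own, not part of Docker.

```shell
#!/usr/bin/env bash

# Extract the port from an address such as "0.0.0.0:32768" or "[::]:32768"
# by dropping everything up to and including the last colon.
port_from_address() {
  local address="$1"
  echo "${address##*:}"
}

# Print the host port that is mapped to a container port of a service.
service_host_port() {
  local project="$1" compose_file="$2" service="$3" container_port="$4"
  local address

  address="$(docker compose --project-name "$project" -f "$compose_file" port "$service" "$container_port")" || return 1
  port_from_address "$address"
}

# Example call:
# curl "localhost:$(service_host_port volume_only_06 compose.06.yml web_1 80)"
```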

Data saved in an image

I added this chapter before publishing this post because I think it's worth mentioning, however it is usually not a good idea.

For a demo application you may want to save pre-generated data in an image, so it can immediately show what the application would look like after using it for a while. If you do that and then delete all data from the admin web interface, the volume becomes empty again, so when you recreate the container mounting the same volume, or another container mounts it, the original data will be copied to the volume again.

For a demo application it can be useful, but in a production environment it would most likely be a bug, so make sure you don't do it.

Conclusion

Usually we don't create volume-only compose projects and since it became even harder in Compose v2, I wouldn't recommend it in most cases. If you feel that in your case it makes sense, you can do it also with Compose v2. Make sure you set the permissions properly and start your compose projects and containers in the right order.

In what cases do you think volume-only compose projects can be useful? Please share your ideas in the comment section below.