Dockerize CUDA-Accelerated Applications

afifaniks

Afif Al Mamun

Posted on March 3, 2023


Before we start

This guide assumes the reader is already familiar with Docker, PyTorch, CUDA, and related tooling. It will not explain how and why things work; instead, it describes how to get particular things done.

Abstract

Dockerizing applications has been the norm in the software industry for a while now. Everything nowadays ships as a container, and almost every developer knows how to build one. However, if your application requires GPU acceleration (e.g., AI/ML applications), containerizing it becomes slightly different: you have to make sure your Docker container can harness the CUDA cores in your machine. In this post, we will see how to do that.

Prerequisites

You have Docker installed. Confirm this by executing the following command and observing similar output:

$ docker -v
Docker version 20.10.21, build baeda1f

You have NVIDIA GPU drivers installed and set up properly in your system. You can verify this by running:

$ nvidia-smi
Wed Feb 22 12:55:05 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.65       Driver Version: 527.56       CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
| N/A   39C    P8     9W /  30W |      0MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A        21      G   /Xwayland                       N/A      |
|    0   N/A  N/A        34      G   /Xwayland                       N/A      |
|    0   N/A  N/A        45      G   /Xwayland                       N/A      |
+-----------------------------------------------------------------------------+

If any of the above steps does not produce the expected output, stop here and install the required drivers before proceeding to the next sections of this post.

You also need to know which CUDA version your system supports. For instance, nvidia-smi reports my CUDA version as 12.0, so I can use CUDA container images up to version 12.0. Use whatever your targeted system supports and whatever meets your requirements.
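One more thing worth checking: Docker's --gpus flag relies on the NVIDIA Container Toolkit being set up on the host. A quick sanity check (a minimal sketch, assuming the toolkit is installed and the image tag below is still published) is to run nvidia-smi inside a CUDA base image:

$ docker run --rm --gpus all nvidia/cuda:11.3.0-base-ubuntu20.04 nvidia-smi

If this prints the same table as above, containers on this machine can see the GPU.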

Example

Let’s assume we have a simple project that uses PyTorch and CUDA, which we want to dockerize. The app is very straightforward, and the project tree is as follows:

app
├── main.py
├── requirements.txt
└── Dockerfile

We will go through each of the files for a better illustration.

main.py

import torch

if torch.cuda.is_available():
    print(f"Using CUDA. Version: {torch.version.cuda}")
else:
    printf("CUDA is not available")

The main script simply imports the torch module and checks whether it can use CUDA.
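If you also want the container to actually exercise the GPU rather than just report availability, a slightly extended variant (purely illustrative, not part of the project above) could run a small matrix multiplication on the device:

import torch

if torch.cuda.is_available():
    device = torch.device("cuda")
    print(f"Using CUDA. Version: {torch.version.cuda}")
    print(f"Device: {torch.cuda.get_device_name(device)}")
    # Tiny matrix multiplication on the GPU as a smoke test
    x = torch.rand(1000, 1000, device=device)
    y = x @ x
    print(f"Result tensor lives on: {y.device}")
else:
    print("CUDA is not available")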

requirements.txt

In this test app, we are going to use a specific version of torch, 1.12.1, built against CUDA 11.3. Hence, the requirements.txt file:

torch==1.12.1+cu113

You can pick any version as per your requirements from: https://pytorch.org/get-started/previous-versions/
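Note that the +cu113 builds are not hosted on PyPI, so installing them requires PyTorch's extra package index. If you want to try the same pinned build locally before containerizing, the install command would use the same flag we pass in the Dockerfile below:

$ pip install torch==1.12.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113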

Dockerfile

With all the previous steps done, you are ready to write the Dockerfile. In your project directory, add a new Dockerfile:

FROM nvidia/cuda:11.3.0-base-ubuntu20.04

RUN apt-get update &&  \
    apt-get install --no-install-recommends -y python3-pip python3-dev ffmpeg libsm6 libxext6 gcc g++ && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app

COPY requirements.txt ./

RUN pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu113 --verbose

COPY . ./

ENTRYPOINT ["python3", "main.py"]

Explanation

In the first layer, we use an official NVIDIA CUDA image based on Ubuntu 20.04. This image communicates automatically with your machine’s GPU driver and gives your application the capability to use GPU acceleration. Note that we are using an image based on CUDA 11.3, matching the torch build we pinned in our requirements file. Be careful when selecting the version you want to use; it is a good idea to keep it the same across the application.
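As a side note, the nvidia/cuda images come in several flavors (base, runtime, devel). The base flavor used here is the smallest and is sufficient for this app because the PyTorch wheel bundles its own CUDA runtime libraries. If you ever need to compile CUDA extensions inside the image, you would switch to a devel tag instead, for example (assuming the tag is still published on Docker Hub):

FROM nvidia/cuda:11.3.0-devel-ubuntu20.04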

In the second layer, we install a few recommended packages. Not all of these packages are necessary for this test application, but they are often needed in a production app.

Then we set the working directory and copy the requirements.txt file into it. We could copy the whole project at this stage, but it is good practice to copy the requirements file first and install the dependencies, because we don’t want to re-install all those packages every time we change our source code. A requirements file changes far less frequently during the development lifecycle than the source code does, so the cached layers will save us a lot of time when rebuilding the image in the future.

In the subsequent layers, we install the requirements, copy the source code, and run the application.
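Since COPY . ./ copies everything in the build context, it can also help to add a .dockerignore file (an illustrative sketch; adjust it to your project) so local environments and caches neither end up in the image nor invalidate the cached layers:

__pycache__/
*.pyc
.venv/
.git/

This is optional for the tiny example here, but it keeps the image lean in real projects.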

Build Image & Run Container

In the final step, we build the Docker image and run it in a container. To build the image, we use the following command in the working directory:

$ docker build --rm -t image_name .

After the image is built, we can run the app in a container:

$ docker run --gpus all image_name
Using CUDA. Version: 11.3

The --gpus all flag exposes all available GPUs to the container and lets it use them. If everything goes right, you should see the console output shown on the second line above.

If you have multiple GPUs and want to allow only specific device(s) to be used by your container, you can use the device parameter.

$ docker run --gpus device={DEVICE_ID} image_name

The above command exposes only the specified GPU device to your container. If you want to allow multiple GPUs to be accessed by a container, you can list their IDs; for instance, the following exposes the first and second available GPUs:

$ docker run --gpus '"device=0,1"' image_name



