Jupyter Notebooks in Docker

hassan_aftab

Hassan Aftab

Posted on November 29, 2024

Why use Docker for Jupyter Notebooks?

Docker provides an efficient and reproducible environment for running Jupyter Notebooks. With Docker, you can create isolated containers so that the dependencies and configuration of your notebooks stay consistent across different systems. This is particularly useful for data science projects, where managing dependencies and ensuring reproducibility are crucial.

In this article, we will cover how to:

  1. Set up Docker for Jupyter Notebooks.
  2. Use pre-built Jupyter Docker images.
  3. Customize your Jupyter Notebook environment with a custom Dockerfile.
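  4. Manage persistent data storage for notebooks.
  5. Expose the notebook server on a network and combine it with other services using Docker Compose.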

1. Setting up Docker

Install Docker

Before using Docker, ensure it is installed on your system. To verify the installation:

docker --version

Verify Docker Service

Ensure the Docker service is running:

sudo systemctl start docker  # Linux
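
The two checks above can be combined into one small shell helper. This is a minimal sketch rather than part of the original setup: the function name docker_status is hypothetical, and it assumes that docker info exits with a non-zero status when the daemon cannot be reached, as current Docker releases do.

```shell
# docker_status (hypothetical helper): print one word describing the local Docker setup.
docker_status() {
    if ! command -v docker >/dev/null 2>&1; then
        # The docker CLI is not on PATH at all.
        echo "missing"
    elif docker info >/dev/null 2>&1; then
        # The CLI was able to talk to the daemon.
        echo "running"
    else
        # The CLI exists, but the daemon did not answer (or access was denied).
        echo "stopped"
    fi
}
```

Calling docker_status prints missing, running, or stopped. A result of stopped on Linux suggests running the systemctl command above.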

2. Using Pre-Built Jupyter Docker Images

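Project Jupyter provides pre-built Docker images (the Jupyter Docker Stacks) in various configurations. These images come with pre-installed packages tailored for data science, machine learning, and scientific computing. Newer builds are published on Quay.io as quay.io/jupyter/<image-name>; the jupyter/<image-name> names on Docker Hub used in this article still pull but no longer receive updates.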

Available Images

Some popular Jupyter Docker images:

  • jupyter/base-notebook:
    • Minimal Jupyter Notebook environment.
  • jupyter/scipy-notebook:
    • Includes scientific computing libraries like NumPy, SciPy, and pandas.
  • jupyter/tensorflow-notebook:
    • Includes TensorFlow and Keras for machine learning.
  • jupyter/r-notebook:
    • Supports R in Jupyter.

Running a Jupyter Notebook Container

To start a Jupyter Notebook using the scipy-notebook image:

docker run -p 8888:8888 jupyter/scipy-notebook

Once the container starts, you will see a URL with a token in the logs, such as:

http://127.0.0.1:8888/?token=<token>

Copy this URL into your browser to access the Jupyter Notebook interface.
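
If the startup output has scrolled away, the URL can be recovered from the container logs. The sketch below is illustrative only: extract_token is a hypothetical helper, and it assumes the token consists of letters and digits.

```shell
# extract_token (hypothetical helper): print the token from a Jupyter login URL.
# Input without a token= query parameter produces no output.
extract_token() {
    printf '%s\n' "$1" | sed -n 's/.*[?&]token=\([A-Za-z0-9]*\).*/\1/p'
}

extract_token 'http://127.0.0.1:8888/?token=abc123'   # prints abc123
```

To find the URL in the first place, something like docker logs my-notebook 2>&1 | grep 'token=' should work, assuming the container was started with --name my-notebook (a name the command above does not set).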

Running Jupyter with a Bind Mount

To persist your work:

docker run -p 8888:8888 -v "$(pwd)":/home/jovyan/work jupyter/scipy-notebook

This command mounts your current directory at /home/jovyan/work inside the container, so notebooks saved there are written to your local disk. The quotes around $(pwd) keep the command working when the path contains spaces.


3. Customizing Your Jupyter Environment

If the pre-built images don’t meet your needs, you can create a custom Docker image with your own dependencies and configurations.

Creating a Custom Dockerfile

  1. Create a Dockerfile:
   FROM jupyter/scipy-notebook

   # Install additional Python packages
   # (scipy-notebook already ships matplotlib and seaborn; substitute the packages you need)
   RUN pip install --no-cache-dir matplotlib seaborn

   # Set a custom working directory
   WORKDIR /home/jovyan/my-project
  2. Build the Docker image:
   docker build -t my-custom-jupyter .
  3. Run the container:
   docker run -p 8888:8888 -v "$(pwd)":/home/jovyan/work my-custom-jupyter
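
After building, it can be worth confirming that the packages really are importable inside the image. The following is a sketch under stated assumptions: check_import and the DRY_RUN switch are hypothetical names introduced here, and python is assumed to be on the image's PATH, as it is in the Jupyter Docker Stacks images.

```shell
# check_import (hypothetical helper): verify that a Python module imports inside an image.
# With DRY_RUN=1 the docker command is printed (without shell quoting) instead of executed.
check_import() {
    image="$1"
    module="$2"
    # Reuse the positional parameters to hold the command, one word per parameter.
    set -- docker run --rm "$image" python -c "import $module"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

For example, check_import my-custom-jupyter seaborn && echo "seaborn is available" runs a throwaway container and reports success only if the import works.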

Adding Conda Environments

To include a Conda environment, modify the Dockerfile:

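FROM jupyter/scipy-notebook

# Create a Conda environment (--yes answers the confirmation prompt, which cannot be
# answered during a build) and activate it in interactive bash shells
RUN conda create --yes -n myenv python=3.9 && \
    echo "source activate myenv" >> ~/.bashrc

# Install packages in the new environment
RUN conda install --yes -n myenv pandas matplotlib

Note that the echo line appends to ~/.bashrc rather than overwriting it, and that it only affects terminal sessions. To select the environment as a notebook kernel, it also needs ipykernel installed and registered.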

4. Persistent Data Storage

By default, any data or notebooks created inside a Docker container live in the container's own writable layer and are lost when the container is removed. To avoid this, you can use Docker volumes or bind mounts.

Using Docker Volumes

Volumes are managed by Docker and provide a way to persist data independently of the container lifecycle:

docker volume create jupyter-data

docker run -p 8888:8888 -v jupyter-data:/home/jovyan/work jupyter/scipy-notebook

The jupyter-data volume will persist your notebooks and files.

Using Bind Mounts

Bind mounts map a local directory to a directory inside the container:

docker run -p 8888:8888 -v "$(pwd)":/home/jovyan/work jupyter/scipy-notebook

This maps the current directory ($(pwd)) to /home/jovyan/work in the container.
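
Both storage options use the same -v SOURCE:TARGET flag, and Docker tells them apart by the source: an absolute host path gives a bind mount, while a bare name gives a named volume. The sketch below mirrors that rule; mount_kind is an illustrative helper, not a Docker command, and it deliberately ignores relative paths.

```shell
# mount_kind (illustrative helper): classify the SOURCE part of a -v SOURCE:TARGET argument.
mount_kind() {
    case "$1" in
        /*) echo "bind mount" ;;    # absolute path on the host
        *)  echo "named volume" ;;  # bare name, storage managed by Docker
    esac
}

mount_kind "$(pwd)"       # prints: bind mount
mount_kind jupyter-data   # prints: named volume
```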


5. Advanced Usage

Networking

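To make the notebook server reachable from other machines on your network, publish the port on all host interfaces:

docker run -p 0.0.0.0:8888:8888 jupyter/scipy-notebook

Note that docker run's own --ip flag assigns the container an IP address on a user-defined network; it does not control where Jupyter listens, so it is not needed here. Share the URL with the token, replacing 127.0.0.1 with the host's address. Anyone who has the token gets full access to the server, so share it only with people you trust.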
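
The URL printed in the logs uses 127.0.0.1, which only works on the host itself, so the address has to be swapped before the link is passed on. A minimal sketch of that rewrite follows; share_url is a hypothetical helper and 192.168.1.20 is a placeholder for the host's real address.

```shell
# share_url (hypothetical helper): swap the loopback address in a Jupyter URL
# for an address that other machines can reach.
share_url() {
    host="$1"
    url="$2"
    printf '%s\n' "$url" | sed "s#//127\.0\.0\.1#//$host#"
}

share_url 192.168.1.20 'http://127.0.0.1:8888/?token=<token>'
# prints http://192.168.1.20:8888/?token=<token>
```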

Using Docker Compose

For more complex setups, use Docker Compose to manage multiple services (e.g., Jupyter + a database):

version: '3'
services:
  jupyter:
    image: jupyter/scipy-notebook
    ports:
      - "8888:8888"
    volumes:
      - ./notebooks:/home/jovyan/work
  postgres:
    image: postgres
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: password

Start the services (with the legacy standalone binary, the command is docker-compose up instead):

docker compose up
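
Inside the Compose network, services reach each other by service name, so a notebook would connect to the database at host postgres rather than localhost. The sketch below assembles a connection URL from the values in the file above. pg_url is a hypothetical helper, and it relies on the postgres image's default of naming the database after POSTGRES_USER when POSTGRES_DB is not set.

```shell
# pg_url (hypothetical helper): build a PostgreSQL connection URL.
# Arguments: user, password, host, and an optional database name (defaults to the user name).
pg_url() {
    user="$1"
    password="$2"
    host="$3"
    database="${4:-$user}"
    printf 'postgresql://%s:%s@%s:5432/%s\n' "$user" "$password" "$host" "$database"
}

pg_url user password postgres   # prints postgresql://user:password@postgres:5432/user
```

Using this URL from a notebook also requires a PostgreSQL driver for Python, which is assumed here to need installing separately.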

Conclusion

Docker is an excellent tool for running and managing Jupyter Notebooks in a reproducible and isolated environment. Whether you use pre-built images or create custom ones, Docker simplifies dependency management and ensures consistent environments for your projects. By leveraging persistent storage and tools like Docker Compose, you can scale your Jupyter Notebook workflows to handle complex, multi-container setups.
