The Ollama Docker Compose Setup with WebUI and Remote Access via Cloudflare

ajeetraina

Ajeet Singh Raina

Posted on May 26, 2024

Want to run AI models locally and reach them remotely through a browser-based interface? This guide walks through a Docker Compose setup that combines Ollama, Open WebUI (formerly Ollama WebUI), and a Cloudflare Tunnel for secure remote access.

Prerequisites:

  • Supported NVIDIA GPU (for efficient model inference)
  • NVIDIA Container Toolkit (to manage GPU resources)
  • Docker Compose (to orchestrate containerized services)
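
Before going further, it may help to confirm that these prerequisites are actually in place. The following is a minimal shell sketch, not part of the original setup: the helper names have and check_prereqs are made up for illustration, and it assumes nvidia-smi was installed together with the NVIDIA driver.

```shell
# Hypothetical helpers for illustration (not from the original post).

# have CMD: succeed if CMD is found on PATH.
have() {
  command -v "$1" >/dev/null 2>&1
}

# check_prereqs: print one "ok:" or "missing:" line per required tool.
check_prereqs() {
  for tool in docker nvidia-smi; do
    if have "$tool"; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
    fi
  done
  # Compose v2 ships as a Docker CLI plugin, so it is probed through docker.
  if have docker && docker compose version >/dev/null 2>&1; then
    echo "ok: docker compose"
  else
    echo "missing: docker compose"
  fi
}
```

Running check_prereqs prints three status lines. To check GPU passthrough into containers as well, the sample workload suggested for the NVIDIA Container Toolkit, docker run --rm --gpus all ubuntu nvidia-smi, is one option.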

Understanding the Services:

  • webui (ghcr.io/open-webui/open-webui:main): This acts as the web interface, allowing you to interact with your Ollama AI models visually.
  • ollama (Optional - ollama/ollama): This is the AI model server itself. It can leverage your NVIDIA GPU for faster inference tasks.
  • tunnel (cloudflare/cloudflared:latest): This service establishes a secure tunnel to your web UI via Cloudflare, enabling safe remote access.

Volumes and Environment Variables:

  • Two volumes, ollama and open-webui, are defined to store data persistently across container restarts. This ensures your models and configurations remain intact.
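  • The crucial environment variable is OLLAMA_BASE_URL. It must point to a URL at which the webui container can reach Ollama. When Ollama runs as the ollama service in the same Compose project, that is http://ollama:11434. If Ollama instead runs directly on your Docker host, you can use host.docker.internal as the address (on Linux this name usually has to be mapped to host-gateway with extra_hosts).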
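
One way to check the value before relying on it is to probe the Ollama API directly. The sketch below is illustrative only: ollama_base_url and probe_ollama are hypothetical helper names, 11434 is the Ollama port used throughout this guide, and the probe assumes curl is installed and uses Ollama's GET /api/version endpoint.

```shell
# Hypothetical helpers for illustration (not from the original post).

# ollama_base_url HOST [PORT]: print the base URL for an Ollama server.
ollama_base_url() {
  printf 'http://%s:%s\n' "$1" "${2:-11434}"
}

# probe_ollama BASE_URL: fetch the server version; fails if unreachable.
probe_ollama() {
  curl -fsS --max-time 5 "$1/api/version"
}
```

From the Docker host, with port 11434 published as in this guide, probe_ollama "$(ollama_base_url localhost)" should return a small JSON document once the stack is up. Inside the Compose network the host part would be ollama instead of localhost.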

Deployment and Access:

  • Deployment: Execute docker compose up -d to start all services in detached mode, running them in the background.
  • Local Access: If you just need to access the web UI locally, simply navigate to http://localhost:8080 in your web browser.
  • Remote Access: To access your AI models remotely, find the Cloudflare Tunnel URL printed in the tunnel container's logs; docker compose logs tunnel shows it. With this configuration cloudflared starts a quick tunnel, so the URL is a randomly generated trycloudflare.com address. Anyone who has the URL can reach the web UI from anywhere with an internet connection, so share it carefully.
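
Picking the URL out of the log output by eye can be tedious, so a small filter may help. This is a sketch under the assumption that the tunnel is a Cloudflare quick tunnel whose public hostname ends in trycloudflare.com; extract_tunnel_url is a hypothetical helper name.

```shell
# Hypothetical helper for illustration (not from the original post).

# extract_tunnel_url: read log text on stdin and print the first
# https://<name>.trycloudflare.com URL found; exit non-zero if none.
extract_tunnel_url() {
  grep -Eo 'https://[a-z0-9-]+\.trycloudflare\.com' | head -n 1 | grep .
}
```

A possible invocation is docker compose logs tunnel 2>&1 | extract_tunnel_url. The trailing grep . only exists so that the function fails when no URL has been logged yet.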

Benefits:

  • Simplified AI Model Management: Easily interact with your AI models through the Open WebUI interface.
  • Remote Accessibility: Securely access your models from any location with a web browser thanks to Cloudflare's tunneling capabilities.
  • GPU Acceleration (Optional): Leverage your NVIDIA GPU for faster model inference, speeding up tasks.

Getting Started

  • Install Docker


curl -sSL https://get.docker.com/ | sh



Writing a Docker Compose file



services:

  # Web interface (Open WebUI) for chatting with models served by Ollama
  webui:
    image: ghcr.io/open-webui/open-webui:main
    expose:
      - 8080/tcp
    ports:
      - 8080:8080/tcp
    environment:
      # Reach Ollama by its Compose service name on the internal network
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open-webui:/app/backend/data
    depends_on:
      - ollama

  # Model server; reserves the NVIDIA GPU(s) for inference
  ollama:
    image: ollama/ollama
    expose:
      - 11434/tcp
    ports:
      - 11434:11434/tcp
    healthcheck:
      test: ollama --version || exit 1
    command: serve
    volumes:
      - ollama:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['all']
              capabilities: [gpu]

  # Cloudflare Tunnel that exposes the web UI for remote access
  tunnel:
    image: cloudflare/cloudflared:latest
    restart: unless-stopped
    environment:
      - TUNNEL_URL=http://webui:8080
    command: tunnel --no-autoupdate
    depends_on:
      - webui

# Named volumes keep models and UI data across container restarts
volumes:
  ollama:
  open-webui:



The Compose file defines the three services that make up the application:

  • webui
  • ollama
  • tunnel

The webui service acts as your user interface for interacting with Ollama AI models. It fetches data from the optional ollama service (the AI model server) running on the same network, and lets you manage and use your models visually. You can access the web interface at http://localhost:8080 if running locally. The ollama service itself (optional) handles running your models, and can leverage your NVIDIA GPU for faster computations. Finally, the tunnel service provides a secure way to access the web interface remotely through Cloudflare.

Bringing up the Stack



docker compose up -d
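
The containers can take a little while to become ready after docker compose up -d returns, particularly on first start. A generic retry loop is one way to wait for them. The sketch below is an example under stated assumptions: retry and RETRY_DELAY are hypothetical names, and the URL in the usage note is the local web UI address used in this guide.

```shell
# Hypothetical helper for illustration (not from the original post).

# retry ATTEMPTS CMD [ARGS...]: run CMD until it succeeds or ATTEMPTS is used up.
# RETRY_DELAY sets the pause in seconds between attempts (default 2).
retry() {
  attempts="$1"
  shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Pause only when another attempt is still to come.
    if [ "$i" -lt "$attempts" ]; then
      sleep "${RETRY_DELAY:-2}"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, retry 30 curl -fsS -o /dev/null http://localhost:8080 would poll the web UI up to 30 times and return as soon as it answers, assuming curl is available on the host.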



Check that all three services are up and running:



docker compose ps
NAME                  IMAGE                                COMMAND                  SERVICE   CREATED              STATUS                        PORTS
cloudflare-ollama-1   ollama/ollama                        "/bin/ollama serve"      ollama    About a minute ago   Up About a minute (healthy)   0.0.0.0:11434->11434/tcp
cloudflare-tunnel-1   cloudflare/cloudflared:latest        "cloudflared --no-au…"   tunnel    About a minute ago   Up About a minute
cloudflare-webui-1    ghcr.io/open-webui/open-webui:main   "bash start.sh"          webui     About a minute ago   Up About a minute             0.0.0.0:8080->8080/tcp
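
A freshly created ollama volume contains no models, so the web UI has nothing to chat with until one is pulled. A possible next step is sketched below. Note the assumptions: pull_model is a hypothetical wrapper, llama3 is only an example model name, and the COMPOSE variable exists purely so the command can be dry-run with echo.

```shell
# Hypothetical helper for illustration (not from the original post).

# pull_model [MODEL]: download MODEL inside the running ollama service.
# COMPOSE overrides the compose command (e.g. COMPOSE=echo for a dry run).
pull_model() {
  model="${1:-llama3}"
  # Intentionally unquoted so that "docker compose" splits into two words.
  ${COMPOSE:-docker compose} exec ollama ollama pull "$model"
}
```

Calling pull_model with no argument runs ollama pull llama3 inside the ollama container; passing a different name pulls that model instead.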




Conclusion

This setup lets you use your AI models both locally and remotely. With Ollama, Open WebUI, and a Cloudflare Tunnel working together, you get an accessible, self-hosted platform for experimenting with AI models.
