OLLAMA with AMD GPU (ROCm)

kokizzu

Kiswono Prayogo

Posted on February 27, 2024


Today we're going to test ollama (just like in the previous article), this time with an AMD GPU. To do this you'll need to run it in Docker, for example using this docker compose file:

version: "3.7"

services:
  ollama:
    container_name: ollama
    image: ollama/ollama:0.1.22-rocm
    environment:
      HSA_OVERRIDE_GFX_VERSION: 10.3.0 # only if you are using 6600XT
    volumes:
      - /usr/share/ollama/.ollama:/root/.ollama # reuse existing model
      #- ./etc__resolv.conf:/etc/resolv.conf # if your dns sucks
    devices:
      - /dev/dri
      - /dev/kfd
    restart: unless-stopped

  ollama-webui:
    image: ghcr.io/ollama-webui/ollama-webui:main
    container_name: ollama-webui
    ports:
      - "3122:8080"
    volumes:
      - ./ollama-webui:/app/backend/data
    environment:
      - 'OLLAMA_API_BASE_URL=http://ollama:11434/api'
    restart: unless-stopped

To run it, start the containers (for example with docker compose up -d), then open a shell inside the ollama container:

docker exec -it `docker ps | grep ollama/ollama | cut -f 1 -d ' '` bash

Make sure that you already have ROCm installed:

$ dpkg -l | grep rocm | cut -d ' ' -f 3
rocm-cmake
rocm-core
rocm-device-libs
rocm-hip-libraries
rocm-hip-runtime
rocm-hip-runtime-dev
rocm-hip-sdk
rocm-language-runtime
rocm-llvm
rocm-ocl-icd
rocm-opencl
rocm-opencl-runtime
rocm-smi-lib
rocminfo
$ cat /etc/apt/sources.list.d/* | grep -i 'rocm'
deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/amdgpu/6.0/ubuntu jammy main
deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/6.0 jammy main
deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/6.0 jammy main

For example, in the previous article this same prompt took around 60s, but with the GPU it took only 20-30s (2x-3x faster):

time ollama run codellama 'show me inplace mergesort using golang'

Or from outside:

time docker exec -it `docker ps | grep ollama/ollama | cut -f 1 -d ' '` ollama run codellama 'show me inplace mergesort using golang'

long output

real    0m30.528s
CPU: 0.02s      Real: 21.07s    RAM: 25088KB

NOTE: the answer from codellama above is wrong, since it's not an in-place merge sort. Even as a normal merge sort it is broken: the halves are slices of the same array, so merging back into it overwrites the underlying array and gives a wrong result.

You can also visit http://localhost:3122 for web UI.

This article was originally posted here.
