- Download and install Ollama.
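On Linux, one option is the official install script from ollama.com (review scripts before piping them to a shell); on macOS and Windows, use the installer from ollama.com instead:

```sh
# Official convenience script for Linux installs
curl -fsSL https://ollama.com/install.sh | sh
```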
- Run a model in Ollama:

```sh
ollama run llama3.2
```
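Optionally, before wiring up the UI, you can confirm Ollama is serving requests by querying its local HTTP API (this sketch assumes the default port 11434 and the model pulled above):

```sh
# Quick sanity check against Ollama's local HTTP API
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Say hello in one sentence.",
  "stream": false
}'
```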
- Run Open WebUI with Nvidia GPU support:

```sh
docker run -d -p 3000:8080 --gpus all --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:cuda
```
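If you don't have an Nvidia GPU or the NVIDIA Container Toolkit, Open WebUI also publishes a CPU-only image; the same command works with `--gpus all` dropped and the `main` tag instead of `cuda`:

```sh
# CPU-only variant: same flags minus --gpus all, using the :main image tag
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
```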
- Navigate to http://localhost:3000/
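If the page doesn't load, check that the container is running and inspect its logs:

```sh
# Confirm the container is up, then follow its logs
docker ps --filter name=open-webui
docker logs -f open-webui
```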
- The first time you open the UI, you will need to create an account; the first account created becomes the administrator.
- To update Open WebUI, stop and remove the container, pull the latest image, and start it again (the named `open-webui` volume keeps your data across the rebuild):

```sh
docker container stop open-webui
docker container rm open-webui
docker pull ghcr.io/open-webui/open-webui:cuda
docker run -d -p 3000:8080 --gpus all --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:cuda
```
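Alternatively, a one-shot Watchtower run can handle the update in a single step (a sketch, assuming the container is named `open-webui` as above):

```sh
# Pull the new image and recreate the container in one command
docker run --rm --volume /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --run-once open-webui
```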