
TGI server launch fails with the latest-rocm Docker image #2522

Open
3 of 4 tasks
gurpreet-dhami opened this issue Sep 13, 2024 · 0 comments

Comments


gurpreet-dhami commented Sep 13, 2024

System Info

text-generation-launcher 2.2.1-dev0

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

This command works:

docker run --rm -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
  -p 8080:80 --device=/dev/kfd --device=/dev/dri --group-add video \
  --ipc=host --shm-size 256g -v $PWD:/data \
  --env PYTORCH_TUNABLEOP_ENABLED=0 --env HUGGINGFACE_HUB_CACHE=/data \
  --env ROCM_USE_FLASH_ATTN_V2_TRITON=0 \
  ghcr.io/huggingface/text-generation-inference:2.2.0-rocm \
  --model-id=teknium/OpenHermes-2.5-Mistral-7B

This command fails:

docker run --rm -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
  -p 8080:80 --device=/dev/kfd --device=/dev/dri --group-add video \
  --ipc=host --shm-size 256g -v $PWD:/data \
  --env PYTORCH_TUNABLEOP_ENABLED=0 --env HUGGINGFACE_HUB_CACHE=/data \
  --env ROCM_USE_FLASH_ATTN_V2_TRITON=0 \
  ghcr.io/huggingface/text-generation-inference:latest-rocm \
  --model-id=teknium/OpenHermes-2.5-Mistral-7B

The shard fails at startup with the following traceback:

    from text_generation_server.layers.attention.flashinfer import (
  File "/opt/conda/lib/python3.11/site-packages/text_generation_server/layers/attention/flashinfer.py", line 5, in <module>
    import flashinfer
ModuleNotFoundError: No module named 'flashinfer'

2024-09-13T18:58:43.898755Z ERROR shard-manager: text_generation_launcher: Shard complete standard error output:
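The failure is an unconditional top-level `import flashinfer` executing on an image where that package is not installed (flashinfer targets CUDA, so it is plausibly absent from ROCm builds). As a sketch of a defensive pattern, not TGI's actual code, the import could be guarded so a missing optional dependency does not kill the shard; `has_module` below is a hypothetical helper name:

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if the named module is importable without importing it."""
    return importlib.util.find_spec(name) is not None


# An unconditional `import flashinfer` raises ModuleNotFoundError when the
# package is absent. Guarding it lets the caller fall back to another
# attention backend instead of crashing at import time.
if has_module("flashinfer"):
    import flashinfer  # noqa: F401
else:
    flashinfer = None  # signal to callers that this backend is unavailable
```

The same check can be run from a shell (`python3 -c "import flashinfer"`) inside each container to confirm which image ships the package.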

Expected behavior

It should launch the server successfully.
