Error Running Meta-Llama/Llama-3.3-70B-Instruct Model on Tesla V100 GPU with Ray Cluster and vLLM #254

Open
btarmadmin-1954 opened this issue Jan 6, 2025 · 0 comments

btarmadmin-1954 commented Jan 6, 2025

I deployed the Meta-Llama/Llama-3.3-70B-Instruct model with vLLM on a Ray cluster of Tesla V100-SXM2-32GB GPUs. When I send a query, the server crashes with the following error:

python: /project/lib/Analysis/Allocation.cpp:47: std::pair<llvm::SmallVector, llvm::SmallVector > mlir::triton::getCvtOrder(mlir::Attribute, mlir::Attribute): Assertion `!(srcMmaLayout && dstMmaLayout && !srcMmaLayout.isAmpere()) && "mma -> mma layout conversion is only supported on Ampere"' failed.

*** SIGABRT received at time=1736148493 on cpu 1 ***
PC: @ 0x7fce07e9eb1c (unknown) pthread_kill
@ 0x7fce07e45320 (unknown) (unknown)
@ 0x7fce07e4526e 32 raise
@ 0x7fce07e288ff 192 abort
@ 0x7fce07e2881b 96 (unknown)
@ 0x7fce07e3b507 48 __assert_fail
@ 0x7fccc4cbe42a (unknown) mlir::triton::getScratchConfigForCvtLayout()
@ 0x300000001 (unknown) (unknown)
@ 0x27106e90 (unknown) (unknown)
@ 0x7fccc8df9fd7 (unknown) (unknown)
@ 0x7fccc4d6b9c0 (unknown) (unknown)
@ 0x9000623991e90789 (unknown) (unknown)
[2025-01-06 10:28:13,840 E 173516 173516] logging.cc:447: *** SIGABRT received at time=1736148493 on cpu 1 ***
[2025-01-06 10:28:13,840 E 173516 173516] logging.cc:447: PC: @ 0x7fce07e9eb1c (unknown) pthread_kill
...
Fatal Python error: Aborted
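
For reference, the deployment is roughly equivalent to this minimal sketch using vLLM's offline API (the tensor-parallel degree shown is illustrative, not my exact setting; a 70B model just needs to be sharded across enough 32 GB V100s):

```python
# Minimal reproduction sketch -- illustrative, not my exact launch script.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.3-70B-Instruct",
    tensor_parallel_size=8,  # illustrative: shard the 70B model across several 32 GB V100s
)
out = llm.generate(["Hello"], SamplingParams(max_tokens=16))
print(out[0].outputs[0].text)
```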

Is the Meta-Llama/Llama-3.3-70B-Instruct model compatible with Tesla V100-SXM2-32GB GPUs? If so, what configurations or optimizations might resolve this issue?
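
From what I can tell, the assertion comes from Triton's layout-conversion pass and explicitly requires Ampere. The V100 is Volta, i.e. compute capability 7.0, below Ampere's 8.0, which can be confirmed with:

```python
import torch

# Tesla V100 reports (7, 0) (Volta); the Triton assertion above
# requires Ampere, i.e. compute capability (8, 0) or newer.
major, minor = torch.cuda.get_device_capability(0)
print(f"compute capability: {major}.{minor}")
```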

I also tried the meta-llama/Llama-3.1-8B-Instruct and meta-llama/Llama-3.2-1B models, and they fail with the same error.
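
Since the error is independent of model size, it looks hardware-related rather than memory-related. Here is a sketch of pre-Ampere settings I could try (these options do exist in vLLM, but I have not confirmed they avoid the Triton crash on Volta):

```python
import os

# FlashAttention needs Ampere or newer, so fall back to the xformers backend.
os.environ["VLLM_ATTENTION_BACKEND"] = "XFORMERS"

from vllm import LLM

llm = LLM(
    model="meta-llama/Llama-3.3-70B-Instruct",
    tensor_parallel_size=8,   # illustrative tensor-parallel degree
    dtype="float16",          # V100 has no bfloat16 support
    enforce_eager=True,       # skip CUDA graph capture and compiled paths
)
```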
