fix(mig): fallback gpu_memory_total value #3353

Open
wants to merge 1 commit into base: main

tomheno commented Feb 6, 2025

Motivation

It is currently not possible to run on MIG-partitioned GPUs, because they have insufficient permissions to read the available memory through the nvidia-smi CLI.
Tested on H100 & H200.

See #2933

Modifications

Added a SGLANG_GPU_MEMORY_TOTAL_FALLBACK environment variable to set the GPU memory value manually when querying it through nvidia-smi is not possible.
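
For illustration only, the fallback behaviour described above could look roughly like the sketch below. It is not the code from this PR: the function names, the choice of MiB as the unit, and the error handling are all assumptions made for the example. Only the environment variable name comes from the description. The nvidia-smi query is injected as a parameter so the fallback path can be exercised without a GPU.

```python
import os
import subprocess
from typing import Callable, Mapping, Optional

# Name taken from the PR description; everything else here is hypothetical.
FALLBACK_ENV_VAR = "SGLANG_GPU_MEMORY_TOTAL_FALLBACK"


def query_nvidia_smi_total_mib() -> Optional[int]:
    """Return the first GPU's total memory in MiB, or None if the query fails."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True, timeout=10,
        ).stdout
        # A non-numeric answer (for example a permissions message) raises
        # ValueError and is treated as "not available".
        return int(out.strip().splitlines()[0])
    except (OSError, subprocess.SubprocessError, ValueError, IndexError):
        return None


def gpu_memory_total_mib(
    query: Callable[[], Optional[int]] = query_nvidia_smi_total_mib,
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Prefer the nvidia-smi value; otherwise use the manual fallback.

    Assumption: the fallback variable holds a positive integer in MiB.
    """
    env = os.environ if env is None else env
    value = query()
    if value is not None:
        return value
    raw = env.get(FALLBACK_ENV_VAR)
    if raw is None:
        raise RuntimeError(
            f"nvidia-smi query failed and {FALLBACK_ENV_VAR} is not set"
        )
    try:
        fallback = int(raw)
    except ValueError:
        raise ValueError(f"{FALLBACK_ENV_VAR} must be an integer, got {raw!r}") from None
    if fallback <= 0:
        raise ValueError(f"{FALLBACK_ENV_VAR} must be positive, got {fallback}")
    return fallback
```

Under these assumptions, a user on a MIG slice would export the variable before launching, and the queried value would still take precedence whenever nvidia-smi answers normally.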

zhyncs (Member) commented Feb 7, 2025

@dsingal0 what do you think?
