
feat: support non-cuda devices for text and vision models #233

Open
wants to merge 3 commits into base: main
Conversation

@dvrogozh (Contributor) commented Dec 3, 2024

This commit adds support for non-CUDA PyTorch backend devices to the text and vision models. It also extends the existing tests to run against an externally specified device (cuda is the default). The change was verified on the Llama3.2-3B-Instruct and Llama3.2-11B-Vision-Instruct models for:

  • cuda device type on NVidia A10 GPU (for no regressions)
  • cpu device type
  • xpu device type on Intel Data Center Max Series GPU (PVC)

Note that this commit requires a fix on the PyTorch side for the gloo torch.distributed backend to restore TLS on gloo worker threads. That change has been merged on the PyTorch side and should land in PyTorch 2.6.

This PR supersedes #165 from @anordin95.

Requires: pytorch/pytorch#142184
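As an illustration of the device plumbing this PR describes, here is a minimal sketch of selecting a non-CUDA backend device in PyTorch. The `resolve_device` helper is hypothetical, not the PR's actual code; it only shows the general pattern of honoring an externally specified device while defaulting to cuda.

```python
import os
import torch

def resolve_device(default: str = "cuda") -> torch.device:
    # Honor an explicit DEVICE environment variable; otherwise use the
    # default ("cuda"), degrading to CPU when no NVIDIA GPU is present.
    name = os.environ.get("DEVICE", default)
    if name == "cuda" and not torch.cuda.is_available():
        name = "cpu"
    return torch.device(name)

# The same code path then works for "cuda", "xpu", or "cpu":
x = torch.randn(2, 2, device=resolve_device(default="cpu"))
```

The point of routing everything through `torch.device` is that model and tensor placement code stays backend-agnostic, which is what lets the same tests cover cuda, xpu, and cpu.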

@facebook-github-bot added the CLA Signed label (managed by the Meta Open Source bot) Dec 3, 2024
dvrogozh added a commit to dvrogozh/llama-stack that referenced this pull request Dec 3, 2024
@dvrogozh dvrogozh changed the title feat: support non-cuda devices for text models feat: support non-cuda devices for text and vision models Dec 6, 2024
@dvrogozh (Contributor, Author) commented Dec 6, 2024

Updated the PR with the changes needed to make vision models work (see the 2nd commit). For testing, I executed the following after commenting out the skip conditions:

@unittest.skip("Disabling vision model test")
@pytest.mark.skip(reason="Disabling vision model test")
def test_run_generation(self):

@ashwinb (Contributor) commented Jan 14, 2025

Thanks @dvrogozh -- could you add a test plan here showing that the example scripts we have work correctly in both CUDA and non-CUDA environments?

dvrogozh and others added 3 commits January 14, 2025 22:34
This commit adds support for non-CUDA PyTorch backend devices
to text models. It extends the existing test to run for an
externally specified device (cuda is the default). Verified on
the Llama3.2-3B-Instruct model for:
* "cuda" device type on NVidia A10 GPU
* "cpu" device type
* "xpu" device type on Intel Data Center Max Series GPU (PVC)

Co-authored-by: anordin95 <[email protected]>
Signed-off-by: Dmitry Rogozhkin <[email protected]>
This commit adds support for non-CUDA PyTorch backend devices
to vision models. Verified on the Llama3.2-11B-Vision-Instruct
model for:
* "cuda" device type on NVidia A10 GPU
* "cpu" device type
* "xpu" device type on Intel Data Center Max Series GPU (PVC)

Note that this commit requires a fix on the PyTorch side for the
gloo torch.distributed backend to restore TLS on gloo worker threads.

Requires: pytorch/pytorch#142184
Signed-off-by: Dmitry Rogozhkin <[email protected]>
This change modifies the reference-inference test so that CPU
inference is always tested, while on-device inference is tested
if a device is available (currently checking for cuda and xpu,
in that order) or if the user explicitly specifies a device to
test via the DEVICE environment variable.

Signed-off-by: Dmitry Rogozhkin <[email protected]>
@dvrogozh (Contributor, Author)
@ashwinb: I have modified the existing tests so that they cover both CUDA and non-CUDA environments. In the last commit I further extend a test to run 2 cases: 1) inference on the CPU, 2) inference on an accelerator device. The device is detected automatically (cuda or xpu, in that order), and it is also possible to specify the device via the DEVICE environment variable. The tests can be executed as follows:

python -m pytest models/llama3/tests/api/test_generation.py

Note that I covered both text and vision models. However, at the moment the vision model tests are skipped:

@unittest.skip("Disabling vision model test")
@pytest.mark.skip(reason="Disabling vision model test")

That has been the case since commit 6ad6fd6. If the skips are to be removed, I suggest doing that in a separate PR.
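The detection order described above (an explicit DEVICE environment variable wins; otherwise cuda, then xpu, then cpu) can be sketched as follows. The `autodetect_device` helper name is hypothetical; the actual test code may differ.

```python
import os
import torch

def autodetect_device() -> str:
    # An explicit DEVICE environment variable always wins; otherwise
    # prefer the first available accelerator: cuda, then xpu, then cpu.
    explicit = os.environ.get("DEVICE")
    if explicit:
        return explicit
    if torch.cuda.is_available():
        return "cuda"
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return "xpu"
    return "cpu"
```

With this scheme, running `DEVICE=xpu python -m pytest models/llama3/tests/api/test_generation.py` would force the xpu path regardless of which hardware is auto-detected.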

@dvrogozh (Contributor, Author)
@ashwinb, could you please help review and comment?
