Feature request
It would be great if OpenLLM supported pre-Ampere-architecture CUDA devices. In my case, I'm looking at the Volta architecture.
The README currently states that an Ampere-architecture or newer GPU is required to use the vLLM backend; otherwise you're limited to the PyTorch backend.
As far as I can tell, this is not a vLLM-specific constraint: vLLM itself does not require an Ampere-architecture device.
Motivation
I am trying to run OpenLLM on my NVIDIA Tesla V100 (32 GB) devices, but I cannot use the vLLM backend, because OpenLLM's vLLM backend does not support the Volta architecture.
Other
I would love to help however I can, but I can't find any documentation of where this constraint comes from other than the README. I've gone through vLLM's docs, and they do not indicate that this is a vLLM-side constraint.
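
For reference, here is a minimal, purely illustrative sketch of the kind of compute-capability gate that could produce the behavior described in the README. This is my assumption, not OpenLLM's actual code; the point is that a gate requiring Ampere (compute capability 8.x) excludes Volta (7.0, e.g. V100), while vLLM's installation docs only ask for compute capability 7.0 or higher.

```python
# Illustrative only: NOT OpenLLM's actual code, just a sketch of the kind of
# compute-capability check that would explain the README's Ampere requirement.
import torch

def supports_backend(min_major: int) -> bool:
    """Return True if the current CUDA device meets a minimum compute capability major version."""
    if not torch.cuda.is_available():
        return False
    major, _minor = torch.cuda.get_device_capability()
    return major >= min_major

# A gate like supports_backend(8) would exclude Volta (sm_70, e.g. V100),
# while a floor of 7.0 (per vLLM's installation docs) would allow it.
print("Ampere-or-newer gate:", supports_backend(8))
print("Compute capability 7.0 gate:", supports_backend(7))
```

If the constraint is enforced somewhere along these lines, relaxing the threshold from 8.x to 7.0 for the vLLM backend would be enough to cover Volta.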