-
I don't have a real GPU in my homelab, so I'd be interested to know whether there's support for TPUs.
-
Hi @fwartner, @hiro-v, would it be possible for us to support this user somehow? Thank you!
-
I guess supporting a Google Coral would be pretty beneficial for folks running this on a central homelab server, since they don't have a dedicated GPU for this kind of workload. Additionally, a Coral is pretty affordable.
-
I agree with @Disane87 (who actually brought me to Jan ^^). If needed, I would love to sponsor a Coral TPU for testing purposes!
-
As far as I am aware, the Coral TPU won't help here, as it is only made for small models. LLMs are large, as the name implies, and therefore too big for the RAM on the TPU, and streaming the model in would also be too slow.
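To put rough numbers on that, here is a back-of-envelope sketch. The figures are assumed ballpark values (roughly 8 MB of on-chip memory on the Edge TPU, a 7B-parameter model at 4-bit quantization, and about 500 MB/s of practical USB 3.0 throughput), not measurements:

```python
# Back-of-envelope check: can a quantized LLM fit on, or be streamed to, a Coral Edge TPU?
# All numbers below are rough assumptions for illustration only.
CORAL_SRAM_BYTES = 8 * 1024**2       # ~8 MB of on-chip memory on the Edge TPU
MODEL_PARAMS = 7e9                   # a "small" 7B-parameter LLM
BYTES_PER_PARAM = 0.5                # ~4-bit quantization
USB3_BYTES_PER_S = 500 * 1024**2     # ~500 MB/s practical USB 3.0 throughput

model_bytes = MODEL_PARAMS * BYTES_PER_PARAM
print(f"Model size:            {model_bytes / 1024**3:.1f} GiB")
print(f"Fits in Edge TPU SRAM: {model_bytes <= CORAL_SRAM_BYTES}")

# Token generation touches roughly every weight once per token, so if the
# weights had to be streamed over USB for each token:
seconds_per_token = model_bytes / USB3_BYTES_PER_S
print(f"Streaming time/token:  ~{seconds_per_token:.0f} s")
```

Even with aggressive quantization, the model is a few orders of magnitude larger than the Edge TPU's on-chip memory, and streaming the weights per token would be far too slow for interactive use.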
-
Hi @mawoka-myblock and @Disane87
Currently we do not have a plan to support TPUs, as our current local inference engines, llama.cpp and nvidia tensorrt llm, do not support them. We might support a Python runtime, and then we can explore the possibility of running on a TPU.
There are several references here:
https://github.com/ggerganov/llama.cpp/issues/1052#issuecomment-1515339426
https://github.com/ggerganov/llama.cpp/issues/3253
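For reference, a minimal sketch of what a Python-runtime path on a Coral Edge TPU could look like, using tflite_runtime with the Edge TPU delegate. This is not something Jan supports today; the model file name is hypothetical, and the model would have to be compiled for the Edge TPU in the first place:

```python
import numpy as np
import tflite_runtime.interpreter as tflite

# Load a hypothetical Edge-TPU-compiled model via the Edge TPU delegate
# (libedgetpu.so.1 is the standard delegate library name on Linux).
interpreter = tflite.Interpreter(
    model_path="model_edgetpu.tflite",
    experimental_delegates=[tflite.load_delegate("libedgetpu.so.1")],
)
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed dummy input matching the model's expected shape and dtype, then run inference.
dummy_input = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], dummy_input)
interpreter.invoke()
output = interpreter.get_tensor(output_details[0]["index"])
print(output.shape)
```

This only illustrates the runtime plumbing; as noted above, today's LLMs are far too large for the Edge TPU's on-chip memory regardless of the runtime used.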
-
Let's continue to keep this discussion thread open. I'm intrigued by the potential of the Coral TPU.