-
I don't have a real GPU in my homelab, so I'd be interested to know whether there's support for TPUs.
-
Hi @fwartner, @hiro-v, would it be possible for us to support this user somehow? Thank you!
-
I guess supporting a Google Coral would be pretty beneficial for folks running this on a central homelab server, since they don't have a dedicated GPU for this kind of workload. Additionally, a Coral is pretty affordable.
-
I agree with @Disane87 (who actually brought me to Jan ^^). If needed, I would love to sponsor a Coral TPU for testing purposes!
-
As far as I am aware, the Coral TPU won't help here, as it is only made for small models. LLMs are large, as the name implies, and therefore too big for the RAM on the TPU, and streaming the model in would also be too slow.
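To put rough numbers on that, here is a back-of-envelope sketch. The figures are assumed ballpark values (roughly 8 MB of on-chip memory on the Edge TPU, a 7B-parameter model at 4-bit quantization, and about 500 MB/s of practical USB 3.0 throughput), not measurements:

```python
# Back-of-envelope check: can a quantized LLM fit on, or be streamed to, a Coral Edge TPU?
# All numbers below are rough assumptions for illustration only.
CORAL_SRAM_BYTES = 8 * 1024**2       # ~8 MB of on-chip memory on the Edge TPU
MODEL_PARAMS = 7e9                   # a "small" 7B-parameter LLM
BYTES_PER_PARAM = 0.5                # ~4-bit quantization
USB3_BYTES_PER_S = 500 * 1024**2     # ~500 MB/s practical USB 3.0 throughput

model_bytes = MODEL_PARAMS * BYTES_PER_PARAM
print(f"Model size:            {model_bytes / 1024**3:.1f} GiB")
print(f"Fits in Edge TPU SRAM: {model_bytes <= CORAL_SRAM_BYTES}")

# Token generation touches roughly every weight once per token, so if the
# weights had to be streamed over USB for each token:
seconds_per_token = model_bytes / USB3_BYTES_PER_S
print(f"Streaming time/token:  ~{seconds_per_token:.0f} s")
```

Even with aggressive quantization, the model is a few orders of magnitude larger than the Edge TPU's on-chip memory, and streaming the weights per token would be far too slow for interactive use.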
-
Hi @mawoka-myblock and @Disane87
Currently we do not have a plan to support TPUs, as our current local inference engines, llama.cpp and nvidia tensorrt llm, do not support them. We might support a Python runtime, and then we can explore the possibility of running on a TPU.
There are several references here:
https://github.com/ggerganov/llama.cpp/issues/1052#issuecomment-1515339426
https://github.com/ggerganov/llama.cpp/issues/3253
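For reference, a minimal sketch of what a Python-runtime path on a Coral Edge TPU could look like, using tflite_runtime with the Edge TPU delegate. This is not something Jan supports today; the model file name is hypothetical, and the model would have to be compiled for the Edge TPU in the first place:

```python
import numpy as np
import tflite_runtime.interpreter as tflite

# Load a hypothetical Edge-TPU-compiled model via the Edge TPU delegate
# (libedgetpu.so.1 is the standard delegate library name on Linux).
interpreter = tflite.Interpreter(
    model_path="model_edgetpu.tflite",
    experimental_delegates=[tflite.load_delegate("libedgetpu.so.1")],
)
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed dummy input matching the model's expected shape and dtype, then run inference.
dummy_input = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], dummy_input)
interpreter.invoke()
output = interpreter.get_tensor(output_details[0]["index"])
print(output.shape)
```

This only illustrates the runtime plumbing; as noted above, today's LLMs are far too large for the Edge TPU's on-chip memory regardless of the runtime used.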
-
Let's continue to keep this discussion thread open. I'm intrigued by the potential of the Coral TPU.