Nitro approach for inferencing on ARM or TPU #2223
dattachandan asked this question in Get Help
Hi team,

While searching for fast inference engines for lightweight LLM models, I came across this project and would like to know if anyone has tried it on edge processors such as the Apple M1, Snapdragon, or Coral TPU, or on other ARM or ASIC architectures, including AI accelerators such as SambaNova, Cerebras, Graphcore, and Habana Gaudi.
How does the Nitro approach compare to the XLA project for supporting various DL frameworks on edge devices? Or is it a standalone engine with no integration with PyTorch or TF, so that existing code there cannot be brought in directly?
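For context, this is the kind of framework integration I have in mind. With XLA, existing PyTorch code can target a TPU with little more than a device change via torch_xla; a rough sketch (assuming torch_xla is installed and an XLA device such as a TPU is available), not Nitro code:

```python
import torch
import torch_xla.core.xla_model as xm

# Acquire the XLA device (a TPU core if one is attached).
device = xm.xla_device()

# Unmodified PyTorch model and data, simply moved to the XLA device.
model = torch.nn.Linear(4, 2).to(device)
x = torch.randn(8, 4, device=device)
y = model(x)

# XLA executes lazily; mark_step() compiles and runs the traced graph.
xm.mark_step()
print(y.cpu())
```

What I'm asking is whether Nitro offers (or plans) a comparable path for bringing such existing code onto edge accelerators, or whether models must first be converted into a Nitro-specific format.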
I had a look at the roadmap but couldn't find a story covering the architectural approach, and I couldn't find an architecture document either.