CPU vs GPU #100
Is there an option to change this platform to use CUDA or ROCm instead of CPU?

Comments
For now, no. I'm waiting too ... it is far too slow on CPU.
Yeah, CUDA GPU support would be great. That, and leaving the model in VRAM for the duration of the conversation.
So... slow for everyone?
This is a llama.cpp issue, not a Serge one. It may be worth opening this issue on that repo.
llama.cpp is specifically intended to be a CPU-only implementation. They've said before that they won't support GPU. Someone would have to submit a patch so good that they're willing to accept it anyway, or else I suppose this project would have to find a similar, but GPU-based, version of it.
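For anyone who still wants to experiment with GPU offload, the separate llama-cpp-python bindings expose a CUDA path in later builds. A minimal sketch, assuming a CUDA-enabled build of llama-cpp-python and a local model file (the path and layer count below are placeholders, and this is not something Serge supports out of the box):

```python
from llama_cpp import Llama

# n_gpu_layers > 0 offloads that many transformer layers to VRAM,
# which also keeps those weights resident between prompts — the
# "leave the model in VRAM for the conversation" request above.
llm = Llama(
    model_path="./models/7B/model.bin",  # placeholder path
    n_gpu_layers=32,                     # tune to fit your VRAM; 0 = pure CPU
)

out = llm("Q: What runs faster on a GPU? A:", max_tokens=32)
print(out["choices"][0]["text"])
```

Note that the layers only actually land on the GPU if the library was compiled with CUDA/cuBLAS support; a stock CPU-only wheel will, as far as I know, simply fall back to running everything on the CPU.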