TypeError: torch_replacement_knn_gpu() got an unexpected keyword argument 'device' #25
I have the same issue as you, and here's my script:
Hi @jordancole21 and @kekekawaii2839,

We developed this with the newest version of `faiss` (1.7.4). In your case, it seems that maybe the installed `faiss` is an older version whose `knn_gpu` does not accept the `device` argument.

Best,
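A quick way to check which `faiss` build is installed (the failing call passes `device=` into faiss's `knn_gpu`, which older releases don't accept):

```bash
python -c "import faiss; print(faiss.__version__)"
```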
Ok, thank you! For some reason it looks like the latest version of faiss I can get in Google Colab is 1.7.2, but I'll see if I can find a way to get 1.7.4 to work!
Thank you! In my case, `faiss-gpu` is at version 1.7.3.
Great, let us know if you have any questions!
Ok, finally got it to work in Google Colab on an A100 40G. For anyone curious, I used StableBeluga-13B and it took around 9 minutes to get a summary of Harry Potter, which is pretty good, especially since you can't even fit the full book in Claude 100k! I'm thoroughly impressed!

Here is the code I used to get it working in Colab. First, in order to get the latest version of faiss you have to upgrade Python to 3.10, since Colab is automatically set to 3.7.
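A rough sketch of that upgrade step, assuming the usual apt-based approach in a Colab cell (exact package names may differ):

```bash
!sudo apt-get update -y
!sudo apt-get install -y python3.10 python3.10-distutils
# Point /usr/bin/python3 at the new interpreter
!sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1
```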
Then you'll want to install Miniconda so that you can install faiss using conda.
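A minimal sketch of that step, installing Miniconda into /usr/local so Colab's shell picks up `conda`:

```bash
!wget -q https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
!bash Miniconda3-latest-Linux-x86_64.sh -b -f -p /usr/local
```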
Install faiss using conda-forge.
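Something like the following, pinning the 1.7.4 build discussed above:

```bash
!conda install -y -c conda-forge faiss-gpu=1.7.4
```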
Then clone the repo in Colab.
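Assuming the main Unlimiformer repository:

```bash
!git clone https://github.com/abertsch72/unlimiformer.git
```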
Install the requirements. (In this instance some of these aren't strictly required, but I liked to have them just in case.)
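For example:

```bash
!pip install -r unlimiformer/requirements.txt
```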
`cd` into the src folder in Unlimiformer.
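In a Colab cell:

```bash
%cd unlimiformer/src
```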
Then you should be good to run the script! Just be sure that the `--index_devices` and `--datastore_device` flags are set correctly. In my case I set them to 0.
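For reference, a sketch of the invocation in the style of the repo's README; the model name, prompt text, and input path here are illustrative placeholders:

```bash
# Sketch only: model, prompt text, and input path are placeholders.
!python run_generation.py --model_type llama \
    --model_name_or_path stabilityai/StableBeluga-13B \
    --prefix "Summarize the following book: " \
    --prompt example_inputs/harry_potter_full.txt \
    --suffix " Summary: " \
    --test_unlimiformer --fp16 --length 200 \
    --index_devices 0 --datastore_device 0
```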
This worked pretty well after I set `--layer_begin` to 22 (a little over half the number of layers in the model). Here's the summary:
Thanks again for all the hard work you and your team did, @urialon. I'm pretty hyped about this!
Cool! I'm using llama2-7b-chat-hf on an A100 40G too, and wondering how to solve a CUDA out-of-memory error. For me, adding
Awesome! I'm glad to hear it! Let us know if you have any more questions.

Best,
Sorry to bother again; I'm using the command below:
And the output is very strange:
Full logs for llama-2-7b:
Also, for
Output:
Why does this happen? Is it because the 7B version is less capable than the 13B version, or is there some other reason?
Hi @kekekawaii2839,

I'm not sure. We do see that Llama-13B works better, but we also see a large variance when using different values of `--layer_begin`. By the way, why are you using this HTML-escaping in the prompt?

Best,
I'm sorry, that's a copy error; I'm actually using the right prompt.
I've tried different values of `--layer_begin`:
It seems the model didn't understand the summarization instruction.
I suggest trying larger values as well.
Thanks, I tried larger values, but unfortunately they didn't work on the 7B model :( Also, @jordancole21, can you tell me more about running the 13B model on an A100 40G? Every time I run the 13B model using this command on 3 A100 40Gs:
I encounter a CUDA OOM error.
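Not sure this is the fix, but since the 13B weights alone nearly fill one A100 40G, one thing to try is keeping the model on GPU 0 and pushing the index and datastore onto the other two GPUs, using the same flags discussed above (a sketch, assuming `--index_devices` accepts multiple ids as its plural name suggests):

```bash
# Sketch: keep the model on GPU 0, put the kNN index on GPUs 1-2
# and the datastore on GPU 2. Model name and input path are placeholders.
python run_generation.py --model_type llama \
    --model_name_or_path meta-llama/Llama-2-13b-chat-hf \
    --prompt example_inputs/harry_potter_full.txt \
    --test_unlimiformer --fp16 --length 200 --layer_begin 22 \
    --index_devices 1 2 --datastore_device 2
```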
OK, now I'm using a shorter input of 80k tokens with the 7B model, and here's the result:
I'm very excited about this; thank you @urialon and your team for making this wonderful work and for helping me solve many problems!
Amazing, @kekekawaii2839! Just for future reference, which command line did you use to generate the last output? (I am curious about the exact model and the exact flags.)

Best,
Here's the command for the 80k-token input:
And surprisingly, after I modified the instruction in the prefix a little and tested again with a 135k-token input, here's the result:
Amazing! And the command for the above:
But it's weird that the model's output for summarizing Harry Potter is still strange, even though I'm using the same flags as the cookbook above:
(Yes, the model indeed output a lot of
Hello,
File "python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 402, in forward |
Hey, looks like I'm having some issues working with Llama models. This is the modified script I'm using:
But I get this error:
Any ideas on how to fix that?
Thanks again for all the help and for the new features!