When running inference on CPU I do get output, but when running on GPU the script runs for over an hour at 100% GPU utilization without ever producing a result. Code below:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "./model"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="cuda:0")

text = ["What's the function of Aspirin?"]
inputs = tokenizer(text, truncation=True, return_tensors="pt").to("cuda:0")
output = model.generate(inputs=inputs.input_ids, max_new_tokens=128, early_stopping=True)
print(tokenizer.decode(output[0]))
```
I'm hitting the same problem. Have you managed to solve it?