
GPU memory allocation during inference #163

Open
xxcoco763 opened this issue Dec 28, 2023 · 2 comments

Comments

@xxcoco763

When running inference with device_map="auto", the model is split across the GPUs, but the memory used for loading the text inputs all lands on GPU 0. Can that memory also be distributed evenly?
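Not an answer from the maintainers, just a sketch of one common workaround: with Accelerate-backed device_map="auto", `from_pretrained` accepts a `max_memory` dict that caps how many weights are placed on each GPU, so you can reserve headroom on GPU 0 for the inputs and generation buffers. The helper name below is my own, not from this repo, and the `from_pretrained` call is left commented out since it needs the actual weights:

```python
# Build a max_memory dict for transformers/accelerate device_map="auto",
# reserving extra headroom on GPU 0 where inputs and the KV cache land.
# Helper name and numbers are illustrative, not from the LongAlpaca codebase.

def make_max_memory(num_gpus: int, per_gpu_gib: int, gpu0_reserve_gib: int) -> dict:
    """Cap weight placement per GPU; GPU 0 gets less so inputs fit there."""
    mem = {i: f"{per_gpu_gib}GiB" for i in range(num_gpus)}
    mem[0] = f"{per_gpu_gib - gpu0_reserve_gib}GiB"
    return mem

# Example: four 40 GiB A100s, keep 15 GiB free on GPU 0.
limits = make_max_memory(num_gpus=4, per_gpu_gib=40, gpu0_reserve_gib=15)
print(limits)  # {0: '25GiB', 1: '40GiB', 2: '40GiB', 3: '40GiB'}

# Sketch of how it would be used (needs the model weights, so not run here):
# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained(
#     "/home/models/LongAlpaca-7B",
#     device_map="auto",
#     max_memory=limits,
# )
```

This doesn't shard the input tensors themselves; it just biases the weight placement so the inputs that must sit on GPU 0 have room.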

@xxcoco763
Author

I'm running the webui. The command is:

python3 demo.py \
    --base_model /home/models/LongAlpaca-7B \
    --context_size 32768 \
    --max_gen_len 1024 \

@Zhangchaoran000

I'd like to ask the same thing: how did you test a 64K input length on a single GPU? With a 40GB A100 I can only run an input length of 16K.
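For rough intuition (my own back-of-the-envelope, not from the maintainers): the fp16 KV cache of a LLaMA-7B-shaped model costs 2 tensors (K and V) × layers × hidden_size × 2 bytes per token, so sequence length quickly dominates memory. A quick sketch, assuming 32 layers, hidden size 4096, and a full fp16 cache with no quantization or offloading:

```python
# Back-of-the-envelope KV-cache memory for a LLaMA-7B-shaped model.
# Assumes a full fp16 cache, batch size 1, no quantization or offloading.

def kv_cache_gib(num_layers: int, hidden_size: int, bytes_per_elem: int,
                 seq_len: int) -> float:
    """GiB of KV cache: 2 tensors (K and V) per layer, hidden_size elems each."""
    per_token = 2 * num_layers * hidden_size * bytes_per_elem
    return per_token * seq_len / 2**30

# LLaMA-7B: 32 layers, hidden size 4096, fp16 (2 bytes per element).
print(kv_cache_gib(32, 4096, 2, 16_384))  # 8.0 GiB at 16K
print(kv_cache_gib(32, 4096, 2, 65_536))  # 32.0 GiB at 64K
```

Under these assumptions, ~13 GiB of fp16 weights plus an 8 GiB cache at 16K is already tight on a 40 GiB A100 once activations are counted, while 64K would need roughly 32 GiB for the cache alone. So a single-GPU 64K run presumably relies on extra memory savings (lower-precision caches, offloading, or similar); the authors would have to confirm their setup.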
