Hi! I ran it with python -m sglang.bench_latency --model-path TheBloke/TinyLlama-1.1B-Chat-v0.3-GPTQ --batch 1 --input-len 1 --output-len 512 --trust-remote-code --lora-paths jashing/tinyllama-colorist-lora
It runs without any visible problem. Is there a way to verify that this is the correct way to use it?
I checked the QLoRA results on my own fine-tuning task (deployed as a service using the above command) and concluded that QLoRA did not actually take effect. I think it would be better to raise an error to alert users and avoid misunderstandings. @Ying1123
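One way to run this kind of check is to send the same prompts to the model with and without the adapter, using greedy decoding, and see whether any output changes. The sketch below is a minimal, library-agnostic helper and not part of SGLang: `generate_base` and `generate_lora` are hypothetical callables that you would wire to your own deployment, and it assumes decoding is deterministic (for example, temperature 0). If no prompt produces a different output, the adapter is probably not being applied.

```python
from typing import Callable, Iterable, List


def prompts_changed_by_adapter(
    generate_base: Callable[[str], str],
    generate_lora: Callable[[str], str],
    prompts: Iterable[str],
) -> List[str]:
    """Return the prompts whose output differs between the two generators.

    Both callables are hypothetical placeholders: each takes a prompt and
    returns the generated text. They must use deterministic decoding,
    otherwise differences may come from sampling rather than the adapter.
    """
    changed = []
    for prompt in prompts:
        if generate_base(prompt) != generate_lora(prompt):
            changed.append(prompt)
    return changed


def adapter_seems_active(
    generate_base: Callable[[str], str],
    generate_lora: Callable[[str], str],
    prompts: Iterable[str],
) -> bool:
    """True if at least one prompt yields a different output with the adapter."""
    return len(prompts_changed_by_adapter(generate_base, generate_lora, prompts)) > 0
```

An empty result over a set of prompts taken from the fine-tuning task is a strong hint that the adapter weights were silently ignored. A non-empty result only shows that something changed, not that the adapter was loaded correctly.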
merrymercy changed the title from "sgl & qlora" to "[Feature] Support QLoRA weights" on Nov 1, 2024
Does sgl support qlora? Could you provide some instructions on how to use it?