Using uqff downloads the full safetensors weights. #828

Open
chigkim opened this issue Oct 5, 2024 · 2 comments
Labels
bug Something isn't working

Comments


chigkim commented Oct 5, 2024

Describe the bug

I ran the following command, and it downloaded both the UQFF file and the full safetensors weights from Meta. I tried omitting -m, but it appears to be required.

./mistralrs-server -i vision-plain -m meta-llama/Llama-3.2-11B-Vision-Instruct -a vllama --from-uqff EricB/Llama-3.2-11B-Vision-Instruct-UQFF/llama-3.2-11b-vision-hqq8.uqff

Latest commit or version

3e79d85

chigkim added the bug label Oct 5, 2024

vietvudanh commented Oct 8, 2024

I had the same problem.

I'm not sure I read the Rust code correctly, but it seems that Args in mistralrs-server/src/main.rs has no option for UQFF. It only has model, which in this case loads meta-llama/Llama-3.2-11B-Vision-Instruct, so the full tensors of the base model are downloaded.

Owner

EricLBuehler commented Oct 15, 2024

@chigkim @vietvudanh As of #849, you can now load UQFF models without downloading the full weights!

For example (https://huggingface.co/EricB/Llama-3.2-11B-Vision-Instruct-UQFF):

./mistralrs-server -i vision-plain -m EricB/Llama-3.2-11B-Vision-Instruct-UQFF -a vllama --from-uqff llama3.2-vision-instruct-q4k.uqff

More models can be found here: https://huggingface.co/collections/EricB/uqff-670e4a49d56ecdd3f7f0fd4c.
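
A minimal sketch of the invocation pattern from the comment above, for readers who want to swap in another UQFF model. Compared with the original report, -m now points at the UQFF repository rather than the base model repository, and --from-uqff takes a file name inside that repository. This reading is inferred only from the two commands in this thread, not from the mistral.rs documentation. The helper build_uqff_cmd and the variable names are hypothetical; the script only prints the command and does not run the server.

```shell
# Hypothetical helper, not part of mistral.rs: it only assembles the
# command string shown in the comment above so the two inputs are explicit.
UQFF_REPO="EricB/Llama-3.2-11B-Vision-Instruct-UQFF"   # repo ID passed to -m
UQFF_FILE="llama3.2-vision-instruct-q4k.uqff"          # file name passed to --from-uqff

build_uqff_cmd() {
  # $1: Hugging Face repo ID that hosts the UQFF file
  # $2: UQFF file name inside that repo
  printf './mistralrs-server -i vision-plain -m %s -a vllama --from-uqff %s\n' "$1" "$2"
}

CMD="$(build_uqff_cmd "$UQFF_REPO" "$UQFF_FILE")"
echo "$CMD"   # review the command, then run it manually
```

The -i vision-plain and -a vllama arguments are copied unchanged from the thread and apply to this Llama 3.2 Vision model; other models in the collection may need different values.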
