Some models, such as Qwen2.5-Coder, appear to be uploaded in multiple parts; see here: https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct-GGUF/tree/main. I can see that the q8_0 (8-bit quantized) model is a single large GGUF file, but the FP16 one consists of multiple smaller GGUF files (qwen2.5-coder-32b-instruct-fp16-00001-of-00009.gguf, qwen2.5-coder-32b-instruct-fp16-00002-of-00009.gguf, etc.). In such a case, how can I pull and use the model?
Replies: 1 comment
You can use the `pull` command to download such models; just point to the first file, and it'll download all the required files. To use it, load the first file, and it'll handle the rest automatically.
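
For anyone curious why pointing at the first file is enough: the part names in the question follow a `-00001-of-00009.gguf` pattern, so the full list can be worked out from the first name alone. Here is a rough Python sketch of that expansion. Note that `expand_shards` is a hypothetical helper written for illustration, it assumes the five-digit naming shown in the question, and the tool's own resolution logic may well differ.

```python
import re

# Assumed naming pattern, taken from the file names quoted in the question:
#   <prefix>-00001-of-00009.gguf
SHARD_RE = re.compile(r"^(?P<prefix>.+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.gguf$")


def expand_shards(first_file: str) -> list[str]:
    """Return every part name implied by a split-GGUF file name.

    Hypothetical helper: a file name that does not match the split
    pattern is treated as a single-file model and returned unchanged.
    """
    match = SHARD_RE.match(first_file)
    if match is None:
        return [first_file]
    prefix = match.group("prefix")
    total = int(match.group("total"))
    # Rebuild each part name with zero-padded, 1-based indices.
    return [f"{prefix}-{i:05d}-of-{total:05d}.gguf" for i in range(1, total + 1)]


if __name__ == "__main__":
    for name in expand_shards("qwen2.5-coder-32b-instruct-fp16-00001-of-00009.gguf"):
        print(name)
```

A single-file model such as the q8_0 one does not match the pattern, so the helper just returns that one name.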