Commit ad52d5c

Authored by Vaibhavs10, with julien-c and ggerganov
doc: add references to hugging face GGUF-my-repo quantisation web tool. (#7288)
* chore: add references to the quantisation space.
* fix grammer lol.
* Update README.md
  Co-authored-by: Julien Chaumond <[email protected]>
* Update README.md
  Co-authored-by: Georgi Gerganov <[email protected]>

Co-authored-by: Julien Chaumond <[email protected]>
Co-authored-by: Georgi Gerganov <[email protected]>
1 parent 172b782 commit ad52d5c

File tree

2 files changed: +6 additions, −1 deletion


README.md

+3
```diff
@@ -712,6 +712,9 @@ Building the program with BLAS support may lead to some performance improvements
 
 ### Prepare and Quantize
 
+> [!NOTE]
+> You can use the [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space on Hugging Face to quantise your model weights without any setup too. It is synced from `llama.cpp` main every 6 hours.
+
 To obtain the official LLaMA 2 weights please see the <a href="#obtaining-and-using-the-facebook-llama-2-model">Obtaining and using the Facebook LLaMA 2 model</a> section. There is also a large selection of pre-quantized `gguf` models available on Hugging Face.
 
 Note: `convert.py` does not support LLaMA 3, you can use `convert-hf-to-gguf.py` with LLaMA 3 downloaded from Hugging Face.
```
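The local workflow that the hosted GGUF-my-repo space automates can be sketched roughly as follows. This is a sketch, not part of the commit: it assumes a built `llama.cpp` checkout from this era (where the quantize binary is named `quantize`), and the model directory path is hypothetical.

```shell
# Sketch of the local quantization flow that GGUF-my-repo automates.
# Assumes a llama.cpp checkout built with `make`, and Hugging Face model
# weights already downloaded to ./models/my-model (hypothetical path).

# 1. Convert the Hugging Face weights to an f16 GGUF file.
#    (Use convert-hf-to-gguf.py for LLaMA 3; convert.py does not support it.)
python3 convert-hf-to-gguf.py models/my-model --outfile models/my-model/model-f16.gguf

# 2. Quantize the f16 GGUF down to a smaller type, e.g. Q4_K_M.
./quantize models/my-model/model-f16.gguf models/my-model/model-Q4_K_M.gguf Q4_K_M
```

The hosted space runs essentially this pipeline server-side and pushes the resulting quantized GGUF file to a new repository under your Hugging Face account.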

examples/quantize/README.md

+3 additions, −1 deletion
```diff
@@ -1,6 +1,8 @@
 # quantize
 
-TODO
+You can also use the [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space on Hugging Face to build your own quants without any setup.
+
+Note: It is synced from llama.cpp `main` every 6 hours.
 
 ## Llama 2 7B
```
