Releases: LLukas22/llm-rs-python

GGML quantization update

25 May 09:23
19a4788

⚠️ The GGML quantization format was updated again; old models are no longer compatible. ⚠️

Huggingface Hub integrations into AutoModel

24 May 14:12
d55b04d

AutoModel can now automatically download GGML-converted models and regular Transformers models from the Hugging Face Hub.

AutoConverter, AutoQuantizer and AutoModel

23 May 15:43
da46617

Added the ability to automatically convert any supported model from the Huggingface Hub via the AutoConverter.

Models converted this way can be quantized via the AutoQuantizer or loaded via AutoModel without specifying the architecture.
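The convert, quantize, and load flow described above might look roughly like the following sketch. The call names (`AutoConverter.convert`, `AutoQuantizer.quantize`, `AutoModel.load`) and the `llm_rs.convert` import path are assumptions based on the project's README and may differ for this release; the wrapper function `convert_quantize_load` is a hypothetical helper introduced only for illustration.

```python
def convert_quantize_load(base_model: str, export_directory: str):
    """Convert a Hub model to GGML, quantize it, and load the result.

    Assumed API: the llm_rs names used below are taken from the project's
    README and are not confirmed for this exact release.
    """
    # Imports are deferred so this file can be imported without llm-rs installed.
    from llm_rs import AutoModel, AutoQuantizer
    from llm_rs.convert import AutoConverter

    # Download the model from the Hub and write a GGML file to export_directory.
    converted_model = AutoConverter.convert(base_model, export_directory)

    # Optional step: quantize the converted model.
    quantized_model = AutoQuantizer.quantize(converted_model)

    # No architecture argument is passed here; the release note says it is
    # not needed for models converted this way.
    return AutoModel.load(quantized_model)
```

A call such as `convert_quantize_load("EleutherAI/pythia-410m", "path/to/directory")` would then return a loaded model; both arguments are placeholders.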

Added quantization support

19 May 16:33
c691fee

The ability to quantize models is now available for every architecture via quantize.
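For readers unfamiliar with what quantizing a model involves, here is a minimal pure-Python sketch of block-wise 4-bit quantization. It is an illustration of the general idea only, not the GGML implementation or this library's `quantize` function; the block size, the rounding scheme, and the function names `quantize_block` and `dequantize_block` are all assumptions made for the example.

```python
from typing import List, Tuple

BLOCK_SIZE = 32  # assumed block length, chosen only for this illustration


def quantize_block(values: List[float]) -> Tuple[float, List[int]]:
    """Map one block of floats to a shared scale plus signed 4-bit codes."""
    max_abs = max(abs(v) for v in values)
    # One scale per block; the largest magnitude maps to code +/-7.
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the signed 4-bit range [-8, 7].
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return scale, codes


def dequantize_block(scale: float, codes: List[int]) -> List[float]:
    """Recover approximate floats from the scale and the 4-bit codes."""
    return [scale * c for c in codes]


weights = [i / 10 - 1.5 for i in range(BLOCK_SIZE)]
scale, codes = quantize_block(weights)
restored = dequantize_block(scale, codes)
```

Each restored value differs from the original by at most half the block's scale, which is the accuracy traded for the smaller storage size.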

LoRA & MPT Support

19 May 09:40
d12f123
  • Added support for Mosaic ML's MPT models.
  • Added support for LoRA adapters for all architectures.

⚠️ Caution ⚠️
Due to changes in the ggml format, old quantized models are no longer supported!

Tokenization & GIL free generation

08 May 14:10
659a3d2

Added the tokenize and decode functions to each model to give access to the internal tokenizer.

Token generation is now GIL-free, meaning other Python threads can run in the background while the model generates.
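What GIL-free generation makes possible can be sketched with the standard library alone. In the example below, `fake_generate` is a hypothetical stand-in for a model's generate call (it uses `time.sleep`, which also releases the GIL), and `run_with_heartbeat` is an illustrative helper; neither is part of this library.

```python
import threading
import time
from typing import List, Tuple


def fake_generate(prompt: str) -> str:
    """Hypothetical stand-in for a blocking, GIL-releasing generate call."""
    time.sleep(0.2)  # time.sleep releases the GIL while it waits
    return prompt + " ..."


def run_with_heartbeat(prompt: str) -> Tuple[str, int]:
    """Run generation while a background thread keeps recording ticks."""
    ticks: List[float] = []
    stop = threading.Event()

    def heartbeat() -> None:
        # Keeps running only because the main thread is not holding the GIL.
        while not stop.is_set():
            ticks.append(time.monotonic())
            time.sleep(0.01)

    worker = threading.Thread(target=heartbeat, daemon=True)
    worker.start()
    try:
        text = fake_generate(prompt)
    finally:
        stop.set()
        worker.join()
    return text, len(ticks)


text, tick_count = run_with_heartbeat("The meaning of life is")
```

If the generate call held the GIL for its whole duration, the heartbeat thread would make little or no progress until generation finished.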

Support Multiple Model Architectures

03 May 15:17
640e7fc

Since llama-rs was renamed to llm and now supports multiple model architectures, this wrapper was also expanded to support the new trait system and library structure.

Supported architectures for now:

  • Llama
  • GPT2
  • GPTJ
  • GPT-NeoX
  • Bloom

The loader was also reworked and now supports the mmap-able ggjt format. To support this, the SessionConfig was expanded with the prefer_mmap field.

Added SessionConfig options for Model

19 Apr 10:26
0.0.2

Added SessionConfig

Basic Functionality

18 Apr 19:06
e7c71e8
0.0.1

Update Cargo.toml