Add LLM 4-bit quantization & update LLM pipeline #32
Conversation
Left some minor comments, otherwise LGTM
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.float16 if self.device == "cuda" else "auto",
    device_map=self.device,
)
print("Done!")
return model, tokenizer, pipe
@HonzaCuhel what is the reason for using a pipeline here?
Since I changed the code to use the pipeline in all the other LM classes, I thought it should be consistent everywhere. But I can change it, no problem.
Ok, makes sense
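For context on the loading path discussed above, here is a minimal sketch of how the 4-bit quantization this PR adds could plug into the same pipeline-based setup, assuming Hugging Face transformers with bitsandbytes; the checkpoint name and config values are illustrative, not necessarily what the PR uses:

import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    pipeline,
)

model_name = "mistralai/Mistral-7B-Instruct-v0.2"  # placeholder checkpoint

# NF4 4-bit quantization with fp16 compute (illustrative settings)
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=quant_config,
    device_map="auto",  # quantized weights must be placed on GPU
)

# The pipeline wraps the already-quantized model, so no dtype/device args are needed here
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)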
def _test_prompt(self, prompt: str, selected_objects: List[str]) -> bool:
    """Tests if the generated prompt is valid based on selected objects.

    Args:
        prompt (str): The generated prompt.
        selected_objects (List[str]): Objects to check in the prompt.

    Returns:
        bool: True if the prompt is valid, False otherwise.
    """
    return prompt.lower().startswith(
        "a photo of"
    )  # and all(obj.lower() in prompt.lower() for obj in selected_objects)
@HonzaCuhel do you have any other ideas on how to ensure the prompt follows the template?
Well, when I was experimenting with TinyLlama I tried several approaches. First, I instructed the LM through the prompt to generate a response following this template, but the generated responses weren't satisfactory. I also tried editing the prompt so that the words "A photo of" appeared explicitly after the instruction-end token, forcing the generated responses to start with the template; however, with TinyLlama (I haven't tried it with Mistral) this confused the LM and it generated gibberish. There are other ways to restrict the generated tokens, but when I tried them some time ago they didn't work very well, though that could have changed since.
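For reference, the prefix-seeding variant described above might look roughly like this: a sketch assuming a chat-template tokenizer and the `pipe` object from earlier; the instruction text and generation settings are placeholders:

messages = [
    {"role": "user", "content": "Describe a scene containing: " + ", ".join(selected_objects)}
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
prompt += "A photo of"  # seed the template right after the instruction-end token
output = pipe(prompt, max_new_tokens=60, return_full_text=False)[0]["generated_text"]
caption = "A photo of" + output  # re-attach the seeded prefix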
When I was experimenting with TinyLlama, I noticed that sometimes the generated prompts were still usable even though they didn't pass the test (e.g. "A picture of ..." or "Picture showing ..."). Of course, sometimes the prompts weren't good, so this test ensures at least some quality. Maybe we could extend the test to accept these cases ("A picture of", etc.), but it would be worth first measuring how often they occur to see whether the change is worth it.
I could look more into it, but I generally think we should keep simple regex tests to ensure the quality of the generated prompts: they are simple, fast, and relatively powerful.
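A sketch of what such an extended regex check could look like; the accepted prefixes beyond "a photo of" are hypothetical, pending the frequency measurement mentioned above:

import re
from typing import List

# Hypothetical accepted prefixes; only "a photo of" is in the current test
_PREFIX_RE = re.compile(r"^(a photo of|a picture of|picture showing)\b", re.IGNORECASE)

def _test_prompt(prompt: str, selected_objects: List[str]) -> bool:
    """Return True if the prompt starts with an accepted prefix and mentions every object."""
    if _PREFIX_RE.match(prompt.strip()) is None:
        return False
    lowered = prompt.lower()
    return all(obj.lower() in lowered for obj in selected_objects)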
This PR includes:
* Add quantization of Mistral
* Add quantization flag and add 8bit quantization
* Better printing
* Update quantization
* Add pipeline
* Update Mistral LM generation
* Update LM quantization
* Add unittests & update pipeline & prompt generation
* Correct tests
* Update version of Mistral, update docstrings & README.md
* Format code