
Adding batched generation support and SDXL-Lightning #38

Merged: 14 commits from dev into main, Mar 2, 2024
Conversation

@sokovninn (Member) commented Feb 29, 2024

  • Added batched prompt generation
  • Added batched image generation
  • Added batched image annotation
  • Added SDXL-Lightning
  • README modified
  • LLM quantization fix
  • OWLv2 processing fix
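The batched-generation features listed above all come down to processing prompts and images in fixed-size chunks instead of one at a time. A minimal sketch of the chunking idea, using a hypothetical `chunked` helper rather than the actual datadreamer API:

```python
def chunked(items, batch_size):
    """Yield successive batches of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

prompts = [f"a photo of object {i}" for i in range(10)]
batches = list(chunked(prompts, batch_size=4))
# 10 prompts with batch_size=4 -> three batches of sizes 4, 4, 2
```

Each batch would then be passed to the model in a single forward call, which is where the speedup comes from.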

HonzaCuhel and others added 9 commits February 1, 2024 11:11
* Change moving the model

* Set max_memory

* Update

* Update

* Remove comments

---------

Co-authored-by: Jan Čuhel <[email protected]>
* Add support for TinyLlama

* Update

* Remove caption reads sentences

* [Automated] Updated coverage badge

* Update LM tests

* Add TinyLlama to README.md

* Change helper functions to private

---------

Co-authored-by: GitHub Actions <[email protected]>
Co-authored-by: Nikita Sokovnin <[email protected]>
* Add quantization of Mistral

* Add quantization flag and add 8bit quantization

* Better printing

* Update quantization

* Add pipeline

* Update Mistral LM generation

* Update LM quantization

* Add unittests & update pipeline & prompt generation

* Correct tests

* Update version of Mistral, update docstrings & README.md

* Format code
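The quantization commits above amount to loading the LLM with reduced-precision weights. The core idea of int8 quantization can be sketched in plain Python (an illustrative toy with a single per-tensor scale, not the bitsandbytes implementation used in practice):

```python
def quantize_int8(weights):
    """Map float weights onto signed 8-bit integers with a single scale."""
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 representation."""
    return [q * scale for q in quantized]

weights = [0.5, -1.0, 0.25]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# every quantized value fits in [-127, 127];
# the round-trip error stays below one scale step
```

Storing 8-bit integers instead of 16/32-bit floats is what cuts the model's memory footprint roughly in half or to a quarter.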
* feature: add batched prompt generation

* feature: add --batch_size_prompt argument

* test: add simple argument test

* feature: add batched prompt generation speed measuring

* refactor: remove redundant print

* fix: change default batch_size_prompt to 64

* style: black formatting

* refactor: typo
* feature: add batched image generation

* test: modify image generation tests

* fix: modify examples

* fix: prompt objects weights

* docs: update args description

* docs: update prompt generation docstrings

* [Automated] Updated coverage badge

---------

Co-authored-by: GitHub Actions <[email protected]>
* Add batch annotation

* Update docs & add test & fix batched annotation

* Change default batch annotation

* Fix annotation tests

* Fix tests

* [Automated] Updated coverage badge

* Update annotation example & docstrings

* Fix formatting

* Fix docstring

* [Automated] Updated coverage badge

* refactor: replace annotate() with annotate_batch()

* feature: replace owlv2 resize

---------

Co-authored-by: Jan Cuhel <[email protected]>
Co-authored-by: GitHub Actions <[email protected]>
Co-authored-by: Nikita Sokovnin <[email protected]>
* feature: add SDXL-Lightning image generator

* fix: minor example fix

* feature: add prompt weighting to sdxl-lightning

* docs: update README

* test: add sdxl-lightning tests

* [Automated] Updated coverage badge

---------

Co-authored-by: GitHub Actions <[email protected]>
github-actions bot commented Feb 29, 2024

☂️ Python Coverage

current status: ✅

Overall Coverage

| Lines | Covered | Coverage | Threshold | Status |
|------:|--------:|---------:|----------:|:------:|
| 883 | 407 | 46% | 0% | 🟢 |

New Files

| File | Coverage | Status |
|------|---------:|:------:|
| datadreamer/image_generation/sdxl_lightning_image_generator.py | 24% | 🟢 |
| TOTAL | 24% | 🟢 |

Modified Files

| File | Coverage | Status |
|------|---------:|:------:|
| datadreamer/dataset_annotation/image_annotator.py | 92% | 🟢 |
| datadreamer/dataset_annotation/owlv2_annotator.py | 72% | 🟢 |
| datadreamer/dataset_annotation/utils.py | 40% | 🟢 |
| datadreamer/image_generation/__init__.py | 100% | 🟢 |
| datadreamer/image_generation/clip_image_tester.py | 52% | 🟢 |
| datadreamer/image_generation/image_generator.py | 25% | 🟢 |
| datadreamer/image_generation/sdxl_image_generator.py | 20% | 🟢 |
| datadreamer/image_generation/sdxl_turbo_image_generator.py | 26% | 🟢 |
| datadreamer/pipelines/generate_dataset_from_scratch.py | 44% | 🟢 |
| datadreamer/prompt_generation/lm_prompt_generator.py | 58% | 🟢 |
| datadreamer/prompt_generation/prompt_generator.py | 87% | 🟢 |
| datadreamer/prompt_generation/synonym_generator.py | 30% | 🟢 |
| datadreamer/prompt_generation/tinyllama_lm_prompt_generator.py | 82% | 🟢 |
| TOTAL | 56% | 🟢 |

updated for commit: de11ec6 by action🐍

github-actions bot commented Feb 29, 2024

Test Results

6 files, 6 suites, 46m 51s ⏱️
76 tests: 30 ✅ passed, 46 💤 skipped, 0 ❌ failed
456 runs: 180 ✅ passed, 276 💤 skipped, 0 ❌ failed

Results for commit de11ec6.

♻️ This comment has been updated with latest results.

@sokovninn changed the title from "Merge dev into main" to "Adding batched generation support and SDXL-Lightning" on Mar 1, 2024
@HonzaCuhel (Contributor) commented Mar 1, 2024

@sokovninn I have come across these two errors:

  • an LLM quantization error, which this PR fixes
  • an annotation issue when using simple prompting together with the LLM synonym generator:

```
ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (`input_ids` in this case) have excessive nesting (inputs type `list` where type `int` is expected).
```

I'd therefore hold off on the release until we have solved all of this.
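The ValueError above is raised because token sequences of different lengths cannot be stacked into one rectangular tensor; passing `padding=True` (and `truncation=True`) to the tokenizer call, as the error message itself suggests, pads every sequence in the batch to a common length. A toy sketch of what padding does (illustrative only, not the transformers internals):

```python
def pad_batch(sequences, pad_id=0):
    """Right-pad variable-length token-id lists into a rectangular batch."""
    max_len = max(len(seq) for seq in sequences)
    return [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]

batch = pad_batch([[101, 7592, 102], [101, 102]])
# every row now has the same length, so the batch can become one tensor
```

Truncation handles the opposite problem, clipping sequences that exceed the model's maximum length.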

@HonzaCuhel (Contributor) commented:

Fixed the second issue here.

HonzaCuhel and others added 2 commits March 1, 2024 13:02
* Added padding & truncation to the processing

* Format code

* fix: change labels dtype to int64

* format: black

---------

Co-authored-by: Jan Cuhel <[email protected]>
Co-authored-by: Nikita Sokovnin <[email protected]>
@sokovninn (Member, Author) commented:
All known issues are fixed. Thanks, @HonzaCuhel.
@HonzaCuhel @kozlov721 please review.

@HonzaCuhel (Contributor) left a review:

LGTM

@sokovninn sokovninn merged commit 342945e into main Mar 2, 2024
9 checks passed