
Adding batched generation support and SDXL-Lightning #38

Merged: 14 commits from dev into main, Mar 2, 2024
Conversation

@sokovninn (Member) commented Feb 29, 2024

  • Added batched prompt generation
  • Added batched image generation
  • Added batched image annotation
  • Added SDXL-Lightning
  • README modified
  • LLM quantization fix
  • OWLv2 processing fix
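The batched-generation features listed above all come down to processing prompts and images in fixed-size chunks instead of one at a time. A minimal sketch of the chunking idea, using a hypothetical `chunked` helper rather than the actual datadreamer API:

```python
def chunked(items, batch_size):
    """Yield successive batches of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

prompts = [f"a photo of object {i}" for i in range(10)]
batches = list(chunked(prompts, batch_size=4))
# 10 prompts with batch_size=4 -> three batches of sizes 4, 4, 2
```

Each batch would then be passed to the model in a single forward call, which is where the speedup comes from.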

HonzaCuhel and others added 9 commits February 1, 2024 11:11
* Change moving the model

* Set max_memory

* Update

* Update

* Remove comments

---------

Co-authored-by: Jan Čuhel <[email protected]>
* Add support for TinyLlama

* Update

* Remove caption reads sentences

* [Automated] Updated coverage badge

* Update LM tests

* Add TinyLlama to README.md

* Change helper functions to private

---------

Co-authored-by: GitHub Actions <[email protected]>
Co-authored-by: Nikita Sokovnin <[email protected]>
* Add quantization of Mistral

* Add quantization flag and add 8bit quantization

* Better printing

* Update quantization

* Add pipeline

* Update Mistral LM generation

* Update LM quantization

* Add unittests & update pipeline & prompt generation

* Correct tests

* Update version of Mistral, update docstrings & README.md

* Format code
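The quantization commits above amount to loading the LLM with reduced-precision weights. The core idea of int8 quantization can be sketched in plain Python (an illustrative toy with a single per-tensor scale, not the bitsandbytes implementation used in practice):

```python
def quantize_int8(weights):
    """Map float weights onto signed 8-bit integers with a single scale."""
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 representation."""
    return [q * scale for q in quantized]

weights = [0.5, -1.0, 0.25]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# every quantized value fits in [-127, 127];
# the round-trip error stays below one scale step
```

Storing 8-bit integers instead of 16/32-bit floats is what cuts the model's memory footprint roughly in half or to a quarter.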
* feature: add batched prompt generation

* feature: add --batch_size_prompt argument

* test: add simple argument test

* feature: add batched prompt generation speed measuring

* refactor: remove redundant print

* fix: change default batch_size_prompt to 64

* style: black formatting

* refactor: typo
* feature: add batched image generation

* test: modify image generation tests

* fix: modify examples

* fix: prompt objects weights

* docs: update args description

* docs: update prompt generation docstrings

* [Automated] Updated coverage badge

---------

Co-authored-by: GitHub Actions <[email protected]>
* Add batch annotation

* Update docs & add test & fix batched annotation

* Change default batch annotation

* Fix annotation tests

* Fix tests

* [Automated] Updated coverage badge

* Update annotation example & docstrings

* Fix formatting

* Fix docstring

* [Automated] Updated coverage badge

* refactor: replace annotate() with annotate_batch()

* feature: replace owlv2 resize

---------

Co-authored-by: Jan Cuhel <[email protected]>
Co-authored-by: GitHub Actions <[email protected]>
Co-authored-by: Nikita Sokovnin <[email protected]>
* feature: add SDXL-Lightning image generator

* fix: minor example fix

* feature: add prompt weighting to sdxl-lightning

* docs: update README

* test: add sdxl-lightning tests

* [Automated] Updated coverage badge

---------

Co-authored-by: GitHub Actions <[email protected]>
github-actions bot commented Feb 29, 2024

☂️ Python Coverage

current status: ✅

Overall Coverage

| Lines | Covered | Coverage | Threshold | Status |
|------:|--------:|---------:|----------:|:------:|
| 883 | 407 | 46% | 0% | 🟢 |

New Files

| File | Coverage | Status |
|------|---------:|:------:|
| datadreamer/image_generation/sdxl_lightning_image_generator.py | 24% | 🟢 |
| TOTAL | 24% | 🟢 |

Modified Files

| File | Coverage | Status |
|------|---------:|:------:|
| datadreamer/dataset_annotation/image_annotator.py | 92% | 🟢 |
| datadreamer/dataset_annotation/owlv2_annotator.py | 72% | 🟢 |
| datadreamer/dataset_annotation/utils.py | 40% | 🟢 |
| datadreamer/image_generation/__init__.py | 100% | 🟢 |
| datadreamer/image_generation/clip_image_tester.py | 52% | 🟢 |
| datadreamer/image_generation/image_generator.py | 25% | 🟢 |
| datadreamer/image_generation/sdxl_image_generator.py | 20% | 🟢 |
| datadreamer/image_generation/sdxl_turbo_image_generator.py | 26% | 🟢 |
| datadreamer/pipelines/generate_dataset_from_scratch.py | 44% | 🟢 |
| datadreamer/prompt_generation/lm_prompt_generator.py | 58% | 🟢 |
| datadreamer/prompt_generation/prompt_generator.py | 87% | 🟢 |
| datadreamer/prompt_generation/synonym_generator.py | 30% | 🟢 |
| datadreamer/prompt_generation/tinyllama_lm_prompt_generator.py | 82% | 🟢 |
| TOTAL | 56% | 🟢 |

updated for commit: de11ec6 by action🐍

github-actions bot commented Feb 29, 2024

Test Results

6 files, 6 suites, 46m 51s ⏱️
76 tests: 30 ✅ passed, 46 💤 skipped, 0 ❌ failed
456 runs: 180 ✅ passed, 276 💤 skipped, 0 ❌ failed

Results for commit de11ec6.

♻️ This comment has been updated with latest results.

@sokovninn changed the title from "Merge dev into main" to "Adding batched generation support and SDXL-Lightning" on Mar 1, 2024
@HonzaCuhel (Contributor) commented Mar 1, 2024

@sokovninn I have come across these two errors:

  • an LLM quantization error, which this PR fixes
  • an annotation issue when using simple prompting together with the LLM synonym generator:

```
ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (`input_ids` in this case) have excessive nesting (inputs type `list` where type `int` is expected).
```

I'd therefore hold off on the release until we have solved all of this.
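The ValueError above is raised because token sequences of different lengths cannot be stacked into one rectangular tensor; passing `padding=True` (and `truncation=True`) to the tokenizer call, as the error message itself suggests, pads every sequence in the batch to a common length. A toy sketch of what padding does (illustrative only, not the transformers internals):

```python
def pad_batch(sequences, pad_id=0):
    """Right-pad variable-length token-id lists into a rectangular batch."""
    max_len = max(len(seq) for seq in sequences)
    return [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]

batch = pad_batch([[101, 7592, 102], [101, 102]])
# every row now has the same length, so the batch can become one tensor
```

Truncation handles the opposite problem, clipping sequences that exceed the model's maximum length.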

@HonzaCuhel (Contributor) commented:

Fixed the second issue here.

HonzaCuhel and others added 2 commits March 1, 2024 13:02
* Added padding & truncation to the processing

* Format code

* fix: change labels dtype to int64

* format: black

---------

Co-authored-by: Jan Cuhel <[email protected]>
Co-authored-by: Nikita Sokovnin <[email protected]>
@sokovninn (Member, Author) commented:
All known issues are fixed. Thanks, @HonzaCuhel.
@HonzaCuhel @kozlov721 please review.

@HonzaCuhel (Contributor) left a review:

LGTM

@sokovninn sokovninn merged commit 342945e into main Mar 2, 2024
9 checks passed