Image Language Models and `ImageGeneration` task #1060

plaguss · 2024-11-14T11:58:31Z

Description

This PR adds a new module under models, models/image_generation, to hold the image generation models (InferenceEndpointsImageGeneration and OpenAIImageGeneration). It also adds two new base classes, ImageGenerationModel and AsyncImageGenerationModel, and a new ImageGeneration task.

Sample pipeline and dataset below. Note the distiset.transform_columns_to_image call: it is needed to push the dataset with the images stored as image objects instead of strings.
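The split between a synchronous and an asynchronous base class could look roughly like the sketch below. It is an illustration only: the class names SketchImageModel, SketchAsyncImageModel and EchoModel, and the generate/agenerate signatures, are hypothetical and are not taken from the code in this PR.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class SketchImageModel(ABC):
    """Hypothetical sync base: subclasses turn a prompt into image payloads."""

    @abstractmethod
    def generate(self, prompt: str, num_generations: int = 1) -> List[Dict[str, Any]]:
        ...


class SketchAsyncImageModel(SketchImageModel):
    """Hypothetical async base: subclasses only implement `agenerate`."""

    @abstractmethod
    async def agenerate(self, prompt: str, num_generations: int = 1) -> List[Dict[str, Any]]:
        ...

    def generate(self, prompt: str, num_generations: int = 1) -> List[Dict[str, Any]]:
        # Bridge: run the coroutine to completion so callers can stay synchronous.
        return asyncio.run(self.agenerate(prompt, num_generations))


class EchoModel(SketchAsyncImageModel):
    """Toy model that returns placeholder strings instead of real images."""

    async def agenerate(self, prompt: str, num_generations: int = 1) -> List[Dict[str, Any]]:
        return [{"images": [f"<image for: {prompt}>"]} for _ in range(num_generations)]


outputs = EchoModel().generate("a lighthouse at dusk", num_generations=2)
```

Keeping the sync entry point on the async base means a task only ever needs to call one method, whichever kind of model it is given.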

from datasets import load_dataset

from distilabel.models.image_generation import InferenceEndpointsImageGeneration
from distilabel.pipeline import Pipeline
from distilabel.steps import KeepColumns
from distilabel.steps.tasks import ImageGeneration

ds = load_dataset("dvilasuero/finepersonas-v0.1-tiny", split="train").select(range(3))

with Pipeline(name="image_generation_pipeline") as pipeline:
    igm = InferenceEndpointsImageGeneration(model_id="black-forest-labs/FLUX.1-schnell")

    img_generation = ImageGeneration(
        name="flux_schnell", image_generation_model=igm, input_mappings={"prompt": "persona"}
    )

    keep_columns = KeepColumns(columns=["persona", "model_name", "image"])

    img_generation >> keep_columns


if __name__ == "__main__":
    distiset = pipeline.run(use_cache=False, dataset=ds)
    # Save the images as `PIL.Image.Image`
    distiset = distiset.transform_columns_to_image("image")
    distiset.push_to_hub("plaguss/test-finepersonas-v0.1-tiny-flux-schnell")
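To illustrate why the column needs transforming, here is a minimal sketch of the string round trip, assuming the image column holds base64-encoded image bytes. The helper names image_bytes_to_str and str_to_image_bytes are made up for this example and are not the functions added in this PR.

```python
import base64


def image_bytes_to_str(raw: bytes) -> str:
    """Encode raw image bytes as a base64 ASCII string (fits in a text column)."""
    return base64.b64encode(raw).decode("ascii")


def str_to_image_bytes(encoded: str) -> bytes:
    """Decode the base64 string back into the original image bytes."""
    return base64.b64decode(encoded)


# Stand-in payload: the leading bytes of a JPEG file, not a complete image.
fake_jpeg = b"\xff\xd8\xff\xe0" + b"\x00\x10JFIF\x00"
encoded = image_bytes_to_str(fake_jpeg)
decoded = str_to_image_bytes(encoded)
```

With Pillow installed and real image bytes, PIL.Image.open(io.BytesIO(decoded)) would turn the decoded bytes into an image object, which is the form the dataset is pushed in.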

github-actions · 2024-11-14T11:59:58Z

Documentation for this PR has been built. You can view it at: https://distilabel.argilla.io/pr-1060/

codspeed-hq · 2024-11-14T12:07:07Z

CodSpeed Performance Report

Merging #1060 will not alter performance

Comparing vision-language-models (e9e6790) with develop (a8d02c2)

Summary

✅ 1 untouched benchmarks


davidberenstein1957 · 2024-11-18T16:11:53Z

@burtenshaw

davidberenstein1957 · 2024-11-20T09:34:28Z

docs/sections/how_to_guides/advanced/distiset.md

+if __name__ == "__main__":
+    distiset = pipeline.run(use_cache=False, dataset=ds)
+    # Save the images as `PIL.Image.Image`
+   distiset = distiset.transform_columns_to_image("image")


Suggested change

-   distiset = distiset.transform_columns_to_image("image")
+    distiset = distiset.transform_columns_to_image("image")

davidberenstein1957 · 2024-12-19T11:33:27Z

pyproject.toml

@@ -102,6 +102,7 @@ text-clustering = [
    "scikit-learn >= 1.4.1",
    "matplotlib >= 3.8.3",   # For the figure (even though it's optional)
 ]
+vision = ["Pillow >= 10.3.0"]  # To work with images.


I just got a PIL error due to some imports

/Users/davidberenstein/Documents/programming/argilla/distilabel/.venv/lib/python3.11/site-packages/distilabel/steps/tasks/text_generation_with_image.py:18 in <module>

   15 from typing import TYPE_CHECKING, Any, Literal, Union
   16
   17 from jinja2 import Template
❱  18 from PIL import Image
   19 from pydantic import Field
   20
   21 from distilabel.steps.tasks.base import Task

locals:
   Literal = typing.Literal
   TYPE_CHECKING = False
   Union = typing.Union

ModuleNotFoundError: No module named 'PIL'
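One common way to keep an optional dependency from breaking unrelated imports is to check for it only when the feature is actually used. The sketch below assumes that approach. require_optional and load_image are hypothetical helpers, not part of distilabel, and the vision extra name is the one added to pyproject.toml in this PR.

```python
import importlib.util


def require_optional(module_name: str, extra: str) -> None:
    """Raise a helpful ImportError if an optional top-level module is missing."""
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f'Install it with: pip install "distilabel[{extra}]"'
        )


def load_image(path: str):
    # Check at call time, so importing this module never fails without Pillow.
    require_optional("PIL", "vision")
    from PIL import Image  # deferred import, only reached when Pillow exists

    return Image.open(path)
```

With this pattern, users who never touch image tasks would not hit the ModuleNotFoundError, and those who do would get a message pointing at the extra to install.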

plaguss added 12 commits November 13, 2024 12:44

Add PIL for image processing

1f5e271

Add module to store vision language models

4fa1c10

First version of text-to-image with inference endpoints

b2d858a

Add text-to-image with OpenAI

4733c7d

Add image generation task

6164b3c

Redirect imports

b201baf

Redirect imports

88b8d51

Add image-generation icon

06db4e2

Add vision language models

6fe997a

Move vlms to ilms to make it the name more explicit

18cd75b

Update vlms to ilms

e8cfac5

Add image language models to the components gallery

727d5aa

plaguss added the enhancement New feature or request label Nov 14, 2024

plaguss added this to the 1.5.0 milestone Nov 14, 2024

plaguss self-assigned this Nov 14, 2024

plaguss requested a review from gabrielmbmb November 14, 2024 11:58

plaguss added 5 commits November 14, 2024 16:59

Refactor ILM and fix image saves when save_images=False

0c97eeb

Add example

0aaec8c

Update task to work saving images as JPEG artifact and raw base64 string

00af941

Add short tutorial example for image generation

5629464

Update examples with correct output format

ffed25e

plaguss marked this pull request as ready for review November 15, 2024 08:24

plaguss requested a review from dvsrepo November 15, 2024 11:51

plaguss added 5 commits November 18, 2024 06:30

Refactor ilm to image_generation

1745fd8

Add tests for openai image generation

943d922

Add base ImageGenerationModel classes to improve maintainability

40446f2

Add tests for inference endpoints

43964f7

Fix class names and types

0761894

plaguss added 7 commits November 18, 2024 09:23

Update docs with image generation models

b833e38

Update the distiset docs to include the new method

b9dd6d8

Update examples with the new behaviour

6f4846d

Create module for common operations on images

665a156

Update image generation task and distiset to transform the images before pushing to the hub

e737d04

Add tests for the distiset and image generation task

e08a90a

Define image generation models from zero

a84a2a3

plaguss added 13 commits November 19, 2024 09:14

Fixed openai tests mocking call to requests.get

7a601e1

Merge with develop

66658d8

Merge and fix conflict

26fe6e2

Make image_to_str more general

6a3d279

Create base ImageTask to deal with ImageGenerationModels

9c1ffc1

Replace with image_to_str function

aa6f9f5

Move import

94360f6

Fix MRO in class inheritance

10538ff

Fix example script

e98b307

Fix optional key not found in runtime parameters

8974ff0

Update examples and simplify process method

341622d

Add ImageTask to the API reference

c46fdc5

Update image task docs

e9e6790

davidberenstein1957 reviewed Dec 19, 2024
