Add transformers vision cookbook with atomic caption flow #1216

fearnworks · 2024-10-20T15:30:20Z

This PR responds to a request on Discord for an example of the new transformers vision capability.

Vision-Language Models with Outlines

This guide demonstrates how to use Outlines with vision-language models, leveraging the new transformers_vision module. Vision-language models can process both text and images, allowing for tasks like image captioning, visual question answering, and more.

We use the Pixtral-12B model from Mistral for its visual reasoning capabilities, together with a multistage workflow that generates an atomic caption.
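To make the multistage idea concrete, here is a minimal sketch of how such a flow might be chained. It is not the cookbook's code: the two stages, the `AtomicCaption` type, and the `generate` callable are hypothetical stand-ins, with `generate` assumed to wrap whatever structured vision-language generator is in use and to return JSON text.

```python
import json
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical signature: (prompt, image) -> JSON text. In practice this would
# wrap a structured generator built on a vision-language model.
Generate = Callable[[str, object], str]


@dataclass
class AtomicCaption:
    objects: List[str]  # stage 1 output: objects seen in the image
    caption: str        # stage 2 output: caption built from those objects


def atomic_caption(image: object, generate: Generate) -> AtomicCaption:
    """Run an assumed two-stage caption flow over a single image."""
    # Stage 1: ask only for the visible objects, as a JSON array of strings.
    objects = json.loads(
        generate("List the visible objects as a JSON array of strings.", image)
    )
    # Stage 2: ask for a caption that mentions the objects found in stage 1.
    prompt = (
        "Write a one-sentence caption as a JSON string mentioning: "
        + ", ".join(objects)
    )
    caption = json.loads(generate(prompt, image))
    return AtomicCaption(objects=objects, caption=caption)
```

Passing the generator in as a plain callable keeps the flow testable without loading a 12B-parameter model; a stub that returns canned JSON is enough to exercise both stages.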

Signed-off-by: jphillips <[email protected]>

rlouf · 2024-10-20T15:51:01Z

It's awesome! We'll need to link to it from mkdocs.yml and from the cookbooks' index page :)

fearnworks · 2024-10-20T15:59:39Z

> It's awesome! We'll need to link to it from mkdocs.yml and from the cookbooks' index page :)

Updated!

rlouf · 2024-10-21T10:59:42Z

Thank you so much for your contribution!

transformers vision cookbook with atomic caption flow

936334a

Signed-off-by: jphillips <[email protected]>

Fix style lint, add to indexes

1b8604b

Signed-off-by: jphillips <[email protected]>

fix precommit errors

fc987f9

rlouf changed the title ~~transformers vision cookbook with atomic caption flow~~ Add transformers vision cookbook with atomic caption flow Oct 21, 2024

rlouf merged commit a2fa1e0 into dottxt-ai:main Oct 21, 2024
6 checks passed
