The Llama 3.2-Vision collection of multimodal large language models (LLMs) consists of pretrained and instruction-tuned image-reasoning generative models in 11B and 90B sizes (text + images in / text out). The instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. Supported capabilities include:
- Visual recognition
- Image reasoning
- Captioning
- Answering general questions about an image
- Tool Calling
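
For context, the snippet below is a minimal, hedged sketch of how an image + text prompt can be run against Llama 3.2-Vision directly with Hugging Face `transformers`. It assumes access to the gated `meta-llama/Llama-3.2-11B-Vision-Instruct` checkpoint and sufficient GPU memory; it is only illustrative and is not part of this project's server or client code.

```python
# Illustrative sketch (not part of this repo): querying Llama 3.2-Vision 11B Instruct
# directly with Hugging Face transformers. Assumes access to the gated checkpoint.
import torch
from PIL import Image
from transformers import MllamaForConditionalGeneration, AutoProcessor

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"
model = MllamaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# One user turn containing an image placeholder plus a text question.
image = Image.open("cocktail-ingredients.jpg")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "What cocktail can I make with these ingredients?"},
        ],
    }
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```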
To get started with this project, follow these steps:
- Clone the repository:

  ```bash
  git clone https://github.com/bhimrazy/chat-with-llama-3.2-vision
  cd chat-with-llama-3.2-vision
  ```
- Install the required dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Run the server:

  ```bash
  export HF_TOKEN=your_huggingface_token  # required for model download
  python server.py
  ```
- Run the client/app:

  To test using the Python client, run:

  ```bash
  python client.py --image=cocktail-ingredients.jpg --prompt="What cocktail can I make with these ingredients?"
  ```

  To launch the Streamlit app, run:

  ```bash
  streamlit run app.py
  ```
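
Alternatively, the running server can be queried programmatically. The sketch below is assumption-laden: it presumes the server listens on `http://localhost:8000` and exposes an OpenAI-compatible `/v1/chat/completions` endpoint that accepts base64 image data URLs, and the model name shown is made up for illustration. `client.py` in this repository is the authoritative reference for the actual request format.

```python
# Hedged sketch: querying the local server, assuming it exposes an
# OpenAI-compatible chat completions API on port 8000 (see client.py for
# the actual request format used by this project).
import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

# Encode the image as a base64 data URL so it can be sent inline.
with open("cocktail-ingredients.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="llama-3.2-11b-vision-instruct",  # assumed model name, for illustration only
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What cocktail can I make with these ingredients?"},
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```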