
Llama3.2-vision and Qwen2-VL Support #4

Open
boom-bang opened this issue Dec 9, 2024 · 4 comments

Comments

@boom-bang

Amazing and impressive work. Any plans for the support of new generation multimodal LLMs such as Llama-3.2-11B-Vision or Qwen2-VL?

@Yangsenqiao
Collaborator

Thank you for your interest in our work! I appreciate the suggestions and will try applying VisionZip to these models and exploring its performance in the future. We also look forward to community pull requests🔥.
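Since VisionZip's core idea (keeping the most-attended "dominant" visual tokens and merging the remainder into a few averaged "contextual" tokens) does not depend on a particular backbone, extending it to Llama-3.2-Vision or Qwen2-VL should mainly be a matter of hooking the selection step into each model's vision encoder. A minimal NumPy sketch of that selection step, with illustrative function and parameter names (not taken from the repo):

```python
import numpy as np

def select_visual_tokens(tokens, attn_scores, num_dominant=4, num_contextual=2):
    """Reduce N visual tokens to num_dominant + num_contextual tokens.

    tokens:      (N, D) array of visual token embeddings.
    attn_scores: (N,) attention each token receives (e.g. from the [CLS] query).
    """
    # Rank tokens by attention, highest first.
    order = np.argsort(attn_scores)[::-1]
    dominant = tokens[order[:num_dominant]]
    rest = tokens[order[num_dominant:]]
    # Merge the remaining tokens into a few averaged contextual tokens.
    groups = np.array_split(rest, num_contextual)
    contextual = np.stack([g.mean(axis=0) for g in groups])
    return np.concatenate([dominant, contextual], axis=0)

rng = np.random.default_rng(0)
tokens = rng.standard_normal((16, 8))   # 16 toy visual tokens of dim 8
scores = rng.random(16)
reduced = select_visual_tokens(tokens, scores)
print(reduced.shape)  # (6, 8): 16 tokens compressed to 4 dominant + 2 contextual
```

The actual method merges leftover tokens by feature similarity rather than by rank order as shown here; this only illustrates where per-model integration work would go.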

@effortprogrammer

effortprogrammer commented Dec 12, 2024

@Yangsenqiao Do you have any plans to implement VisionZip in the Hugging Face Transformers environment? The current implementation is tied to the original LLaVA GitHub repo, so a Transformers-based version would be useful for other people.

@Yangsenqiao
Collaborator

Thank you for your recommendation! We plan to provide a version compatible with Hugging Face in the future, but it may not be in the next few weeks as my final exams are approaching (T▽T).

@effortprogrammer

@Yangsenqiao Let me know when you start working on it. I will do my best to help as much as possible.
