Great Work! Ask for the Demo #3

BigDreamer11111 · 2024-12-09T16:12:00Z

I’m Denny, a PhD student with a strong interest in Vision-Language Models. I found your demo particularly fascinating, especially its ability to interactively select visual tokens. I believe this feature could be incredibly useful for assessing the importance of different tokens.

Would you be able to share the code for the demo or provide guidance on how to develop a similar implementation?

Thank you in advance for your time and assistance!

Best regards,
Denny

Yangsenqiao · 2024-12-10T16:16:20Z

Hi Denny,

Thank you for your interest in our work! Because this demo involves some modifications to the LLaVA code itself, it may not be as straightforward or concise to release as VisionZip. Please allow me some time to consider an appropriate way to release it.

Best,
Senqiao

Yangsenqiao mentioned this issue Dec 10, 2024

Release VisionZip model checkpoints on Hugging Face #2

Open
