Can VisionZip be used on VLMs deployed on NPU? #6
Hi, to be honest, I am not very familiar with the NPU. However, VisionZip is designed to reduce redundant visual tokens before they are fed into the LLM, so we believe it can be applied and deployed alongside most LLM acceleration algorithms. As for NPU deployment: if the raw VLM can be deployed on the NPU, then VisionZip should be applicable as well. Best regards,
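For intuition, here is a minimal sketch of where that token reduction sits in a VLM forward pass: between the vision encoder and the LLM. The function names (`vlm_forward`, `reduce_tokens`) are illustrative, not the VisionZip API.

```python
# Minimal sketch (illustrative names, not the VisionZip API) of where
# token reduction happens in a VLM: after the vision encoder, before the LLM.
import torch

def vlm_forward(vision_encoder, reduce_tokens, llm, image, text_embeds):
    visual_tokens = vision_encoder(image)         # e.g. (1, 576, d) from CLIP-L/14 @ 336px
    visual_tokens = reduce_tokens(visual_tokens)  # VisionZip-style: keep a small token subset
    inputs = torch.cat([visual_tokens, text_embeds], dim=1)  # shorter sequence into the LLM
    return llm(inputs)
```

Because the reduction happens before the LLM sees any tokens, it is orthogonal to most LLM-side acceleration (quantization, KV-cache tricks, speculative decoding), which is why it should compose with other deployment optimizations.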
Thanks for your reply! I need to convert the model's weights into ONNX format and then use an inference framework to run inference on an edge-side NPU. Should I apply VisionZip during the post-processing stage after converting to ONNX, or before converting to ONNX? Looking forward to your response.
Hi Boyu, I am not familiar with the NPU or ONNX either; I just quickly picked up some related knowledge through ChatGPT. GPT-4o suggested that VisionZip should be applied before converting to ONNX. Below is the answer GPT provided. Please note that it may not be entirely correct!

VisionZip should be applied before converting to ONNX.

Reasoning: ONNX export serializes a fixed computation graph. VisionZip changes the model's forward pass by pruning redundant visual tokens before they reach the LLM, so that change must already be part of the model at export time; applying it afterwards would mean editing the exported graph directly, which ONNX tooling does not support well.

Workflow: apply VisionZip to the PyTorch VLM, verify the modified model's outputs and accuracy, then export to ONNX and compile/deploy with the NPU vendor's inference framework (a sketch follows below).

This approach keeps everything streamlined and efficient for edge deployment. If you have any questions, please feel free to discuss them with me. Best regards,
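A hedged sketch of that workflow, assuming the `visionzip()` entry point and default token budgets shown in the repo's README, the standard LLaVA loader, and a plausible export of just the vision side. The model path, input shape, file name, and the decision to export only the vision tower are assumptions for illustration; whether the modified forward traces cleanly under `torch.onnx.export` should be verified.

```python
# Sketch: apply VisionZip first, then export, so the reduced-token graph is
# what the NPU toolchain sees. Signatures of visionzip() and the export
# details are assumptions to verify against the repo.
import torch
from llava.model.builder import load_pretrained_model  # LLaVA loader used by VisionZip
from visionzip import visionzip

tokenizer, model, image_processor, _ = load_pretrained_model(
    model_path="liuhaotian/llava-v1.5-7b", model_base=None, model_name="llava-v1.5-7b"
)
model = visionzip(model, dominant=54, contextual=10)  # bake token reduction into the model
model.eval()

# Export the vision side (encoder + VisionZip); the LLM is typically
# exported or compiled separately for the NPU.
vision_tower = model.get_vision_tower()
dummy_image = torch.randn(1, 3, 336, 336)  # CLIP-L/14 @ 336px input (assumed)
torch.onnx.export(
    vision_tower, dummy_image, "vision_tower_visionzip.onnx",
    input_names=["pixel_values"], output_names=["visual_tokens"],
    opset_version=17,
)
```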