FlexAttention? #1685
Can you add some more context here, @johnnynunez?
I want to quantize the lerobot pizero model, which uses FlexAttention. @drisspg, that's the context.
Currently all of our quantization APIs target linear layers and are orthogonal to FlexAttention, so yes, it should work. FlexAttention itself doesn't support low-precision inputs yet; that is planned, but there's no ETA.
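For illustration, here is a minimal sketch of what "orthogonal" means in practice, assuming torchao's `quantize_` / `int8_weight_only` API and PyTorch's `flex_attention` (the toy module, dimensions, and names are hypothetical, not taken from the lerobot pizero model):

```python
import torch
import torch.nn as nn
from torch.nn.attention.flex_attention import flex_attention
from torchao.quantization import quantize_, int8_weight_only

class TinyFlexBlock(nn.Module):
    """Toy attention block: nn.Linear projections around a flex_attention call."""
    def __init__(self, dim: int = 64, num_heads: int = 4):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, 3 * dim)  # quantizable linear
        self.proj = nn.Linear(dim, dim)     # quantizable linear

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, s, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # flex_attention expects (batch, heads, seq, head_dim)
        q, k, v = (t.view(b, s, self.num_heads, self.head_dim).transpose(1, 2)
                   for t in (q, k, v))
        out = flex_attention(q, k, v)  # runs in the activations' dtype
        return self.proj(out.transpose(1, 2).reshape(b, s, d))

device = "cuda" if torch.cuda.is_available() else "cpu"
model = TinyFlexBlock().to(device)

# quantize_ swaps the weights of matching nn.Linear modules in place;
# the flex_attention call itself is untouched, which is why the two
# features are orthogonal.
quantize_(model, int8_weight_only())

y = model(torch.randn(2, 16, 64, device=device))
print(y.shape)  # torch.Size([2, 16, 64])
```

Only the `nn.Linear` weights end up in int8 here; the attention math stays in the original precision, consistent with FlexAttention not yet supporting low-precision inputs.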
Thanks! I'm going to try it.
Let me know if anything comes up!
Is it compatible with FlexAttention from PyTorch 2.6.0?