
FlexAttention? #1685

Open
johnnynunez opened this issue Feb 10, 2025 · 6 comments

@johnnynunez

Is it compatible with FlexAttention from PyTorch 2.6.0?

@jainapurva
Contributor

@drisspg

@drisspg
Contributor

drisspg commented Feb 10, 2025

Can you add some more context here, @johnnynunez?

@johnnynunez
Author

johnnynunez commented Feb 10, 2025


I want to quantize the LeRobot pi0 model, which uses FlexAttention. @drisspg
https://huggingface.co/blog/pi0

Context (from the FlexAttention blog post):
In the future, we plan on extending this support to allow for quantized versions of attention or things like RadixAttention as well.

https://pytorch.org/blog/flexattention/
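
For reference, this is roughly what the FlexAttention API from that blog post looks like; the relative-position score_mod below is the illustrative example from the post, and the shapes are arbitrary toy values:

```python
import torch
from torch.nn.attention.flex_attention import flex_attention

# score_mod hook: add a relative-position bias to each attention score
# (the example used in the FlexAttention blog post linked above).
def relative_positional(score, b, h, q_idx, kv_idx):
    return score + (q_idx - kv_idx)

# (batch, heads, seq_len, head_dim): arbitrary toy shapes
q = torch.randn(1, 4, 128, 64)
k = torch.randn(1, 4, 128, 64)
v = torch.randn(1, 4, 128, 64)

# Runs eagerly here; in practice flex_attention is usually wrapped in
# torch.compile to get the fused kernel.
out = flex_attention(q, k, v, score_mod=relative_positional)
```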

@drisspg
Contributor

drisspg commented Feb 10, 2025

So currently all of our quantization APIs target linear layers and are orthogonal to FlexAttention; therefore, yes, FlexAttention should work. FlexAttention itself doesn't currently support low-precision inputs; that is planned, but there's no ETA yet.
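
To make that separation concrete, here is a minimal sketch (assuming torchao's quantize_ / int8_weight_only API and a CUDA device; the ToyAttentionBlock module is purely hypothetical): quantize_ swaps out only the nn.Linear weights, while the flex_attention call keeps running on regular bf16 activations.

```python
import torch
import torch.nn as nn
from torch.nn.attention.flex_attention import flex_attention
from torchao.quantization import quantize_, int8_weight_only

class ToyAttentionBlock(nn.Module):
    """Hypothetical block: the nn.Linear layers are what torchao quantizes."""
    def __init__(self, dim=64, num_heads=4):
        super().__init__()
        self.num_heads, self.head_dim = num_heads, dim // num_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        b, s, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape to (batch, heads, seq, head_dim), the layout flex_attention expects.
        q, k, v = (t.view(b, s, self.num_heads, self.head_dim).transpose(1, 2)
                   for t in (q, k, v))
        out = flex_attention(q, k, v)  # untouched by quantization
        return self.proj(out.transpose(1, 2).reshape(b, s, d))

model = ToyAttentionBlock().to("cuda", torch.bfloat16)
quantize_(model, int8_weight_only())  # only affects the nn.Linear weights

x = torch.randn(2, 128, 64, device="cuda", dtype=torch.bfloat16)
y = model(x)
```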

@johnnynunez
Author


Thanks! I'm going to try it.

@drisspg
Contributor

drisspg commented Feb 10, 2025

Let me know if anything comes up!
