
FlexAttention? #1685

Open
johnnynunez opened this issue Feb 10, 2025 · 6 comments

@johnnynunez

Is it compatible with FlexAttention from PyTorch 2.6.0?

@jainapurva
Contributor

@drisspg

@drisspg
Contributor

drisspg commented Feb 10, 2025

Can you add some more context here, @johnnynunez?

@johnnynunez
Author

johnnynunez commented Feb 10, 2025


I want to quantize the LeRobot pi0 model, which uses FlexAttention. @drisspg
https://huggingface.co/blog/pi0

Context (from the FlexAttention blog post):
In the future, we plan on extending this support to allow for quantized versions of attention or things like RadixAttention as well.

https://pytorch.org/blog/flexattention/
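
For reference, this is roughly what the FlexAttention API from that blog post looks like; the relative-position score_mod below is the illustrative example from the post, and the shapes are arbitrary toy values:

```python
import torch
from torch.nn.attention.flex_attention import flex_attention

# score_mod hook: add a relative-position bias to each attention score
# (the example used in the FlexAttention blog post linked above).
def relative_positional(score, b, h, q_idx, kv_idx):
    return score + (q_idx - kv_idx)

# (batch, heads, seq_len, head_dim): arbitrary toy shapes
q = torch.randn(1, 4, 128, 64)
k = torch.randn(1, 4, 128, 64)
v = torch.randn(1, 4, 128, 64)

# Runs eagerly here; in practice flex_attention is usually wrapped in
# torch.compile to get the fused kernel.
out = flex_attention(q, k, v, score_mod=relative_positional)
```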

@drisspg
Contributor

drisspg commented Feb 10, 2025

So currently all of our quantization APIs target linear layers and are orthogonal to FlexAttention; therefore, yes, FlexAttention should work. FlexAttention itself doesn't currently support low-precision inputs; that is planned, but there's no ETA yet.
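
To make that separation concrete, here is a minimal sketch (assuming torchao's quantize_ / int8_weight_only API and a CUDA device; the ToyAttentionBlock module is purely hypothetical): quantize_ swaps out only the nn.Linear weights, while the flex_attention call keeps running on regular bf16 activations.

```python
import torch
import torch.nn as nn
from torch.nn.attention.flex_attention import flex_attention
from torchao.quantization import quantize_, int8_weight_only

class ToyAttentionBlock(nn.Module):
    """Hypothetical block: the nn.Linear layers are what torchao quantizes."""
    def __init__(self, dim=64, num_heads=4):
        super().__init__()
        self.num_heads, self.head_dim = num_heads, dim // num_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        b, s, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape to (batch, heads, seq, head_dim), the layout flex_attention expects.
        q, k, v = (t.view(b, s, self.num_heads, self.head_dim).transpose(1, 2)
                   for t in (q, k, v))
        out = flex_attention(q, k, v)  # untouched by quantization
        return self.proj(out.transpose(1, 2).reshape(b, s, d))

model = ToyAttentionBlock().to("cuda", torch.bfloat16)
quantize_(model, int8_weight_only())  # only affects the nn.Linear weights

x = torch.randn(2, 128, 64, device="cuda", dtype=torch.bfloat16)
y = model(x)
```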

@johnnynunez
Author


Thanks! I'm going to try it.

@drisspg
Contributor

drisspg commented Feb 10, 2025

Let me know if anything comes up!
