[BUG] AMD has updated their FA2 fork quite some time ago #272

Open
IMbackK opened this issue Jan 13, 2025 · 4 comments
Labels
bug (Something isn't working), help wanted (Extra attention is needed)

Comments


IMbackK commented Jan 13, 2025

OS

Windows

GPU Library

CUDA 12.x

Python version

3.12

Describe the bug

"(30 series) or newer. AMD GPUs are not supported."
is in appropriate as a blanket statement and condition, fa2 is up to date and works fine on amd gpus (CDNA only) with exllamav2

Reproduction steps

The upstream https://github.com/Dao-AILab/flash-attention repository now contains AMD support.

Expected behavior

AMD CDNA GPUs should be considered supported just as well as Ampere and newer.

Logs

No response

Additional context

No response

Acknowledgements

  • I have looked for similar issues before submitting this one.
  • I have read the disclaimer, and this issue is related to a code bug. If I have a question, I will use the Discord server.
  • I understand that the developers have lives and my issue will be answered when possible.
  • I understand the developers of this program are human, and I will ask my questions politely.
IMbackK added the bug label on Jan 13, 2025
DocShotgun (Member) commented

It's true that FA2 does have AMD ROCm support now (I've used it for model finetuning purposes); however, there are several issues that will limit its usability:

  1. There are no official wheels. While we could build wheels for ROCm, the last time I tried (when I was working on training), the build time exceeded the maximum allowed runtime of 6 hours for the free GH Actions runner. Perhaps there would be a way to optimize around this; otherwise it would need to be self-built by power users.
  2. Only certain AMD GPUs are supported, and there would need to be some kind of architecture check for this as well (an illustrative probe is sketched after this comment). It's not supported on consumer-grade AMD GPUs IIRC, which is rather limiting. For larger enterprise server type setups, there are other more ideal inference backends besides TabbyAPI/ExLlamaV2.

I think overall it would be a fairly niche use case for power users that would probably be too low-yield to integrate into the automated TabbyAPI installation. Perhaps there could be a way to detect a working AMD+FA2 setup and not force compatibility mode.
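
As a rough illustration of the kind of architecture check mentioned in point 2, here is a sketch only: it assumes a ROCm build of PyTorch, and both the gfx allow-list and the gcnArchName property are assumptions that would need to be verified against the flash-attention ROCm documentation.

```python
import torch

# Illustrative allow-list; the real set of supported architectures should be
# taken from the flash-attention ROCm docs (CDNA2/CDNA3 shown here as examples).
SUPPORTED_GFX = ("gfx90a", "gfx942")

def amd_arch_supported(device_index: int = 0) -> bool:
    # torch.version.hip is None on CUDA builds and a version string on ROCm builds
    if torch.version.hip is None:
        return False
    props = torch.cuda.get_device_properties(device_index)
    # ROCm builds of PyTorch expose the arch name, e.g. "gfx90a:sramecc+:xnack-"
    arch = getattr(props, "gcnArchName", "")
    return any(arch.startswith(gfx) for gfx in SUPPORTED_GFX)
```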

IMbackK (Author) commented Jan 14, 2025

I'm not worried about automatic installation, but TabbyAPI should not refuse to use FA2 just because it's an AMD GPU. I think on AMD we should not try to install anything, but if FA2 is installed on an AMD system we should assume it's the correct AMD-compiled version and use it. This could be achieved by checking for HIP, trying to import flash_attn, and using it if that works, or choosing compat mode if the import raises a ModuleNotFoundError (see the sketch below).
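
A minimal sketch of that fallback logic, assuming TabbyAPI can key off torch.version.hip to detect a ROCm build; the helper name is hypothetical, not TabbyAPI's actual code:

```python
import torch

def should_use_flash_attn() -> bool:
    # Hypothetical helper: on ROCm, never try to install flash-attn
    # automatically; just use it if the user has already installed a working
    # AMD build, otherwise fall back to compatibility mode.
    if torch.version.hip is not None:
        try:
            import flash_attn  # noqa: F401
            return True
        except ModuleNotFoundError:
            return False
    # NVIDIA path: keep the existing compute-capability checks here.
    return False
```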

IMbackK (Author) commented Jan 14, 2025

Btw, flash_attn does support RDNA3, so there is support for at least some consumer GPUs.

bdashore3 (Member) commented

FA2 being supported on ROCm is a big step forward for the AMD side of AI.

However, the important thing for FA2 and tabby is whether there's paged attention support. This is what allows use of the batching engine.

IIRC the ROCm version has batching, but I have neither an AMD card nor wheels to test with.

Therefore, this will have to be a PR'd feature with the goal of autodetection (a rough runtime probe for paged attention is sketched below).
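
One possible probe for the paged-attention question, as a sketch only: it assumes the ROCm build of flash-attn exposes the same flash_attn_with_kvcache entry point as the CUDA wheels, which is exactly the detail that would need to be confirmed.

```python
import importlib

def fa2_supports_paged_attention() -> bool:
    # Best-effort probe: the batching engine needs paged attention, which in
    # practice means the installed flash_attn must expose flash_attn_with_kvcache.
    try:
        fa = importlib.import_module("flash_attn")
    except ModuleNotFoundError:
        return False
    return hasattr(fa, "flash_attn_with_kvcache")
```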

bdashore3 added the help wanted label on Jan 25, 2025