
Does Bitnet support ROCm testing? #281

Open
ym-guan opened this issue Jan 14, 2025 · 4 comments
Labels: enhancement (New feature or request)


ym-guan commented Jan 14, 2025

Hi Lei, I'm trying to reproduce the BitNet results on an MI308, but I'm running into some errors.

Running the main branch gives me this error:

[screenshot of error]

I tried modifying the code to support ROCm like this:

[screenshot of changes]

Unfortunately I failed; it told me:

[screenshot of error]

I guess there is no scheduler function for the MI308 in the tl backend.

So I checked out the "amd_hip" branch, ran it, and found this error:

[screenshot of error]

That branch uses the tir backend, and there are a lot of differences between "amd_hip" and "main". I have two questions: does the tl backend support ROCm? Does BitNet support testing on ROCm?

Please tell me how to test BitNet on ROCm, thanks!

My system:

ubuntu==20.04
rocm==6.3.1
torch==2.5.1+rocm6.2
GPU==MI308

LeiWang1999 (Contributor) commented Jan 14, 2025

@ym-guan Thank you for reporting this. We've recently migrated the backend from TensorIR to TileLang. While we do plan to support ROCm and CDNA matrix-core code, and already have implementations, they haven't been migrated yet. I'll work on adding this as soon as possible; I'm currently juggling two upcoming deadlines, so I'll take a look at this next week.

@LeiWang1999 LeiWang1999 self-assigned this Jan 14, 2025
@LeiWang1999 LeiWang1999 added the enhancement New feature or request label Jan 14, 2025

LeiWang1999 commented Jan 14, 2025

@ym-guan Btw, I noticed that you've made some great modifications to support ROCm; awesome work! It looks like the bug might be in this section, where the kernel builder is dispatched to CUDA instead of HIP in certain places.
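For context, the failure mode described here (the builder falling back to CUDA on a HIP target) can be sketched as a small target-dispatch check. The function name and target strings below are illustrative only, not BitBLAS's actual API:

```python
# Hypothetical sketch of target-based kernel-builder dispatch; names are
# illustrative, not the real BitBLAS code path. The bug class described above
# is a dispatcher that falls back to CUDA even when the target is HIP/ROCm.

def select_kernel_builder(target: str) -> str:
    """Pick a code-generation backend from a TVM-style target string."""
    t = target.lower()
    if t.startswith("hip") or "rocm" in t:
        return "hip"    # route MI-series (CDNA) GPUs to the HIP builder
    if t.startswith("cuda"):
        return "cuda"
    raise ValueError(f"unsupported target: {target}")
```

If any branch of the real dispatcher omits the HIP case, every call site that reaches it will silently generate CUDA code, which matches the symptom above.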

Feel free to ask any questions :)

ym-guan (Author) commented Jan 14, 2025

Thank you @LeiWang1999! I followed your instructions and modified the code further. I did get the matrix-core path working (https://github.com/microsoft/BitBLAS/blob/main/testing/python/amd/test_backend_hip_wrapper_matmul.py#L54). However, it doesn't support matmul_backend_code_wrap(1, 256, 256, "float16", "float16", "uint4", "float16", False):

[screenshot of error]

This is because the scheduler_ir_module cannot be lowered correctly by tl.lower() (https://github.com/microsoft/BitBLAS/blob/main/bitblas/ops/operator.py#L195); it told me:

[screenshot of error]

I guess the MatmulDequantizeScheduler cannot be translated to HIP correctly. If it can be, please correct me.

In short, when I test a matmul with a config like int8xint2, I get this error:

/tl/tl_templates/cuda/common.h:3:10: fatal error: 'cuda_runtime.h' file not found

But it should be compiled through HIP. How can I do that?
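The 'cuda_runtime.h' file not found error suggests the compile step is handing the HIP toolchain CUDA-only templates. A minimal sketch of target-aware compiler selection is below; hipcc and nvcc are the real compiler drivers, but the function and flags are hypothetical, not the actual BitBLAS compile path:

```python
# Hypothetical sketch, not the real BitBLAS compile path: both the runtime
# header template and the compiler driver must match the target, otherwise
# the HIP toolchain is handed CUDA-only headers such as cuda_runtime.h.

def compile_command(target: str, src: str, out: str) -> list:
    if target == "hip":
        # HIP kernels include <hip/hip_runtime.h> and compile with hipcc
        return ["hipcc", "-O3", "--shared", "-fPIC", src, "-o", out]
    if target == "cuda":
        # CUDA kernels include <cuda_runtime.h> and compile with nvcc
        return ["nvcc", "-O3", "--shared", "-Xcompiler=-fPIC", src, "-o", out]
    raise ValueError(f"unknown target: {target}")
```

In the reported error both halves went wrong the same way: the template under /tl/tl_templates/cuda/ was selected, so the generated source pulls in cuda_runtime.h regardless of which compiler runs.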

Thanks for your reply!

LeiWang1999 (Contributor) commented:

PR #282 made a simple fix, but it only supports target inference and consistent matmul (A and W in the same datatype); the dequantize path may take a few days to implement.
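To illustrate what the "consistent matmul" restriction means in practice (the helper below is hypothetical, not part of BitBLAS): only configurations where the activation and weight datatypes match are covered by the fix, while mixed-precision dequantize configs are not yet.

```python
# Illustrative helper (not BitBLAS code) for the restriction described above:
# the fix covers only "consistent" matmuls where A and W share a datatype.

def is_supported_by_fix(a_dtype: str, w_dtype: str) -> bool:
    """True if this config falls under the PR #282 fix as described."""
    return a_dtype == w_dtype

# float16 x float16 is a consistent matmul; float16 x uint4 needs the
# dequantize path, which is the part still to be implemented.
```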
