
Is the 1-D depthwise conv still critical for RWKV? #4

Open
Chong-Chen-UNLV opened this issue Oct 21, 2022 · 2 comments

Comments

@Chong-Chen-UNLV

It seems that RWKV-v4 no longer uses the 1-D depthwise kernel you developed earlier. Is the 1-D depthwise CUDA kernel in this repo still a critical operator for RWKV?

I ask because I intend to contribute to this project and want to know which CUDA kernel to work on: should it be the code in the 1-D depthwise folder, or the code in the WKV folders of this repo?
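For context, a 1-D depthwise convolution applies one independent filter per channel, with no mixing across channels. A minimal NumPy sketch of that operation (causal padding and all names are my assumptions for illustration, not taken from this repo):

```python
import numpy as np

def depthwise_conv1d(x, w):
    """x: (C, T) input; w: (C, K) one filter per channel (no cross-channel mixing)."""
    C, T = x.shape
    K = w.shape[1]
    # Causal left-padding, so the output at time t only sees x[:, :t+1].
    xp = np.concatenate([np.zeros((C, K - 1)), x], axis=1)
    out = np.zeros((C, T))
    for t in range(T):
        out[:, t] = np.sum(xp[:, t:t + K] * w, axis=1)
    return out
```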

@BlinkDL (Owner)

BlinkDL commented Oct 22, 2022

The latest RWKV-4 is only using the WKV kernel :)
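For reference, here is a minimal NumPy sketch of the WKV recurrence as given in the RWKV-4 formulation. The function name and this unstabilized form are mine; the actual CUDA kernel additionally tracks a running maximum exponent so the `exp` calls cannot overflow:

```python
import numpy as np

def wkv_naive(w, u, k, v):
    """Naive O(T*C) reference for the RWKV-4 WKV recurrence.

    w: (C,) nonnegative per-channel decay
    u: (C,) per-channel bonus applied only to the current token
    k, v: (T, C) key and value sequences
    """
    T, C = k.shape
    out = np.zeros((T, C))
    num = np.zeros(C)  # running exp-weighted sum of values
    den = np.zeros(C)  # running sum of the exp weights
    for t in range(T):
        # The current token enters with the bonus u instead of the decay.
        e = np.exp(u + k[t])
        out[t] = (num + e * v[t]) / (den + e)
        # Fold token t into the decayed history for later steps.
        num = np.exp(-w) * num + np.exp(k[t]) * v[t]
        den = np.exp(-w) * den + np.exp(k[t])
    return out
```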

@Chong-Chen-UNLV (Author)

> The latest RWKV-4 is only using the WKV kernel :)

I see that the exponential function is used in the WKV kernel. That operator costs many FLOPs per element, which gives the WKV kernel a high FLOP/byte ratio (arithmetic intensity). If so, the kernel is compute-bound, and I believe there is little room to optimize it further, since it should already saturate the GPU's floating-point throughput. Please let me know if I am wrong, or if your measurements indicate that the WKV kernel cannot make full use of the GPU.
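A quick roofline-style sanity check of that reasoning (the peak numbers below are assumptions for an NVIDIA A100, not measurements from this repo):

```python
# A kernel is compute-bound when its arithmetic intensity (FLOP/byte)
# exceeds the machine balance point of the GPU it runs on.
peak_flops = 19.5e12  # assumed: A100 FP32 peak, FLOP/s
peak_bw = 1.555e12    # assumed: A100 HBM2e bandwidth, byte/s
balance = peak_flops / peak_bw
print(f"machine balance: {balance:.1f} FLOP/byte")  # ~12.5
# If profiling shows the WKV kernel's measured intensity above this
# value, it is compute-bound and memory optimizations buy little;
# below it, the opposite holds.
```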
