It seems RWKV-v4 does not use the 1-D depthwise kernel you developed earlier. Is the 1-D depthwise CUDA kernel in this repo still a critical operator for RWKV?
I would like to contribute to this project and want to check which CUDA kernel I should work on: the code in the 1-D depthwise folder, or the code in the WKV folders of this repo?
I see that the exponential function is used in the WKV kernel. That operator costs many FLOPs, which suggests the WKV kernel has a high FLOP/byte ratio (arithmetic intensity). If so, I believe there is no need to optimize the WKV kernel further, because it should already be compute-bound and able to use the GPU's full floating-point throughput. Please let me know if I am wrong, or if your test results indicate that the WKV kernel cannot make use of the full power of the GPU.
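To make the FLOP/byte argument checkable, here is a minimal roofline-style sketch in Python. Everything numeric in it is an assumption for illustration: the per-element FLOP and byte counts are guesses, not measurements of the WKV kernel, and the peak throughput and bandwidth are placeholders rather than the specs of any particular GPU. The helper names `arithmetic_intensity` and `attainable_flops` are hypothetical.

```python
def arithmetic_intensity(flops_per_elem: float, bytes_per_elem: float) -> float:
    """FLOPs performed per byte moved to or from global memory."""
    return flops_per_elem / bytes_per_elem


def attainable_flops(intensity: float, peak_flops: float, peak_bandwidth: float) -> float:
    """Roofline bound: the lower of the compute peak and bandwidth * intensity."""
    return min(peak_flops, peak_bandwidth * intensity)


# Hypothetical numbers for illustration only (not measured from RWKV).
FLOPS_PER_ELEM = 40.0    # assumed cost per (batch, time, channel) element, incl. exp
BYTES_PER_ELEM = 12.0    # assumed: read k and v, write y, all as float32
PEAK_FLOPS = 10e12       # placeholder GPU: 10 TFLOP/s FP32
PEAK_BANDWIDTH = 500e9   # placeholder GPU: 500 GB/s

ai = arithmetic_intensity(FLOPS_PER_ELEM, BYTES_PER_ELEM)
balance = PEAK_FLOPS / PEAK_BANDWIDTH  # FLOP/byte needed to become compute-bound
bound = attainable_flops(ai, PEAK_FLOPS, PEAK_BANDWIDTH)

print(f"intensity = {ai:.2f} FLOP/B, machine balance = {balance:.1f} FLOP/B")
print("compute-bound" if ai >= balance else "memory-bound",
      f"(attainable = {bound / 1e12:.2f} TFLOP/s)")
```

The kernel is only compute-bound if its intensity reaches the machine balance, so the conclusion depends entirely on the real per-element counts and the real hardware numbers, which would have to come from profiling.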