-
What do you mean? Are you setting the launch bound? The number of threads per threadblock is determined by the warp tile size and the threadblock tile size. The grid size is determined by the problem size and the threadblock tile size. The launch bound is a hint to the compiler; it has nothing to do with the tiling.
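To make the relationship concrete, here is a rough sketch of the arithmetic in plain C++. `Shape3`, `Grid2`, `threads_per_block`, and `grid_size` are hypothetical names made up for illustration, not CUTLASS classes or functions, and the grid calculation assumes one threadblock per output tile with no split-K and no threadblock swizzling.

```cpp
#include <cassert>

// Hypothetical helper types for illustration only; not part of CUTLASS.
struct Shape3 { int m, n, k; };  // tile extents along M, N, K
struct Grid2  { int x, y; };     // threadblocks along M and N

constexpr int kWarpSize = 32;

// Threads per threadblock = (warp tiles per threadblock tile) * 32.
inline int threads_per_block(Shape3 tb, Shape3 warp) {
    // The threadblock tile must be an exact multiple of the warp tile.
    assert(tb.m % warp.m == 0 && tb.n % warp.n == 0 && tb.k % warp.k == 0);
    int warps = (tb.m / warp.m) * (tb.n / warp.n) * (tb.k / warp.k);
    return warps * kWarpSize;
}

// Assumption: one threadblock per output tile, rounded up at the edges,
// with no split-K and no swizzling.
inline Grid2 grid_size(int M, int N, Shape3 tb) {
    return Grid2{(M + tb.m - 1) / tb.m, (N + tb.n - 1) / tb.n};
}
```

With a 128x128x32 threadblock tile and a 64x64x32 warp tile this gives 4 warps, i.e. 128 threads per threadblock, and a 1024x1024 output gives an 8x8 grid. So under these assumptions, shrinking the grid for a fixed problem means choosing a larger threadblock tile, and the thread count follows from the warp tile you pair with it.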
-
I want to leave some SMs idle. To do that, I need to reduce the grid size to a value lower than the total number of SMs.
-
Does CUTLASS GEMM provide a configurable grid size and threadblock tile size?
-
Hi,
I need to change both the grid size and the threads per block for the GEMM kernel. I have not been able to do this by changing the configuration.
For example, with the grid size set to 8 and the threads per block set to 32, I would expect the GEMM to decompose the whole operation into many small GEMMs.