What is a good layout for a problem with dimensions N = image size (512 * 512 or 1024 * 1024, for example), M = 64, K = 4? #332
Hi, I have a GEMM problem with dimensions N = image size (512 * 512 or 1024 * 1024, for example), M = 64, K = 4, and I'm trying to set up CUTLASS to work on it (targeting the Turing architecture). I cannot find a layout that works with such a small K = 4 in mixed precision; every configuration I try results in a memory misalignment error. To provide more context, M is the size of a multilayer perceptron hidden layer and K is the size of the input layer. My goal is to run inference on each pixel of the image.
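To make the alignment issue concrete, here is a minimal sketch in plain Python (it is not CUTLASS code, and the helper names are hypothetical). It assumes fp16 operands (2 bytes per element), which the post does not state, an operand stored with K as the contiguous dimension so that the leading dimension equals K, and a 16-byte (128-bit) maximum vector access, which as far as I know is what the default half-precision tensor-op kernels expect (alignment of 8 elements). Under those assumptions, K = 4 only permits 4-element (64-bit) accesses unless K is zero-padded to 8.

```python
def max_vector_elems(leading_dim, elem_bytes, max_vector_bytes=16):
    """Largest power-of-two number of elements per vectorized access such that
    every row start stays aligned, assuming the base pointer itself is aligned.

    Hypothetical helper for illustration; not part of CUTLASS.
    """
    elems = max_vector_bytes // elem_bytes
    # Halve the vector width until it evenly divides the leading dimension.
    while elems > 1 and leading_dim % elems != 0:
        elems //= 2
    return elems


def pad_to_multiple(k, multiple):
    """Round k up to the next multiple (e.g. to zero-pad the K dimension)."""
    return -(-k // multiple) * multiple


K = 4
FP16_BYTES = 2  # assumption: half-precision inputs

print(max_vector_elems(K, FP16_BYTES))  # 4 -> only 64-bit accesses are aligned
print(pad_to_multiple(K, 8))            # 8 -> padded K that allows 128-bit accesses
```

If this reasoning applies, the two options would be to pick a kernel instantiated with an operand alignment of 4 elements or fewer, or to zero-pad the input features from K = 4 to K = 8, which leaves the product unchanged.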
Replies: 1 comment 3 replies
What is your data type? You can use the CUTLASS profiler to try all the possible kernels and pick the best one: https://github.com/NVIDIA/cutlass/blob/master/media/docs/profiler.md