-
Hi,
-
Greetings and thanks for your interest in CUTLASS! I'll be happy to add this request to our task list. The next public release of CUTLASS will feature improvements in the documentation.
-
@zhujj2008 Specifically regarding the layout (20,2):(16,4) o (4,5):(1,4), it's helpful to notice that (4,5):(1,4) is the compact column-major layout for the shape (4,5). Thus, as a function of a 1-D index, it's equivalent to 20:1.
20:1 means "take the first 20 consecutive elements of the layout." ("Consecutive" comes from the stride 1.) That's just the first column of (20,2):(16,4), that is, 20:16 (20 elements, with stride 16).
The shape of a composition A o B is the shape of B. Thus, we need to reshape 20:16 to have the required shape (4,5). The stride between consecutive elements in a column is 16. Each column holds 4 elements, so the stride between consecutive elements in a row is 4 * 16 = 64. Thus, the answer is (4,5):(16,64).
We get exactly this result if we use compile-time shapes and strides. The following code prints "(_4,_5):(_16,_64)".

    // Fragment: assumes the CuTe headers are included and this sits inside a function body.
    using namespace cute;
    auto a = make_layout(make_shape(Int<20>{}, _2{}), make_stride(_16{}, _4{}));
    auto b = make_layout(make_shape(_4{}, _5{}), make_stride(_1{}, _4{}));
    auto c = composition(a, b);
    printf("\n");
    print(c);

Results look different (but are the same mathematically) if we use run-time integers. The following prints "((4,1),(5,1)):((16,4),(64,4))".

    // Same fragment as above, but with run-time (dynamic) integers.
    auto a = make_layout(make_shape(20, 2), make_stride(16, 4));
    auto b = make_layout(make_shape(4, 5), make_stride(1, 4));
    auto c = composition(a, b);
    printf("\n");
    print(c);

((4,1),(5,1)):((16,4),(64,4)) is effectively the same layout as (4,5):(16,64), because a mode of extent 1 only ever has coordinate 0, so the 1s in the shape (and the strides paired with them) don't affect the layout as a mathematical function from one integer to one integer. CuTe chooses not to simplify layout computations with run-time values in them as much as it could, because simplifications involving run-time values have a run-time cost.
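For anyone who wants to check the claims above without building CUTLASS, here is a minimal standalone sketch in plain C++. It does not use CuTe at all. `FlatLayout`, `same_function`, and `is_composition` are hypothetical names made up for this illustration, and nested layouts are assumed to be written flattened, so ((4,1),(5,1)):((16,4),(64,4)) becomes shape {4,1,5,1} with stride {16,4,64,4}. It simply evaluates each layout as a function from a 1-D index to an offset, with the leftmost mode varying fastest, and compares the results by brute force.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of CuTe): a flattened layout, where
// shape[m] is the extent of mode m and stride[m] is its stride.
// The leftmost mode varies fastest, matching the column-major convention above.
struct FlatLayout {
    std::vector<int> shape;
    std::vector<int> stride;

    // Number of elements in the layout's domain.
    int size() const {
        int n = 1;
        for (int s : shape) n *= s;
        return n;
    }

    // Map a 1-D index to an offset: peel off one coordinate per mode,
    // leftmost mode first, and accumulate coordinate * stride.
    int operator()(int idx) const {
        int offset = 0;
        for (std::size_t m = 0; m < shape.size(); ++m) {
            offset += (idx % shape[m]) * stride[m];
            idx /= shape[m];
        }
        return offset;
    }
};

// True if x and y have the same size and map every index to the same offset.
bool same_function(const FlatLayout& x, const FlatLayout& y) {
    if (x.size() != y.size()) return false;
    for (int i = 0; i < x.size(); ++i) {
        if (x(i) != y(i)) return false;
    }
    return true;
}

// True if c equals the composition a o b, that is,
// c(i) == a(b(i)) for every index i in the domain of b.
bool is_composition(const FlatLayout& a, const FlatLayout& b, const FlatLayout& c) {
    if (c.size() != b.size()) return false;
    for (int i = 0; i < b.size(); ++i) {
        if (c(i) != a(b(i))) return false;
    }
    return true;
}
```

With a = {20,2}:{16,4} and b = {4,5}:{1,4}, `is_composition(a, b, c)` should hold both for c = {4,5}:{16,64} and for the flattened run-time result {4,1,5,1}:{16,4,64,4}, and `same_function` should confirm that b agrees with 20:1. This only checks equality as integer functions; it says nothing about how CuTe itself computes or simplifies the composition.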