[QST] About how to create nested make_tiled_copy #2048
Comments
Formatted, you have

TiledCopy copyA = make_tiled_copy(Copy_Atom<UniversalCopy<uint128_t>, TA>{},
                                  make_layout(make_shape (Int<4>{}, (Int<8>{}, Int<4>{})),
                                              make_stride(Int<4>{}, Int<1>{})),   // Thr layout 4x8x4 m-major
                                  make_layout(make_shape(Int<1>{}, Int<8>{})));   // Val layout 1x8 m-major

which is an unfortunate C++ism, as (Int<8>{}, Int<4>{}) is a comma-operator expression that evaluates to just Int<4>{}. You need the extra make_shape (and a matching nested make_stride):

TiledCopy copyA = make_tiled_copy(Copy_Atom<UniversalCopy<uint128_t>, TA>{},
                                  make_layout(make_shape (Int<4>{}, make_shape (Int<8>{}, Int< 4>{})),
                                              make_stride(Int<8>{}, make_stride(Int<1>{}, Int<32>{}))),   // Thr layout 4x8x4 k-major interleave
                                  make_layout(make_shape(Int<1>{}, Int<8>{})));                           // Val layout 1x8 k-major

or more compactly:

TiledCopy copyA = make_tiled_copy(Copy_Atom<UniversalCopy<uint128_t>, TA>{},
                                  Layout<Shape <_4,Shape <_8, _4>>,
                                         Stride<_8,Stride<_1,_32>>>{},   // Thr layout 4x8x4 k-major interleave
                                  Layout<Shape<_1,_8>>{});               // Val layout 1x8 k-major
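To make the comma-operator point concrete, here is a minimal sketch in plain C++ that does not use CUTLASS at all. The names toy::Int, toy::make_shape and toy::SizeOf are hypothetical stand-ins invented for this illustration (they only mimic the calling pattern of the CuTe helpers), and the sketch assumes a C++11 or later compiler.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>

namespace toy {
// Hypothetical stand-in for a compile-time integer such as Int<N>.
template <int N> struct Int { static constexpr int value = N; };

// Hypothetical stand-in for make_shape: declaration only, used inside decltype.
template <class... Ts> std::tuple<Ts...> make_shape(Ts...);

// Product of all leaf extents of a (possibly nested) toy shape type.
template <class T> struct SizeOf;
template <int N> struct SizeOf<Int<N>> { static constexpr int value = N; };
template <> struct SizeOf<std::tuple<>> { static constexpr int value = 1; };
template <class T, class... Ts> struct SizeOf<std::tuple<T, Ts...>> {
  static constexpr int value = SizeOf<T>::value * SizeOf<std::tuple<Ts...>>::value;
};
}  // namespace toy

// A parenthesized "a, b" is the built-in comma operator: a is discarded, b is the result.
using CommaResult = decltype((toy::Int<8>{}, toy::Int<4>{}));
static_assert(std::is_same<CommaResult, toy::Int<4>>::value,
              "the comma expression has the type of its right operand");

// Shape as written in the question: the inner parentheses collapse to Int<4>.
using FlatShape = decltype(toy::make_shape(toy::Int<4>{}, (toy::Int<8>{}, toy::Int<4>{})));
// Shape with the nested call: three leaf extents grouped as (4, (8, 4)).
using NestedShape =
    decltype(toy::make_shape(toy::Int<4>{}, toy::make_shape(toy::Int<8>{}, toy::Int<4>{})));

constexpr int flat_size() { return toy::SizeOf<FlatShape>::value; }
constexpr int nested_size() { return toy::SizeOf<NestedShape>::value; }

// Runtime mirror of the sizes, convenient to call from main() or a test.
inline void check_sizes() {
  assert(flat_size() == 16);     // 4 * 4: the 8 was silently dropped
  assert(nested_size() == 128);  // 4 * 8 * 4
}
```

Under these assumptions the flat version has size 16, which matches the size reported in the question, while the nested version has the intended 128 entries.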
@ccecka Thanks for your reply. I think I see what the problem is, but I would like to know under what circumstances a comma acts as the comma operator in CUTLASS, because I normally think of commas as delimiters, such as the separators between variables. Second, why do the strides you wrote look so strange? I wanted the 4x8x4 thread layout to have strides 32, 4, 1. Thank you very much.
The comma operator is part of C++, and CUTLASS does not change it. You can set the strides to whatever you like, of course.
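As a rough illustration of the stride question, the sketch below treats the thread layout as an inner product of a coordinate (i, (j, k)) with the strides, which is how a shape/stride layout maps coordinates to indices. It is plain C++ without CUTLASS; the helper names thread_index and covers_all_threads are hypothetical and exist only for this example.

```cpp
#include <array>
#include <cassert>

// Thread index of coordinate (i, (j, k)) in a 4 x (8 x 4) shape:
// the inner product of the coordinate with the strides (s0, (s1, s2)).
constexpr int thread_index(int i, int j, int k, int s0, int s1, int s2) {
  return i * s0 + j * s1 + k * s2;
}

// Returns true when the strides assign each of the 128 thread ids
// to exactly one coordinate of the 4 x (8 x 4) shape.
inline bool covers_all_threads(int s0, int s1, int s2) {
  std::array<int, 128> hits{};  // zero-initialized hit counters
  for (int i = 0; i < 4; ++i) {
    for (int j = 0; j < 8; ++j) {
      for (int k = 0; k < 4; ++k) {
        const int t = thread_index(i, j, k, s0, s1, s2);
        if (t < 0 || t >= 128) return false;  // index outside the thread range
        ++hits[t];
      }
    }
  }
  for (int h : hits) {
    if (h != 1) return false;  // some thread id was skipped or used twice
  }
  return true;
}

// Both stride choices discussed in the thread are valid one-to-one mappings.
inline void check_strides() {
  assert(covers_all_threads(8, 1, 32));  // strides from the reply: 8, (1, 32)
  assert(covers_all_threads(32, 4, 1));  // strides from the question: 32, (4, 1)
}
```

Both choices use all 128 threads exactly once. They differ only in which coordinate is adjacent in thread index: with 8, (1, 32) neighboring threads step along the mode of extent 8, and with 32, (4, 1) they step along the last mode of extent 4.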
Thanks for your help. I think I understand the comma operator now.
What is your question?
TiledCopy copyA = make_tiled_copy(Copy_Atom<UniversalCopy<uint128_t>, TA>{},
                                  make_layout(make_shape(Int<4>{}, (Int<8>{}, Int<4>{})),
                                              make_stride(Int<4>{}, Int<1>{})),   // Thr layout 4x8x4 m-major
                                  make_layout(make_shape(Int<1>{}, Int<8>{})));   // Val layout 1x8 m-major
I want to create a 4x8x4 thread layout, but printing copyA shows a size of 16.
Please help me.