
Simplest gemm example with 3.x APIs #1742

Answered by thakkarV
dukallis asked this question in Q&A
Generally, I am interested in whether it's possible to construct an sgemm or a convolution using the new 3.x Collective, Kernel, and Device APIs, provided that I have specified the underlying CuTe atoms correctly and applied make_tiled_mma and make_tiled_copy to them.

Yes. Please see https://github.com/NVIDIA/cutlass/blob/main/test/unit/gemm/device/default_gemm_configuration.hpp for inspiration. A similar template configuration can be used for Volta/Turing, and it should just work out of the box. We have some of these kernels internally that @ccecka and I may be able to upstream as single-file examples in the future.
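For readers unfamiliar with the layering the question refers to, here is a rough plain-C++ analogy of how a collective, a kernel, and a device-level adapter might nest. This is not CUTLASS or CuTe code: every name in it (`ToyCollectiveMma`, `ToyGemmKernel`, `ToyGemmAdapter`, `toy_sgemm`) is hypothetical, it runs on the CPU, and it assumes row-major float matrices. It only sketches the idea that each layer is a template parameterized on the layer below it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical "collective": computes one TileM x TileN block of C,
// stepping over K in chunks of TileK. Row-major A (MxK), B (KxN), C (MxN).
template <int TileM, int TileN, int TileK>
struct ToyCollectiveMma {
  static constexpr int kTileM = TileM;
  static constexpr int kTileN = TileN;

  void operator()(const float* A, const float* B, float* C,
                  int M, int N, int K, int tile_m, int tile_n) const {
    float acc[TileM][TileN] = {};  // per-tile accumulators, zero-initialized
    const int m0 = tile_m * TileM;
    const int n0 = tile_n * TileN;
    for (int k0 = 0; k0 < K; k0 += TileK)
      for (int i = 0; i < TileM && m0 + i < M; ++i)
        for (int j = 0; j < TileN && n0 + j < N; ++j)
          for (int k = k0; k < k0 + TileK && k < K; ++k)
            acc[i][j] += A[(m0 + i) * K + k] * B[k * N + (n0 + j)];
    // Trivial "epilogue": write the accumulators back, guarding the edges.
    for (int i = 0; i < TileM && m0 + i < M; ++i)
      for (int j = 0; j < TileN && n0 + j < N; ++j)
        C[(m0 + i) * N + (n0 + j)] = acc[i][j];
  }
};

// Hypothetical "kernel": owns the parameter struct and maps one tile
// coordinate onto the collective.
template <class CollectiveMainloop>
struct ToyGemmKernel {
  using Mainloop = CollectiveMainloop;
  struct Params { const float* A; const float* B; float* C; int M, N, K; };

  void operator()(const Params& p, int tile_m, int tile_n) const {
    Mainloop{}(p.A, p.B, p.C, p.M, p.N, p.K, tile_m, tile_n);
  }
};

// Hypothetical "device adapter": derives the tile grid from the problem
// size and invokes the kernel once per tile (a loop stands in for a launch).
template <class GemmKernel>
struct ToyGemmAdapter {
  void run(const typename GemmKernel::Params& p) const {
    constexpr int tm = GemmKernel::Mainloop::kTileM;
    constexpr int tn = GemmKernel::Mainloop::kTileN;
    for (int bm = 0; bm < (p.M + tm - 1) / tm; ++bm)
      for (int bn = 0; bn < (p.N + tn - 1) / tn; ++bn)
        GemmKernel{}(p, bm, bn);
  }
};

// Compose the three layers bottom-up, then wrap them in a helper.
using ToyGemm = ToyGemmAdapter<ToyGemmKernel<ToyCollectiveMma<2, 2, 2>>>;

inline std::vector<float> toy_sgemm(int M, int N, int K,
                                    const std::vector<float>& A,
                                    const std::vector<float>& B) {
  assert(A.size() == static_cast<std::size_t>(M) * K);
  assert(B.size() == static_cast<std::size_t>(K) * N);
  std::vector<float> C(static_cast<std::size_t>(M) * N, 0.0f);
  ToyGemm{}.run({A.data(), B.data(), C.data(), M, N, K});
  return C;
}
```

Calling `toy_sgemm(M, N, K, A, B)` returns the row-major product. For the real types, their template parameters, and the CuTe atoms they expect, the configuration header linked above remains the reference.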

thakkarV Sep 4, 2024
Collaborator

Answer selected by dukallis