
Device-side grouped GEMM #1620

Closed Answered by thakkarV
osayamenja asked this question in Q&A
Jul 7, 2024 · 1 comment · 6 replies
Yes, that example ultimately uses this collective, which you can use directly in your device-side code:

https://github.com/NVIDIA/cutlass/blob/main/include/cutlass/gemm/collective/sm90_mma_array_tma_gmma_ss_warpspecialized.hpp
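For readers new to the term, a grouped GEMM is a set of independent GEMMs whose shapes may differ from one problem to the next. The sketch below is a plain C++ reference for those semantics only. It does not use CUTLASS or the collective linked above, and `GemmProblem` and `grouped_gemm_reference` are hypothetical names introduced for illustration. Row-major storage is also an assumption.

```cpp
#include <vector>

// Hypothetical descriptor for one GEMM in the group (not a CUTLASS type).
struct GemmProblem {
    int m, n, k;      // shape: C (m x n) = A (m x k) * B (k x n)
    const float* a;   // row-major, m x k
    const float* b;   // row-major, k x n
    float* c;         // row-major, m x n (overwritten)
};

// Reference semantics: every problem in the group is computed independently,
// each with its own shape and its own operand pointers.
inline void grouped_gemm_reference(const std::vector<GemmProblem>& group) {
    for (const GemmProblem& p : group) {
        for (int i = 0; i < p.m; ++i) {
            for (int j = 0; j < p.n; ++j) {
                float acc = 0.0f;
                for (int l = 0; l < p.k; ++l) {
                    acc += p.a[i * p.k + l] * p.b[l * p.n + j];
                }
                p.c[i * p.n + j] = acc;
            }
        }
    }
}
```

A reference like this can be handy for checking the output of a device-side implementation on small problem sizes.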

thakkarV Jul 8, 2024
Collaborator

thakkarV Jul 9, 2024
Collaborator

Answer selected by osayamenja