【QST】Some questions about example 13 two tensor op fusion. #1282

Answered by hwu36
tianyan01 asked this question in Q&A
If the problem size N is larger than the threadblock tile size N, different threadblocks would need to communicate with each other, which is why this case is not supported.

To relax this restriction you need to do split-N, as discussed in the second half of https://www.nvidia.com/en-us/on-demand/session/gtcspring22-s41606/ @lb1244206405
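To make the restriction concrete, here is a minimal host-side sketch of the check described above. The helper names `fits_in_one_tile_n` and `tiles_along_n` are hypothetical and are not part of CUTLASS. It only counts how many threadblock tiles a given N would need, and does not implement the split-N scheme from the talk.

```cpp
#include <cassert>

// Hypothetical helpers for illustration only; these are not CUTLASS APIs.

// True when a single threadblock tile spans the whole N extent of the
// problem, so no threadblock needs data produced by another threadblock.
constexpr bool fits_in_one_tile_n(int problem_n, int threadblock_n) {
    return problem_n <= threadblock_n;
}

// Number of threadblock tiles needed to cover the N extent (ceiling division).
// A result greater than 1 means several threadblocks share one output row
// block, which is the case the answer above describes as unsupported.
inline int tiles_along_n(int problem_n, int threadblock_n) {
    assert(problem_n > 0 && threadblock_n > 0);
    return (problem_n + threadblock_n - 1) / threadblock_n;
}
```

For example, with a threadblock tile N of 64, a problem N of 64 fits in one tile, while a problem N of 128 needs two tiles and so falls outside the supported case.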

Answer selected by tianyan01