I recently wanted to integrate some CUTLASS operators into PyTorch, and I found that CUDAExtension lets us pass compile options. Other projects usually set:
"--ptxas-options=-O2",
"--ptxas-options=-allow-expensive-optimizations=true",
In my GEMM test, turning off these two options gave better performance, but I could not find any documentation about ptxas. I would like to know what these options mean, and I hope the CUTLASS development team can give me some suggestions! :D
Answered by hwu36 on Jan 5, 2024
Replies: 1 comment
When you compile CUTLASS, just use CUTLASS's nvcc flags. That is what the nvcc team uses to optimize CUTLASS.
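For readers following this thread, here is a minimal sketch of how nvcc flags reach the compiler through `CUDAExtension`, with the two ptxas options from the question behind a switch so both builds can be benchmarked. The names `nvcc_flags`, `make_extension`, `my_gemm_ext`, and `my_gemm.cu`, and the `-O3` base flag, are hypothetical placeholders and not taken from CUTLASS or any real project. As far as the nvcc documentation describes them, `-O` sets the ptxas optimization level (default 3) and `-allow-expensive-optimizations` is already enabled by default at level 2 and above, so these two flags may lower the optimization level rather than raise it. Treat that reading as an assumption to check against your CUDA version.

```python
from typing import List, Optional

# The two ptxas options quoted in the question.
PTXAS_OPTS = [
    "--ptxas-options=-O2",
    "--ptxas-options=-allow-expensive-optimizations=true",
]


def nvcc_flags(use_ptxas_opts: bool, base: Optional[List[str]] = None) -> List[str]:
    """Return the nvcc flag list, optionally adding the ptxas options.

    `base` defaults to ["-O3"], which is an assumed placeholder, not
    CUTLASS's actual flag set.
    """
    flags = list(base) if base is not None else ["-O3"]
    if use_ptxas_opts:
        flags += PTXAS_OPTS
    return flags


def make_extension(use_ptxas_opts: bool):
    """Build a CUDAExtension for a hypothetical GEMM source file."""
    # Imported lazily so the flag helper can be used without PyTorch installed.
    from torch.utils.cpp_extension import CUDAExtension

    return CUDAExtension(
        name="my_gemm_ext",      # hypothetical extension name
        sources=["my_gemm.cu"],  # hypothetical source file
        extra_compile_args={
            "cxx": ["-O3"],
            "nvcc": nvcc_flags(use_ptxas_opts),
        },
    )
```

In a `setup.py`, the result of `make_extension(False)` or `make_extension(True)` would go into `ext_modules`, with `cmdclass={"build_ext": BuildExtension}`, so the same kernel can be timed under each flag set.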
Answer selected by MARD1NO