Replies: 1 comment
-
A bank conflict does not simply cost one extra cycle; it costs more than that. While the cost is not strictly multiplicative, the conflicting accesses are serialized whenever a bank conflict occurs.
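To make the serialization point concrete, here is a rough sketch of a simplified bank model. It is not CUTLASS's actual layout code: the tile shape, the helper names (`linear_addr`, `swizzled_addr`, `passes_needed`) and the exact XOR formula are assumptions chosen for illustration. It assumes 32 banks of 4 bytes each, 128-bit (16-byte) accesses per thread, and that 8 such threads (128 bytes) are serviced together as one phase. It counts how many serialized passes one phase needs when 8 threads access the same column of 8 consecutive rows, with and without an XOR swizzle.

```python
NUM_BANKS = 32      # assumed: 32 shared memory banks
BANK_BYTES = 4      # assumed: each bank is 4 bytes wide
VEC_BYTES = 16      # one 128-bit access per thread
VECS_PER_ROW = 8    # hypothetical tile: 8 vectors = 128 bytes per row


def banks_touched(byte_addr):
    """Return the set of banks covered by one 16-byte access."""
    first = (byte_addr // BANK_BYTES) % NUM_BANKS
    return {(first + i) % NUM_BANKS for i in range(VEC_BYTES // BANK_BYTES)}


def linear_addr(row, col):
    """Byte address of vector (row, col) in a plain row-major tile."""
    return (row * VECS_PER_ROW + col) * VEC_BYTES


def swizzled_addr(row, col):
    """Same tile, but the column index is XORed with the row index."""
    return (row * VECS_PER_ROW + (col ^ (row % VECS_PER_ROW))) * VEC_BYTES


def passes_needed(addrs):
    """Serialized passes = most distinct accesses that land on a single bank."""
    per_bank = [0] * NUM_BANKS
    for addr in set(addrs):
        for bank in banks_touched(addr):
            per_bank[bank] += 1
    return max(per_bank)


# One phase: 8 threads access column 3 of rows 0..7.
rows = range(8)
col = 3
linear_passes = passes_needed([linear_addr(r, col) for r in rows])
swizzled_passes = passes_needed([swizzled_addr(r, col) for r in rows])
print(linear_passes, swizzled_passes)  # prints "8 1"
```

In this model the unswizzled column access puts all 8 vectors on the same 4 banks, so a single phase is serialized into 8 passes, while the swizzled layout spreads them over all 32 banks and the phase completes in one pass. The conflict cost therefore comes on top of the phases a warp needs anyway; it does not replace them.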
-
In the CUTLASS talk (https://developer.nvidia.com/gtc/2020/video/s21745), an XOR swizzle is used when loading data from global memory (Gmem) to shared memory (smem): it places the data at different positions to avoid bank conflicts. But the whole process still needs 4 phases to finish. If we do not use the swizzle, there will be a 4-way bank conflict, which also takes 4 clock periods. So what is the advantage of the swizzle? Thanks for your answer.