Programmatic dependent launch (PDL) allows subsequent kernels to skip a device-wide synchronization (executed on the host) and instead synchronize on the device at a source location specified by the programmer. This can eliminate bubbles in a stream that occur while one kernel is finishing and the next kernel has not yet ramped up.
We should evaluate where we can use this feature in CUB and then add it wherever it shows a benefit.
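For reference, here is a minimal sketch of what PDL looks like at the CUDA runtime level. It is not taken from CUB; the kernel names `producer` and `consumer`, the `CHECK` macro, and the problem size are hypothetical and only illustrate the mechanism. It assumes a device of compute capability 9.0 or newer and compilation for that architecture (for example with `nvcc -arch=sm_90`); on older targets the device-side calls are compiled out and the launch attribute is not set, so the two kernels fall back to ordinary stream ordering.

```cuda
#include <cuda_runtime.h>
#include <cstdio>
#include <cstdlib>
#include <vector>

// Hypothetical helper: abort on any CUDA runtime error.
#define CHECK(call)                                                  \
  do {                                                               \
    cudaError_t e_ = (call);                                         \
    if (e_ != cudaSuccess) {                                         \
      std::fprintf(stderr, "%s: %s\n", #call, cudaGetErrorString(e_)); \
      std::exit(1);                                                  \
    }                                                                \
  } while (0)

// Primary kernel: writes its results, then signals that dependents may launch.
__global__ void producer(float* data, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) data[i] = 2.0f * i;
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 900
  cudaTriggerProgrammaticLaunchCompletion();
#endif
}

// Dependent kernel: everything before the sync may overlap with the producer.
__global__ void consumer(const float* in, float* out, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;  // independent prologue
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 900
  // Blocks until the producer grid has completed and its writes are visible.
  cudaGridDependencySynchronize();
#endif
  if (i < n) out[i] = in[i] + 1.0f;  // dependent work
}

int main() {
  const int n = 1 << 20, block = 256, grid = (n + block - 1) / block;

  cudaDeviceProp prop;
  CHECK(cudaGetDeviceProperties(&prop, 0));
  const bool use_pdl = prop.major >= 9;  // only opt in on supporting hardware

  float *a = nullptr, *b = nullptr;
  CHECK(cudaMalloc(&a, n * sizeof(float)));
  CHECK(cudaMalloc(&b, n * sizeof(float)));
  cudaStream_t stream;
  CHECK(cudaStreamCreate(&stream));

  producer<<<grid, block, 0, stream>>>(a, n);
  CHECK(cudaGetLastError());

  // Opt the dependent launch into programmatic stream serialization.
  cudaLaunchAttribute attr[1];
  attr[0].id = cudaLaunchAttributeProgrammaticStreamSerialization;
  attr[0].val.programmaticStreamSerializationAllowed = 1;

  cudaLaunchConfig_t cfg = {};
  cfg.gridDim = dim3(grid);
  cfg.blockDim = dim3(block);
  cfg.dynamicSmemBytes = 0;
  cfg.stream = stream;
  cfg.attrs = attr;
  cfg.numAttrs = use_pdl ? 1 : 0;
  CHECK(cudaLaunchKernelEx(&cfg, consumer, a, b, n));

  CHECK(cudaStreamSynchronize(stream));

  // Verify on the host that the consumer observed the producer's output.
  std::vector<float> h(n);
  CHECK(cudaMemcpy(h.data(), b, n * sizeof(float), cudaMemcpyDeviceToHost));
  int mismatches = 0;
  for (int i = 0; i < n; ++i)
    if (h[i] != 2.0f * i + 1.0f) ++mismatches;
  std::printf("PDL enabled: %d, mismatches: %d\n", use_pdl ? 1 : 0, mismatches);

  CHECK(cudaFree(a));
  CHECK(cudaFree(b));
  CHECK(cudaStreamDestroy(stream));
  return mismatches == 0 ? 0 : 1;
}
```

The benefit depends on how much independent work (index computation, shared memory setup, and so on) the dependent kernel can do before `cudaGridDependencySynchronize()`, which is why each CUB algorithm would need to be benchmarked individually.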
bernhardmgruber changed the title from "Use programmatic dependent launch in all CUB algorithms" to "[EPIC] Use programmatic dependent launch in all CUB algorithms" on Dec 10, 2024