[Question]: WarpSortConfig::PartitioningThreshold=3000, how was this magic number chosen? #606

ZJLi2013 · 2024-09-12T09:12:32Z

Hi ROCm experts,

I wonder how the magic number 3000 was chosen here?

When benchmarking a matrix of shape [m, n] with m < 3000, segmented_radix_sort_impl() never takes the do_partitioning path. That path appears to contain more fine-grained kernels, selected according to the segment counts.
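
A minimal sketch of the dispatch described above, to make the condition concrete. The names `sort_path` and `choose_path` are hypothetical and not taken from rocPRIM, and the `>=` comparison is an assumption about how the threshold is applied.

```cpp
#include <cstddef>

// Hypothetical model of the dispatch decision; this is not rocPRIM code.
enum class sort_path { unpartitioned, partitioned };

// Assumption: the partitioned path is taken once the number of segments
// reaches the threshold (the exact comparison in the library may differ).
constexpr sort_path choose_path(std::size_t segment_count,
                                std::size_t partitioning_threshold = 3000)
{
    return segment_count >= partitioning_threshold ? sort_path::partitioned
                                                   : sort_path::unpartitioned;
}

// For an [m, n] matrix sorted row by row, the segment count is m.
static_assert(choose_path(2048) == sort_path::unpartitioned,
              "m = 2048 stays below the threshold");
static_assert(choose_path(4096) == sort_path::partitioned,
              "m = 4096 reaches the threshold");
```

Under this model, any benchmark with fewer than 3000 rows only ever exercises the unpartitioned path.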

On the other hand, CUDA's CUB can do a segmented sort per row. Should we expect a performance gap here?

Thanks for the guidance.


Snektron · 2024-09-12T12:07:08Z

Hi, these values are determined by our autotuning system. We invoke this on a set of GPUs, which then compiles & benchmarks the algorithms for a range of parameters. A developer-oriented explanation is given here.

If you believe that there is a performance issue there, you have a few options:

You can pass a custom config for that particular operation where you manually set the values.
You can also add a benchmark case in the benchmark for segmented radix sort for your dimensions, and run the tuning yourself.
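
The first option can be sketched roughly as follows. This is a host-only illustration of passing a config as a template argument. `example_sort_config`, `example_default_config` and `segmented_sort` are hypothetical stand-ins, not rocPRIM types or functions, and the real config type for segmented radix sort has more parameters than shown here.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a sort config; this is NOT a rocPRIM type.
template<std::size_t PartitioningThreshold>
struct example_sort_config
{
    static constexpr std::size_t partitioning_threshold = PartitioningThreshold;
};

// Stand-in for the tuned default, using the value from the issue title.
using example_default_config = example_sort_config<3000>;

// Sorts each segment [offsets[i], offsets[i + 1]) of keys in place on the host
// and reports whether the given config would select the partitioned path.
template<class Config = example_default_config>
bool segmented_sort(std::vector<int>& keys, const std::vector<std::size_t>& offsets)
{
    const std::size_t segments = offsets.empty() ? 0 : offsets.size() - 1;
    for(std::size_t i = 0; i < segments; ++i)
    {
        const auto first = keys.begin() + static_cast<std::ptrdiff_t>(offsets[i]);
        const auto last  = keys.begin() + static_cast<std::ptrdiff_t>(offsets[i + 1]);
        std::sort(first, last);
    }
    // Assumption: partitioning is used once the segment count reaches the threshold.
    return segments >= Config::partitioning_threshold;
}
```

Calling `segmented_sort(keys, offsets)` uses the default threshold, while `segmented_sort<example_sort_config<2>>(keys, offsets)` overrides it for that one call only, which is the idea behind passing a custom config for a particular operation.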

ppanchad-amd · 2024-11-25T19:51:31Z

Hi @ZJLi2013. Has your issue been resolved? If so, please close the ticket. Thanks!
