Skip to content

[cudax] Prototype reduce function that takes a group#7699

Open
davebayer wants to merge 2 commits intoNVIDIA:mainfrom
davebayer:hier_groups4
Open

[cudax] Prototype reduce function that takes a group#7699
davebayer wants to merge 2 commits intoNVIDIA:mainfrom
davebayer:hier_groups4

Conversation

@davebayer
Copy link
Contributor

@davebayer davebayer commented Feb 17, 2026

Partially implements #7580

@davebayer davebayer self-assigned this Feb 17, 2026
@copy-pr-bot
Copy link
Contributor

copy-pr-bot bot commented Feb 17, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@davebayer davebayer changed the title [cudax] Prototype reduce function that takes a group [cudax] Prototype reduce function that takes a group Feb 17, 2026
@github-project-automation github-project-automation bot moved this to Todo in CCCL Feb 17, 2026
@cccl-authenticator-app cccl-authenticator-app bot moved this from Todo to In Progress in CCCL Feb 17, 2026
@davebayer davebayer marked this pull request as ready for review February 18, 2026 20:16
@davebayer davebayer requested a review from a team as a code owner February 18, 2026 20:16
@davebayer davebayer requested a review from caugonnet February 18, 2026 20:16
@cccl-authenticator-app cccl-authenticator-app bot moved this from In Progress to In Review in CCCL Feb 18, 2026
explicit sum_t() = default;

// todo: would it be nice if this_thread_group would be a concept?
// We could do operator()(cudax::this_thread_group auto group, ...) in C++20
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

__device__ void operator()(const Config& config)
{
// todo: I want to write:
// coop::requires_shared_memory(coop::sum, cudax::this_thread(config), coop::value_type<unsigned>);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is fine. My concern is the potential instances explosion when we have reduce + predefined/user-defined operators

// The problem is that the group holds a reference to the hierarchy, but that's unavailable in constexpr
// context.
constexpr auto thread_required_smem =
coop::required_shared_memory<decltype(cudax::this_thread(config))>(coop::sum, coop::value_type<unsigned>);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be this_thread_t better?

Comment on lines +31 to +35
template <class T>
struct shared_memory_scratch
{
T& ref;
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

side comment. I never liked TempStorage approach for shared mem because it is opaque to the user and conversion could fall in UB. void* + compile-time size would make more sense IMO.

@github-actions
Copy link
Contributor

😬 CI Workflow Results

🟥 Finished in 1h 04m: Pass: 91%/48 | Total: 10h 50m | Max: 30m 47s | Hits: 75%/23531

See results here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In Review

Development

Successfully merging this pull request may close these issues.

2 participants