Sergei-Lebedev
Contributor

What

CL HIER should report the global status on team creation.

Why ?

The selection table may differ between ranks if each rank considers only its local status.
Internal issue: https://redmine.mellanox.com/issues/3336577

How ?

Perform a service team allreduce at the end of CL HIER team creation to determine the global team status.
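A minimal sketch of the idea, not the code in this PR: it simulates the reduction in a single process instead of calling the UCC service collective API. It assumes that success is 0 and errors are negative, so a MIN reduction yields the same status on every rank. The names `sketch_status_t` and `sketch_global_status` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes for illustration: 0 is success, negative is an error. */
typedef enum {
    SKETCH_OK                =  0,
    SKETCH_ERR_NO_MEMORY     = -1,
    SKETCH_ERR_NOT_SUPPORTED = -2
} sketch_status_t;

/* Simulates a MIN allreduce over the per-rank local statuses.
 * In a real run every rank would contribute one value and receive this result,
 * so all ranks agree on whether the team was created. */
static sketch_status_t sketch_global_status(const sketch_status_t *local,
                                            size_t n_ranks)
{
    assert(local != NULL && n_ranks > 0);
    sketch_status_t global = local[0];
    for (size_t i = 1; i < n_ranks; i++) {
        if (local[i] < global) {
            global = local[i]; /* the most negative status wins */
        }
    }
    return global;
}
```

With this convention, a single failing rank makes every rank see the failure, so all ranks build the same selection table.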

@vspetrov
Collaborator

It is probably worth moving the team-level allreduce logic to the core: to the very end of ucc_team_create_test (after all CLs are created). That way there will always be just one "status exchange allreduce" at the end of team creation, and the CL statuses would be part of it. If at some point we add more info to exchange (synchronize) upon team creation, we can piggy-back it there as well. Currently, for example, CL/BASIC may also need to sync which TLs are created. Then both CLs could do it in just one allreduce.

Does that make sense?
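A rough sketch of the single status exchange, simulated in one process rather than written against the real UCC core: each component owns one slot in a small vector, and one element-wise MIN reduction over that vector replaces a separate allreduce per CL. The slot layout and the name `sketch_exchange` are assumptions for illustration only, as is the convention that 0 means success and negative values mean errors.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot layout: one entry per component that wants to synchronize. */
enum { SLOT_CL_HIER = 0, SLOT_CL_BASIC = 1, SLOT_COUNT = 2 };

/* Simulates one allreduce(MIN) of SLOT_COUNT integers.
 * `local` holds n_ranks contiguous vectors of SLOT_COUNT entries each;
 * `global` receives the element-wise minimum that every rank would see. */
static void sketch_exchange(const int *local, size_t n_ranks,
                            int global[SLOT_COUNT])
{
    assert(local != NULL && n_ranks > 0);
    for (int s = 0; s < SLOT_COUNT; s++) {
        global[s] = local[s]; /* start from rank 0 */
        for (size_t r = 1; r < n_ranks; r++) {
            int v = local[r * SLOT_COUNT + s];
            if (v < global[s]) {
                global[s] = v;
            }
        }
    }
}
```

Adding another item to synchronize then means adding a slot, not another collective.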

@swx-jenkins3

Can one of the admins verify this patch?

4 participants