Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for merging teams in DASH #137

Open
fmoessbauer opened this issue Nov 20, 2016 · 5 comments
Open

Add support for merging teams in DASH #137

fmoessbauer opened this issue Nov 20, 2016 · 5 comments

Comments

@fmoessbauer
Copy link
Member

fmoessbauer commented Nov 20, 2016

Currently a team split is a non reversible operation. Together with the specification (or current implementation) that each unit can only be part of a single team hierarchy, team splits cannot be tested in the unit tests. Hence I set the bug label.

To fix this I see the following possibilities:

  • allow one unit in multiple teams (requires that the team hierarchy is not a single tree)
  • provide a merge or join operation

As far as I see, DART 3.2 already provides an interface therefor.

@fuerlinger
Copy link
Contributor

@fmoessbauer Which functionality would you like to test exactly? The team split is not reversible in the sense that the created team can not be destroyed explicitly, but on the other hand the original team also doesn't go away and can still be used.

I feel that team related operations can actually be very tricky to reason about and therefore I'm hesitant to include an operation such as a join or merge unless we come up with an actual use-case in the context of an application. As long as that compelling use case for merge/join does not exist I'm in favor of a usage model that is as restricted as possible. Currently that restricted model is: The only way to create a new team is to start with an existing team and build a subset.

As for multiple team hierarchies - yes, I think that is a viable model with cloned teams. Although this also requires some brainstorming about the consequences too...

@fmoessbauer
Copy link
Member Author

Currently each Team / sub team can only be split once. However in the unit tests we have multiple tests which split the dash::Team::All(). As all unit tests are executed in one DASH run, the ordering of the tests matters.The second testcase that splits a team will always fail.

However each test case should not depend on the env created by previous tests. The problem can easily be verified by running the following tests isolated.

  • Array*
  • TeamTest*
    Each testgroup itself passes, but if run together the following exception is thrown:
[    0 ERROR ] [  7065 ] Team.cc                  :62   | dash::exception::InvalidArgument             | [ Unit 0 ] child of team 0 already set to 1, cannot set to 3

Due to this behavior, the CI of PR #135 fails.

@fuerlinger
Copy link
Contributor

Here is an implementation sketch:

Implement a method clone() that clones the current team, i.e., creates a copy of a team on the C++ side for the same underlying DART team.

  • add a boolean variable _is_clone to differentiate the original from the clone(s).
  • actual de-allocation of DART resources should only happen in the destructor of originals. Cloned teams leave the DART resources alone.
  • implement a method clone() that clones the current team:
    • if the current team has a parent, call clone() on the parent first, let's call the new cloned parent tpar.
    • create a new team t, set the is_clone flag to true
    • set tpar's child pointer to t (if the tpar exists)
    • set t's parent pointer to the tpar (if the tpar exists)
    • in the originals we might then want to keep track of created clones
  • update the split() function:
    • when calling split() on a team that already has a child team, call clone() first and then invoke split() on newly created clone.

The recommended way of using teams in the unit tests (and elsewhere) would then be :

Team &t = dash::Team::All().clone();
team &split = t.split(2);

@fuchsto
Copy link
Member

fuchsto commented Nov 23, 2016

Also, from issue #140: All teams should be destroyed in dart_finalize.

@fuchsto fuchsto modified the milestones: dash-0.4.0, dash-0.3.0 Feb 22, 2017
@devreal
Copy link
Member

devreal commented May 2, 2019

Is this going anywhere?

@devreal devreal modified the milestones: dash-0.4.0, dash-0.5.0 May 2, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants