libraries getting a sub-communicator cannot reliably use yogrt #14

Open
ofaaland opened this issue Jul 19, 2023 · 1 comment

Comments

@ofaaland
Contributor

Tom writes:

When writing a library rather than an application, we only get a sub-communicator, so we don't know if we have rank 0 of MPI_COMM_WORLD or not. If we don't, we can't get the time remaining. If the restriction is to avoid n simultaneous queries to the scheduler, perhaps a request from rank 0 of any subcommunicator could get the right value.

The reasons for the current restriction that only rank 0 can call yogrt_remaining() and get the remaining time are described in yogrt_remaining(3).

Since not all resource managers have the same limitations on their ability to handle such queries, and the other concerns could be managed by library or application authors, it may be reasonable to relax this restriction for library authors.

@tepperly
Member

In the context of a library that may have a sub-communicator, it seems like one must do something like the following:

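   // Only rank 0 of MPI_COMM_WORLD gets the real value; see yogrt_remaining(3).
   const int remaining = yogrt_remaining();
   int maxRemaining = 0;
   // Share the largest value reported by any rank in the sub-communicator.
   CHECK_MPI(MPI_Allreduce(&remaining, &maxRemaining, 1, MPI_INT, MPI_MAX, myComm));
   // If no rank reported a non-negative value, fall back to the largest int.
   if (maxRemaining < 0) { maxRemaining = ::std::numeric_limits<int>::max(); }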

This works only if one of the ranks in myComm happens to be rank 0 of MPI_COMM_WORLD. If none is, and every rank reports a negative value, the fallback to the largest int applies.
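The reduction and fallback can be checked without MPI or libyogrt. The following is a minimal sketch, not part of yogrt: `combineRemaining` is a hypothetical helper that stands in for the `MPI_Allreduce` with `MPI_MAX` plus the fallback, and the vector holds the value each rank of the sub-communicator would have received from `yogrt_remaining()`. It assumes, as the fallback in the snippet implies, that a rank without a usable answer reports a negative value.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Hypothetical helper: mimics MPI_Allreduce(..., MPI_MAX, myComm) followed by
// the "negative means no answer" fallback from the snippet above.
// perRankRemaining[i] is what rank i of the sub-communicator got back from
// yogrt_remaining(). Assumption: ranks without a usable answer report < 0.
int combineRemaining(const std::vector<int>& perRankRemaining) {
    int maxRemaining = std::numeric_limits<int>::min();
    for (int value : perRankRemaining) {
        maxRemaining = std::max(maxRemaining, value);
    }
    // No rank had a non-negative value: fall back to the largest int.
    if (maxRemaining < 0) {
        maxRemaining = std::numeric_limits<int>::max();
    }
    return maxRemaining;
}
```

If the sub-communicator includes the one rank that got a real value, every rank ends up with that value. If it does not, every rank ends up with the largest int, so the library cannot tell "plenty of time left" from "could not ask".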
