Launch policies for algorithms #272

Open
devreal opened this issue Feb 10, 2017 · 6 comments
@devreal
Member

devreal commented Feb 10, 2017

At the moment, accesses to global memory of DASH containers through subscript operators are serialized using blocking put operations (dart_put_blocking) in GlobRef, potentially resulting in poor performance for naive users. I understand that this is useful to avoid synchronization problems in the general case, but allowing asynchronous access through subscript operators would provide an easy way of improving performance of remote access to multiple elements while using the same (simple to use) interface.

Hence, we could introduce asynchronous epochs, which have an explicit begin and end, e.g.,

dash::Array<int> arr(100);
arr.start_epoch();
if (dash::myid() == 0) {
  for (int i = 0; i < arr.size(); ++i) {
    arr[i] = i; // not immediately visible
  }
}
arr.end_epoch(); 
// all changes visible now

Within an epoch, GlobRef will issue non-blocking put (+get?) operations and data transfers are guaranteed to complete only after the end of the epoch.
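To make the intended semantics concrete, here is a minimal single-process sketch. It is not DASH code: `EpochArray`, `peek` and `pending` are hypothetical names invented for illustration, and a write queue stands in for non-blocking puts. It only shows that writes issued inside an epoch are deferred and become visible at `end_epoch()`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy model of the proposed epoch semantics. NOT DASH: EpochArray, peek()
// and pending() are hypothetical, and the queue only imitates in-flight puts.
class EpochArray {
public:
  explicit EpochArray(std::size_t n) : data_(n, 0) {}

  // Proxy returned by operator[]; stands in for a global reference.
  class Ref {
  public:
    Ref(EpochArray& arr, std::size_t idx) : arr_(arr), idx_(idx) {}
    Ref& operator=(int v) {
      if (arr_.in_epoch_) {
        arr_.pending_.emplace_back(idx_, v);  // deferred, like a non-blocking put
      } else {
        arr_.data_[idx_] = v;                 // immediate, like a blocking put
      }
      return *this;
    }
  private:
    EpochArray& arr_;
    std::size_t idx_;
  };

  Ref operator[](std::size_t i) { return Ref(*this, i); }

  void start_epoch() { in_epoch_ = true; }

  // Completes all queued writes in issue order and leaves the epoch.
  void end_epoch() {
    for (const auto& w : pending_) data_[w.first] = w.second;
    pending_.clear();
    in_epoch_ = false;
  }

  // Returns the committed value, ignoring writes that are still queued.
  int peek(std::size_t i) const { return data_[i]; }
  std::size_t size() const { return data_.size(); }
  std::size_t pending() const { return pending_.size(); }

private:
  std::vector<int> data_;
  std::vector<std::pair<std::size_t, int>> pending_;
  bool in_epoch_ = false;
};
```

In this model, `arr[i] = i` between `start_epoch()` and `end_epoch()` leaves `peek(i)` unchanged until the epoch ends, while the same assignment outside an epoch takes effect immediately.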

Note: I am aware of dash::put_value_async in OneSided but I think it's not intuitively usable and quite wordy, e.g.,

dash::Array<int> arr(100);
if (dash::myid() == 0) {
  for (int i = 0; i < arr.size(); ++i) {
    dash::put_value_async(i, arr[i]); // is this correct?
  }
}
dash::fence(arr[0]); // sufficient to fence on the first element?
// all changes visible now

The documentation is not clear on the use of these methods and there are no tests/examples either. I also cannot efficiently mix dash::put_value_async and operator[] because the latter causes a flush. The idea of asynchronous epochs closely resembles the model employed by MPI-RMA (MPI_Put and MPI_Win_flush), just with a nicer interface :)

Please let me know what you think. Maybe there have been discussions about it earlier?

@devreal devreal added this to the dash-0.4.0 milestone Feb 10, 2017
@fuerlinger
Contributor

Something like that is definitely needed and there was previous discussion about it. Some parts of the functionality described below are in DASH already but I think it's not 100% working at the moment.

The idea is to provide functionality similar to what you describe with an .async modifier, like so:

for (int i = 0; i < arr.size(); ++i) {
  arr.async[i] = i; // not immediately visible
}
arr.flush(); // or arr.fence();

While arr[i] returns a dash::GlobRef, arr.async[i] returns a dash::AsyncGlobRef, which would issue put and get operations without a flush.
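As a rough illustration of how such a modifier can work, the sketch below uses two proxy types in plain C++. Everything in it is a hypothetical stand-in (`ToyArray`, `SyncRef`, `AsyncRef`, `AsyncView`), not the DASH implementation: the container exposes a member named `async` whose `operator[]` returns a different reference type that queues writes until `flush()`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy model only. ToyArray, SyncRef, AsyncRef and AsyncView are hypothetical
// stand-ins for illustration, not the DASH types named in this thread.
using WriteQueue = std::vector<std::pair<std::size_t, int>>;

// Reference whose assignment takes effect immediately.
class SyncRef {
public:
  SyncRef(std::vector<int>& data, std::size_t idx) : data_(data), idx_(idx) {}
  SyncRef& operator=(int v) { data_[idx_] = v; return *this; }
  operator int() const { return data_[idx_]; }
private:
  std::vector<int>& data_;
  std::size_t idx_;
};

// Reference whose assignment is only queued.
class AsyncRef {
public:
  AsyncRef(WriteQueue& queue, std::size_t idx) : queue_(queue), idx_(idx) {}
  AsyncRef& operator=(int v) { queue_.emplace_back(idx_, v); return *this; }
private:
  WriteQueue& queue_;
  std::size_t idx_;
};

// The object behind `arr.async`; it only hands out AsyncRef proxies.
class AsyncView {
public:
  explicit AsyncView(WriteQueue& queue) : queue_(queue) {}
  AsyncRef operator[](std::size_t i) { return AsyncRef(queue_, i); }
private:
  WriteQueue& queue_;
};

class ToyArray {
private:
  std::vector<int> data_;
  WriteQueue queue_;
public:
  explicit ToyArray(std::size_t n) : data_(n, 0), queue_(), async(queue_) {}
  ToyArray(const ToyArray&) = delete;             // async refers to our own queue_
  ToyArray& operator=(const ToyArray&) = delete;

  SyncRef operator[](std::size_t i) { return SyncRef(data_, i); }

  // Applies all queued writes in issue order.
  void flush() {
    for (const auto& w : queue_) data_[w.first] = w.second;
    queue_.clear();
  }

  AsyncView async;  // declared after queue_ so it is initialized last
};
```

With this layout, `arr[i] = v` and `arr.async[i] = v` share the same subscript syntax, and the choice between immediate and deferred completion is encoded in the reference type that each accessor returns.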

@rkowalewski

We have GlobAsyncRef, which should satisfy all the requirements.

@fuchsto
Member

fuchsto commented Feb 10, 2017

I introduced the GlobAsync* types quite some time ago and they do exactly that. We don't want to use the low-level function interface dash::put_value*, of course.
I'm about to define launch policies, though, so these won't be needed much longer. Otherwise we would need a separate type for every combination of Glob × (Local*, Async* × Deferred* ...) × (Ptr, Iter, Ref).
But the container concept persists:

array.async[i] = value

and

value = array.async[i]; value.wait()

Another related concept is dash::Future.

@fuchsto
Member

fuchsto commented Feb 10, 2017

... and the same launch policies will also be used for algorithms.
So the current interface

dash::copy_async(first, final)

will be changed to

dash::copy(dash::launch::async, first, final)

There's not much to it, just mimicking the C++11 concepts.
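For readers unfamiliar with the C++11 concept being mimicked, here is a small sketch built only on the standard library (`std::launch`, `std::async`, `std::future`). The helper name `copy_with_policy` is hypothetical and this is not the DASH signature; it only shows the pattern of passing a launch policy as the first argument of an algorithm.

```cpp
#include <algorithm>
#include <cassert>
#include <future>
#include <vector>

// Hypothetical helper, NOT a DASH function: forwards a standard C++11 launch
// policy to std::async so the caller decides how the copy is launched.
template <class InIt, class OutIt>
std::future<OutIt> copy_with_policy(std::launch policy,
                                    InIt first, InIt last, OutIt out) {
  // std::launch::async runs the copy on another thread;
  // std::launch::deferred runs it lazily when the future is waited on.
  return std::async(policy, [first, last, out] {
    return std::copy(first, last, out);
  });
}
```

A call such as `copy_with_policy(std::launch::deferred, src.begin(), src.end(), dst.begin())` performs no work until `get()` or `wait()` is called on the returned future, whereas `std::launch::async` starts the copy right away on a separate thread.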
Come to think of it, I already presented the new style at HPCC'16. I will take care of this during the semester break.

@devreal
Member Author

devreal commented Feb 10, 2017

Thanks for the clarifications, I wasn't aware that there were already activities in that direction in the DASH part. Looking forward to seeing this happen and leaving this ticket open to track progress.

@fuchsto fuchsto changed the title from "RFC: Asynchronous Epochs for DASH Containers" to "Launch policies for algorithms" Feb 13, 2017
@fuchsto
Member

fuchsto commented Feb 27, 2017

For the latest proposal on execution and launch policies, see my comment in #300

#300 (comment)
