-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Copy of C Array to Dash Array across multiple units #347
Conversation
It was not intended to submit the clang-format in this pull request. But now we have one. |
@fuchsto My changes assume that local memory is always contiguous. So it will probably not work with patterns where local memory is fragmented. Please consider for review this special case. I am not sure if any of our copy algorithms can handle this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments inline.
dash/include/dash/algorithm/Copy.h
Outdated
auto out_last = out_first + num_elements; | ||
if (num_elements < 1) return out_first; | ||
|
||
auto nremaining = static_cast<dash::default_size_t>(num_elements); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dangerous! dash::default_size_t
may be 32 bit wide while num_elements
is of type size_t
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::distance
cannot return size_t
because it may be negative. And then I check explicitly for a non-negative number.
Apart from that I would expect that dash::default_size_t
is the same as size_t
if available on the underlying platform. Is there any argument against this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, it is ssize_t
. The case I had in mind is the following: the user might decides to compile with the flag DASH_ENABLE_DEFAULT_INDEX_TYPE_INT
(for whatever reason) and then copy more than 4GB into the array, which causes num_elements
to be truncated into nremaining
. He might even pass size_t
as template parameter IndexType
to the global data structure so it's a valid operation. I think using the pattern's index-type is the safest bet. But that is certainly a highly pathologic case :)
dash/include/dash/algorithm/Copy.h
Outdated
auto local_size = pattern.local_extents(team_unit_t{local_pos.unit}); | ||
|
||
auto ncopy = std::min( | ||
std::initializer_list<dash::default_size_t>{local_size[0], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above.
dash/include/dash/algorithm/Copy.h
Outdated
dart_storage_t ds = dash::dart_storage<ValueType>(ncopy); | ||
|
||
DASH_ASSERT_RETURNS( | ||
dart_put_blocking( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use dart_put
and a final dart_flush
after the loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course you are right, thanks. I will do this. Did not think too much about this, I simply extended the already existing code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devreal I have considered your comment and modified it appropriately.
After having a more detailed look into it I noticed that copy_async
was not modified according to the changes in the blocking variant. The approach is now a bit more elegant in my opinion. dash::copy
on local ranges is now dash::copy_async
on local ranges followed by an immediate wait on the returned future.
As a consequence we do temporarily not support dash::GlobPtr
as input to dash::copy
because we need the pattern information in GlobIter
. This was a discussion in #246. However it is supported again after merging #345 which enables pointer arithmetics on dash::GlobPtr
across units.
dash/include/dash/algorithm/Copy.h
Outdated
auto out_last = out_first + num_elements; | ||
if (num_elements < 1) return out_first; | ||
|
||
auto nremaining = static_cast<dash::default_size_t>(num_elements); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::distance
cannot return size_t
because it may be negative. And then I check explicitly for a non-negative number.
Apart from that I would expect that dash::default_size_t
is the same as size_t
if available on the underlying platform. Is there any argument against this?
dash/include/dash/algorithm/Copy.h
Outdated
dart_storage_t ds = dash::dart_storage<ValueType>(ncopy); | ||
|
||
DASH_ASSERT_RETURNS( | ||
dart_put_blocking( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course you are right, thanks. I will do this. Did not think too much about this, I simply extended the already existing code.
Support for strided / chunked local ranges has been requested in issue #116. |
Ah, now I see your changes. No, the implementation of Some time ago I gave a talk on a DASH project meeting on In addition, the blocking algorithms in DASH should rely on specialized blocking operations provided by DART because these again can exploit the communication routines for this specific purpose - even if these are currently nearly identical. |
Addresses #346