Improvements to dash::accumulate #216

devreal · 2016-12-21T15:43:31Z

The current implementation of dash::accumulate has a flaw: only unit 0 (randomly chosen?) actually receives a valid result. This is neither documented nor does the user have a chance to influence this. I seem to remember that we had a similar discussion at our last F2F. IMO, either the user can choose which unit receives the result or it should be broadcast to all units. At the very least, it has to be documented...

Also, dash::accumulate should use dart_accumulate whenever possible.

Discovered while working on #212.


fuchsto · 2016-12-21T19:50:42Z

The current realization of dash::accumulate is somewhat irritating, yes. Oh yes.

First things first:

dart_accumulate semantically corresponds to MPI_Accumulate: C[i] = reduce(A[i], B[i])
dash::accumulate semantically corresponds to std::accumulate: Acc = reduce([AccInit] v A[s...e])

Consequently, dart_accumulate is not a semantically suitable backend for dash::accumulate.
C++ and MPI differ in their definitions of "accumulate" (C++ uses the term correctly; however, std::transform should be named std::map but can't be, because of the container of the same name. It's a bloody mess).
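To make the two meanings of "accumulate" concrete, here is a minimal sketch in plain C++ with no DASH or DART calls. The function names `reduce_range` and `combine_elementwise` are made up for illustration; the second one only emulates the element-wise behavior locally and is not the DART or MPI API.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// std::accumulate-style reduction: folds a whole range into ONE value,
// starting from an initial value (Acc = init + a[0] + a[1] + ...).
int reduce_range(const std::vector<int>& a, int init) {
    return std::accumulate(a.begin(), a.end(), init, std::plus<int>());
}

// Element-wise combination, emulated locally (hypothetical helper):
// c[i] = a[i] + b[i]. The output has as many elements as the inputs,
// which is the std::transform shape, not the std::accumulate shape.
std::vector<int> combine_elementwise(const std::vector<int>& a,
                                     const std::vector<int>& b) {
    assert(a.size() == b.size());
    std::vector<int> c(a.size());
    std::transform(a.begin(), a.end(), b.begin(), c.begin(),
                   std::plus<int>());
    return c;
}
```

Calling `reduce_range({1, 2, 3}, 10)` yields a single `int`, while `combine_elementwise({1, 2, 3}, {10, 20, 30})` yields a vector of three elements.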

Semantics of dart_accumulate correspond to dash::transform (~ std::transform), and it is in fact used as its communication backend:

https://github.com/dash-project/dash/blob/development/dash/include/dash/algorithm/Transform.h#L252

The best I could do for this clash of specs was to document it:

Standardese aside, the current implementation is broken.
I found another defect: the initial value is accumulated in every local partial result, but should only be accumulated once, at unit 0. Currently implemented behavior is:

// [a, b) = { 1,2,3, 4,5,6 }
// --> accumulate(a,b,+,10) = 31

unit 0:                          | unit 1:
=================================+==================================
dash::accumulate(a, b, +, 10);   | dash::accumulate(a, b, +, 10);
--> local result: 10 + 1 + 2 + 3 | --> local result: 10 + 4 + 5 + 6
    = 16                         |     = 25
----------------------------- barrier ------------------------------
result = 16 + 25 = 41 != 31
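The defect can be reproduced without any DASH code. The sketch below stands in for two units with plain vectors; `LocalRanges`, `accumulate_init_everywhere` and `accumulate_init_once` are hypothetical names, and the loop over units replaces the real communication step.

```cpp
#include <numeric>
#include <vector>

// Each inner vector stands in for the local elements of one unit.
// This is an emulation only; no DASH or DART calls are involved.
using LocalRanges = std::vector<std::vector<int>>;

// Behavior described above: every unit seeds its local partial
// result with the initial value, so init is counted once per unit.
int accumulate_init_everywhere(const LocalRanges& units, int init) {
    int result = 0;
    for (const auto& local : units) {
        result += std::accumulate(local.begin(), local.end(), init);
    }
    return result;
}

// Intended behavior: local partial results start from the identity
// of the operation (0 for +) and init is applied exactly once.
int accumulate_init_once(const LocalRanges& units, int init) {
    int result = init;
    for (const auto& local : units) {
        result += std::accumulate(local.begin(), local.end(), 0);
    }
    return result;
}
```

With the two local ranges `{1,2,3}` and `{4,5,6}` and an initial value of 10, the first function gives 41 and the second gives 31, matching the numbers in the diagram.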

devreal · 2016-12-22T09:21:41Z

Thanks for the enlightenment! Then maybe accumulate_impl should not reside in Accumulate.h since the semantics are different? Who owns the code?

fuchsto · 2016-12-22T09:29:31Z

Emm. You moved it there? Or @fmoessbauer did and you reviewed it:
71d57b7

I implemented the algorithm stuff and placed accumulate_impl in Transform.h
... but we all own the code!

devreal · 2016-12-22T09:30:58Z

I know. Let me rephrase: Who will fix it? ;)

fuchsto · 2016-12-22T09:31:53Z

Oh :D
I will, my original implementation is broken anyways.

fuchsto · 2016-12-22T09:32:54Z

... aaaand also: as it is defined in DASH, accumulate_impl should be named transform_impl, because that's the DASH semantics.

fuchsto · 2017-01-25T05:34:34Z

Resolved in PR #237

devreal · 2017-02-02T09:27:49Z

Reopening this issue after accidentally looking at the code for dash::transform: The way I understand the semantics of dash::transform is that it performs pair-wise operations on two input ranges into one output range. This is done using dart_accumulate, a wrapper around MPI_Accumulate. The latter does exactly what it says: it accumulates a single input range into a single output value. So, this is exactly what dash::accumulate does but is far from the semantics of dash::transform. Am I missing something here?

fuchsto · 2017-02-03T19:12:37Z

@devreal
MPI_Accumulate does not reduce a sequence to a single value, if I get the docs right:

https://www.mpich.org/static/docs/v3.1/www3/MPI_Accumulate.html

It performs an element-wise atomic update operation, similar to a put, but reduces origin and target data into the target buffer. This is why there is a parameter target_count. If the input sequence was reduced to a single value, it would always be 1 and therefore irrelevant.

So that's why dash::accumulate (DASH -> STL semantics) follows std::accumulate and dart_accumulate (DART -> MPI semantics) implements dash::transform / std::transform.

The synopsis described in these lecture slides might be more helpful than the vague definition in the MPI standard docs:

http://wgropp.cs.illinois.edu/courses/cs598-s16/lectures/lecture34.pdf

devreal · 2017-02-06T08:05:44Z

Right, got the docs wrong. Sorry for that.

devreal · 2017-02-13T09:58:23Z

I have to reopen this once again because the implementation of dash::accumulate is still broken, which was the actual reason for this issue. Please see my first paragraph in the initial description:

The current implementation of dash::accumulate has a flaw: only unit 0 (randomly chosen?) actually receives a valid result. This is neither documented nor does the user have a chance to influence this. I seem to remember that we had a similar discussion at our last F2F. IMO, either the user can choose which unit receives the result or it should be broadcast to all units. At the very least, it has to be documented...

I guess we got completely side-tracked by the discussion of the semantics of MPI_Accumulate etc but the original problem still persists.

EDIT: as a side note, there is another problem: the definition of result as auto will lead to really interesting results when used with floating point values...
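One way this kind of problem can show up, assuming the result type ends up being deduced from an integer initial value, is shown below with std::accumulate only. The function names are hypothetical, and the actual declaration in the DASH code may differ from this sketch.

```cpp
#include <numeric>
#include <vector>

// The accumulator type of std::accumulate is the type of the initial
// value. With the int literal 0, every intermediate sum is converted
// back to int, so fractional parts are discarded at each step.
auto sum_with_int_init(const std::vector<double>& v) {
    auto result = std::accumulate(v.begin(), v.end(), 0);   // deduced: int
    return result;
}

// With a double initial value the accumulator stays a double.
auto sum_with_double_init(const std::vector<double>& v) {
    auto result = std::accumulate(v.begin(), v.end(), 0.0); // deduced: double
    return result;
}
```

For the input `{0.5, 0.5, 0.5, 0.5}` the first function returns the `int` 0, while the second returns the `double` 2.0.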

EDIT2: why not have dash::accumulate deliver the result to all units in the team and introduce dash::accumulate_at (or an additional unit parameter to dash::accumulate), which accumulates to a single unit? Also, dash::accumulate can make use of dart_allreduce and dart_reduce for basic types.
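The difference between the two proposed variants can be sketched as a local emulation. Everything here is hypothetical: `accumulate_all` and `accumulate_at` are illustrative stand-ins written in plain C++17, not existing DASH functions, and a vector indexed by unit replaces the team.

```cpp
#include <cstddef>
#include <numeric>
#include <optional>
#include <vector>

// Each inner vector stands in for the local elements of one unit.
using LocalRanges = std::vector<std::vector<int>>;

// Global sum over all local ranges, with init applied once.
int global_sum(const LocalRanges& units, int init) {
    int result = init;
    for (const auto& local : units) {
        result += std::accumulate(local.begin(), local.end(), 0);
    }
    return result;
}

// Variant 1 (all-reduce style): every unit ends up with the result.
std::vector<int> accumulate_all(const LocalRanges& units, int init) {
    return std::vector<int>(units.size(), global_sum(units, init));
}

// Variant 2 (reduce-to-root style): only the chosen unit holds a
// valid result; all other entries stay empty.
std::vector<std::optional<int>> accumulate_at(const LocalRanges& units,
                                              int init,
                                              std::size_t root) {
    std::vector<std::optional<int>> per_unit(units.size());
    per_unit.at(root) = global_sum(units, init);
    return per_unit;
}
```

Returning `std::optional` in the second variant makes "no valid result on this unit" explicit in the type, which is one possible way to avoid the undocumented behavior described in the initial report.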

fuchsto · 2017-02-24T00:34:40Z

Will be fixed along with introduction of execution- and launch policies, see PR #300

devreal added enhancement module:algorithms labels Dec 21, 2016

fuchsto assigned fuchsto, fuerlinger, fmoessbauer, devreal and rkowalewski Dec 21, 2016

fuchsto closed this as completed Jan 25, 2017

devreal reopened this Feb 2, 2017

devreal closed this as completed Feb 6, 2017

devreal reopened this Feb 13, 2017

fuchsto mentioned this issue Feb 24, 2017

[WIP] Introducing execution- and launch policies #300

Open

fuchsto added this to the dash-0.3.0 milestone Feb 28, 2017

fuchsto mentioned this issue Mar 9, 2017

Introduce barriers to fix TransformTest #317

Merged

devreal mentioned this issue Oct 16, 2017

now Accumulate works for double too #440

Merged

devreal modified the milestones: dash-0.3.0, dash-0.4.0 Mar 22, 2018
