Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve thread-safety in DART #252

Merged
merged 25 commits into from
Feb 13, 2017
Merged

Improve thread-safety in DART #252

merged 25 commits into from
Feb 13, 2017

Conversation

devreal
Copy link
Member

@devreal devreal commented Jan 25, 2017

Improve thread-safety of certain DART operations and provide an interface for initialization of a thread-safe runtime.

Collective allocation and communication are now marked as thread-safe given that they are not called on the same team concurrently.

Addresses #118

Copy link
Member

@fmoessbauer fmoessbauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot see any unit tests of the thread safety features. Our CI containers do have MPI libs with thread support enabled. For the sake of completeness I added the docker scripts to this branch.

@devreal
Copy link
Member Author

devreal commented Jan 26, 2017

I added three test-cases, most importantly testing concurrent memory allocation on different teams and simple concurrent put: https://github.com/dash-project/dash/blob/bug-118-threadsafety/dash/test/ThreadsafetyTest.cc

Feel free to shoot me more ideas you might have :)

#define DASH_DART_BASE_ATOMIC_H_


#define DART_FETCH_AND_ADD64(ptr, val) __sync_fetch_and_add((int64_t *)(ptr), (val))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These definitions need a guard checking for availability of the atomic compiler extensions, something in the line of HAVE_SYNC_FETCH_AND_ADD.

For a from-scratch implementation, see:

Also, please reformat the code for 80 columns.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a guard flag and an unsafe alternative. CMake errors out if thread-support is requested but the builtins are not available.

I am aware of the EMBB code but I would like to avoid pulling in assembler code by all means because it is just not portable. Just think of moving to ARM... The __sync builtins are supported by all compilers we currently support. Ideally we would use the __atomic builtins or C11 atomics (at some point) but at least the latter is still not supported by the Intel compiler at this point.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree! We def' do not want to ship assembly (If we do some day: I'm actually way better at ARM asm than x86)

build.sh Outdated
@@ -59,6 +59,7 @@ rm -Rf $BUILD_DIR/*
-DDART_IF_VERSION=3.2 \
-DINSTALL_PREFIX=$HOME/opt/dash-0.3.0/ \
-DDART_IMPLEMENTATIONS=mpi \
-DENABLE_THREADING=ON \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please decide for either ENABLE_THREADING or ENABLE_MULTITHREADING.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up with ENABLE_THREADSUPPORT, which imo describes the semantics best.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Splendid!

* the underlying runtime.
*/
DART_THREAD_MULTIPLE = 10
} dart_thread_level_t;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Puns like "thread level" have my heartfelt support, but perhaps we should name it dart_thread_support_level_t?

>
using check_summa_pattern_constraints =
typename std::enable_if<
summa_pattern_constraints<MatrixTypeA>::satisfied::value &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be MatrixTypeA, MatrixTypeB, MatrixTypeC (see lines below)

@@ -121,6 +121,27 @@ typedef dash::pattern_layout_properties<
dash::pattern_layout_tag::linear
> summa_pattern_layout_constraints;

template<typename MatrixType>
using summa_pattern_constraints =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only works for the current N x N implementation of SUMMA. The apparently redundant original definition served as a reminder that pattern constraints may depend on each other.
This simplification is okay for now, but it should not be in a public namespace. Please move it to dash::internal or dash::detail.

#define DART_FETCH_AND_INC32(ptr) __sync_fetch_and_add((int32_t *)(ptr), 1)
#define DART_FETCH_AND_INC16(ptr) __sync_fetch_and_add((int16_t *)(ptr), 1)
#define DART_FETCH_AND_INC8(ptr) __sync_fetch_and_add((int8_t *)(ptr), 1)
#define DART_FETCH_AND_INCPTR(ptr) __sync_fetch_and_add((void **)(ptr), sizeof(**ptr))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sizeof(**ptr) should be sizeof(**(ptr)), same in line 34.

@devreal
Copy link
Member Author

devreal commented Feb 8, 2017

@fuchsto I think I addressed all of your points. Please review and let me know if you have any other points :)

@fuchsto
Copy link
Member

fuchsto commented Feb 8, 2017

@devreal On it, right after lunch!

Copy link
Member

@fuchsto fuchsto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine for me, but unit tests should be way more comprehensive.

#define DART_DEC_AND_FETCH8(ptr) \
__sync_sub_and_fetch((int8_t *)(ptr), 1)
#define DART_DEC_AND_FETCHPTR(ptr) \
__sync_sub_and_fetch((void **)(ptr), sizeof(**ptr))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be sizeof(**(ptr))
But don't bother, can fix this myself before merging.

arr_all.deallocate();
ASSERT_EQ_U(arr_all.size(), 0);
}
arr_all.allocate(team_all.size(), dash::DistributionSpec<1>(), team_all);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Increasing the size of the dash::Array instance might improve the probability to catch synchronization bugs.

@@ -0,0 +1,131 @@

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no tests of the introduced atomic operations (DART_COMPARE_AND_SWAP* etc.), right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, mainly because they are only used internally and not exposed to the outside world. It seems I cannot include dash/dart/base/atomic.h in a test file.

@devreal
Copy link
Member Author

devreal commented Feb 10, 2017

I will try to think of and create some more unit tests that could trigger errors.

@devreal
Copy link
Member Author

devreal commented Feb 13, 2017

I created some additional unit tests to cover dash::memalloc and DASH algorithms. Please review and merge if appropriate.

@fuchsto
Copy link
Member

fuchsto commented Feb 13, 2017

@devreal Splendid! Will merge as soon as CI is done.

@fuchsto
Copy link
Member

fuchsto commented Feb 13, 2017

Looks like the clang build in CI does not support OpenMP:

https://1203-45930061-gh.circle-artifacts.com/0/tmp/circle-junit.9IGEHUV/mpich/clang/Release/logs/build.log

Build system correctly reports the missing thread support:

-- Enable multithreading:    (ENABLE_THREADSUPPORT)          OFF

(While I'm at it: this line should be placed above, at the other ENABLE_ entries, and alignment of OFF is wrong)

This should be okay. The test case should be disabled if DASH / DART are built without OpenMP support, however.

@fuchsto fuchsto merged commit 9befaef into development Feb 13, 2017
@fuchsto fuchsto deleted the bug-118-threadsafety branch February 13, 2017 15:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants