This repository has been archived by the owner on Apr 15, 2024. It is now read-only.

Kokkos::fence

Christian Trott edited this page Feb 8, 2022 · 4 revisions

Header File: Kokkos_Core.hpp

Usage:

Kokkos::fence();

Blocks until all outstanding asynchronous Kokkos operations have completed. This includes parallel dispatch (e.g. parallel_for, parallel_reduce, and parallel_scan) as well as asynchronous data operations such as the three-argument deep_copy.

Note: there is also a fence specific to an execution space instance; see ExecutionSpaceConcept.

Interface

void Kokkos::fence();
void Kokkos::fence(const std::string& label);

Parameters

  • label: A label to identify a specific fence in fence profiling operations. label does not have to be unique.

Requirements

  • Kokkos::fence() cannot be called inside an existing parallel region (i.e. inside the operator() of a functor or lambda).

Semantics

  • Blocks until all outstanding asynchronous work has completed. Side effects of that work are observable once the fence call returns; in other words, Kokkos::fence() implies a memory fence.

Examples

Timing kernels

Kokkos::Timer timer;
// This operation is asynchronous; without a fence
// one would time only the launch overhead
Kokkos::parallel_for("Test", N, functor);
Kokkos::fence();
double time = timer.seconds();

Use with asynchronous deep copy

Kokkos::deep_copy(exec1, a,b);
Kokkos::deep_copy(exec2, a,b);
// do some stuff which doesn't touch a or b
Kokkos::parallel_for("Test", N, functor);

// wait for all three operations to finish
Kokkos::fence();

// do something with a and b
