Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][Graph] Refine barrier semantics #375

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 15 additions & 7 deletions sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -1736,15 +1736,23 @@ passed an invalid event.
The new handler methods, and queue shortcuts, defined by
link:../supported/sycl_ext_oneapi_enqueue_barrier.asciidoc[sycl_ext_oneapi_enqueue_barrier]
can only be used in graph nodes created using the Record & Replay API, as
barriers rely on events to enforce dependencies. For barriers with an empty
wait list parameter, the semantics are that the barrier node being added to
will depend on all the existing graph leaf nodes, not only the leaf nodes
that were added from the queue being recorded.
barriers rely on events to enforce dependencies.

A synchronous exception will be thrown with error code `invalid` if a user
tries to add them to a graph using the Explicit API. Empty nodes created with
the `node::depends_on_all_leaves` property can be used instead of barriers when
a user is building a graph with the explicit API.
tries to add a barrier command to a graph using the explicit API. Empty nodes
created with the `node::depends_on_all_leaves` property can be used instead of
barriers when a user is building a graph with the explicit API.

The semantics of barriers are defined in `sycl_ext_oneapi_enqueue_barrier` for
a single command-queue, and correlate as follows to a graph that may contain
nodes that are recorded from multiple queues and/or added by the explicit API:

* Barriers with an empty wait list parameter will only depend on the leaf nodes
that were added to the graph from the queue the barrier command is being
recorded from.

* The only commands which have an implicit dependency on the barrier command
are those recorded from the same queue the barrier command was submitted to.

==== sycl_ext_oneapi_memcpy2d

Expand Down
20 changes: 7 additions & 13 deletions sycl/source/detail/graph_impl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -353,9 +353,6 @@ graph_impl::add(const std::shared_ptr<graph_impl> &Impl,

const std::shared_ptr<node_impl> &NodeImpl = std::make_shared<node_impl>();

// Add any deps from the vector of extra dependencies
Deps.insert(Deps.end(), MExtraDependencies.begin(), MExtraDependencies.end());

MNodeStorage.push_back(NodeImpl);

addDepsToNode(NodeImpl, Deps);
Expand Down Expand Up @@ -488,20 +485,12 @@ graph_impl::add(node_type NodeType,
// list
Deps.insert(Deps.end(), UniqueDeps.begin(), UniqueDeps.end());

// Add any deps from the extra dependencies vector
Deps.insert(Deps.end(), MExtraDependencies.begin(), MExtraDependencies.end());

const std::shared_ptr<node_impl> &NodeImpl =
std::make_shared<node_impl>(NodeType, std::move(CommandGroup));
MNodeStorage.push_back(NodeImpl);

addDepsToNode(NodeImpl, Deps);

// Set barrier nodes as prerequisites (new start points) for subsequent nodes
if (NodeImpl->MCGType == sycl::detail::CG::Barrier) {
MExtraDependencies.push_back(NodeImpl);
}

return NodeImpl;
}

Expand Down Expand Up @@ -610,12 +599,17 @@ void graph_impl::makeEdge(std::shared_ptr<node_impl> Src,
removeRoot(Dest); // remove receiver from root node list
}

std::vector<sycl::detail::EventImplPtr> graph_impl::getExitNodesEvents() {
std::vector<sycl::detail::EventImplPtr> graph_impl::getExitNodesEvents(
std::weak_ptr<sycl::detail::queue_impl> RecordedQueue) {
std::vector<sycl::detail::EventImplPtr> Events;

auto RecordedQueueSP = RecordedQueue.lock();
for (auto &Node : MNodeStorage) {
if (Node->MSuccessors.empty()) {
Events.push_back(getEventForNode(Node));
auto EventForNode = getEventForNode(Node);
if (EventForNode->getSubmittedQueue() == RecordedQueueSP) {
Events.push_back(getEventForNode(Node));
}
}
}

Expand Down
46 changes: 23 additions & 23 deletions sycl/source/detail/graph_impl.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -1184,26 +1184,26 @@ class graph_impl {
size_t getNumberOfNodes() const { return MNodeStorage.size(); }

/// Traverse the graph recursively to get the events associated with the
/// output nodes of this graph.
/// output nodes of this graph associated with a specific queue.
/// @param[in] Queue The queue exit nodes must have been recorded from.
/// @return vector of events associated to exit nodes.
std::vector<sycl::detail::EventImplPtr> getExitNodesEvents();

/// Removes all Barrier nodes from the list of extra dependencies
/// MExtraDependencies.
/// @return vector of events associated to previous barrier nodes.
std::vector<sycl::detail::EventImplPtr>
removeBarriersFromExtraDependencies() {
std::vector<sycl::detail::EventImplPtr> Events;
for (auto It = MExtraDependencies.begin();
It != MExtraDependencies.end();) {
if ((*It)->MCGType == sycl::detail::CG::Barrier) {
Events.push_back(getEventForNode(*It));
It = MExtraDependencies.erase(It);
} else {
++It;
}
}
return Events;
getExitNodesEvents(std::weak_ptr<sycl::detail::queue_impl> Queue);

/// Store the last barrier node that was submitted to the queue.
/// @param[in] Queue The queue the barrier was recorded from.
/// @param[in] BarrierNodeImpl The created barrier node.
void setBarrierDep(std::weak_ptr<sycl::detail::queue_impl> Queue,
std::shared_ptr<node_impl> BarrierNodeImpl) {
MBarrierDependencyMap[Queue] = BarrierNodeImpl;
}

/// Get the last barrier node that was submitted to the queue.
/// @param[in] Queue The queue to find the last barrier node of. An empty
/// shared_ptr is returned if no barrier node has been recorded to the queue.
std::shared_ptr<node_impl>
getBarrierDep(std::weak_ptr<sycl::detail::queue_impl> Queue) {
return MBarrierDependencyMap[Queue];
}

private:
Expand Down Expand Up @@ -1281,11 +1281,11 @@ class graph_impl {
/// presence of the assume_buffer_outlives_graph property.
bool MAllowBuffers = false;

/// List of nodes that must be added as extra dependencies to new nodes when
/// added to this graph.
/// This list is mainly used by barrier nodes which must be considered
/// as predecessors for all nodes subsequently added to the graph.
std::list<std::shared_ptr<node_impl>> MExtraDependencies;
/// Mapping from queues to barrier nodes. For each queue the last barrier
/// node recorded to the graph from the queue is stored.
std::map<std::weak_ptr<sycl::detail::queue_impl>, std::shared_ptr<node_impl>,
fabiomestre marked this conversation as resolved.
Show resolved Hide resolved
std::owner_less<std::weak_ptr<sycl::detail::queue_impl>>>
MBarrierDependencyMap;
};

/// Class representing the implementation of command_graph<executable>.
Expand Down
26 changes: 12 additions & 14 deletions sycl/source/handler.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -403,19 +403,6 @@ event handler::finalize() {
case detail::CG::Barrier:
case detail::CG::BarrierWaitlist: {
if (auto GraphImpl = getCommandGraph(); GraphImpl != nullptr) {
// if no event to wait for was specified, we add all exit
// nodes/events of the graph
if (MEventsWaitWithBarrier.size() == 0) {
MEventsWaitWithBarrier = GraphImpl->getExitNodesEvents();
// Graph-wide barriers take precedence over previous one.
// We therefore remove the previous ones from ExtraDependencies list.
// The current barrier is then added to this list in the graph_impl.
std::vector<detail::EventImplPtr> EventsBarriers =
GraphImpl->removeBarriersFromExtraDependencies();
MEventsWaitWithBarrier.insert(std::end(MEventsWaitWithBarrier),
std::begin(EventsBarriers),
std::end(EventsBarriers));
}
CGData.MEvents.insert(std::end(CGData.MEvents),
std::begin(MEventsWaitWithBarrier),
std::end(MEventsWaitWithBarrier));
Expand Down Expand Up @@ -533,6 +520,7 @@ event handler::finalize() {
// it to the graph to create a node, rather than submit it to the scheduler.
if (auto GraphImpl = MQueue->getCommandGraph(); GraphImpl) {
auto EventImpl = std::make_shared<detail::event_impl>();
EventImpl->setSubmittedQueue(MQueue);
std::shared_ptr<ext::oneapi::experimental::detail::node_impl> NodeImpl =
nullptr;

Expand Down Expand Up @@ -564,7 +552,17 @@ event handler::finalize() {
// queue.
GraphImpl->setLastInorderNode(MQueue, NodeImpl);
} else {
NodeImpl = GraphImpl->add(NodeType, std::move(CommandGroup));
auto LastBarrierRecordedFromQueue = GraphImpl->getBarrierDep(MQueue);
if (LastBarrierRecordedFromQueue) {
NodeImpl = GraphImpl->add(NodeType, std::move(CommandGroup),
{LastBarrierRecordedFromQueue});
} else {
NodeImpl = GraphImpl->add(NodeType, std::move(CommandGroup));
}

if (NodeImpl->MCGType == sycl::detail::CG::Barrier) {
GraphImpl->setBarrierDep(MQueue, NodeImpl);
}
}

// Associate an event with this new node and return the event.
Expand Down
Loading
Loading