[SYCL][Graph] Add specification for kernel binary updates

Adds the kernel binary update feature to the SYCL Graph specification. This introduces a new dynamic_command_group class which can be used to update the command-group function of kernel nodes in graphs.
reble · Aug 1, 2024 · 4637510
1 parent 20bcfea
commit 4637510
Showing 1 changed file with 273 additions and 22 deletions.
diff --git a/sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc b/sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc
@@ -556,6 +556,114 @@ Parameters:
 
 |===
 
+==== Dynamic Command Groups
+
+[source,c++]
+----
+namespace ext::oneapi::experimental {
+class dynamic_command_group {
+public:
+  dynamic_command_group(
+      command_graph<graph_state::modifiable> graph,
+      const std::vector<std::function<void(handler &)>>& cgfList);
+
+  size_t get_active_cgf();
+  void set_active_cgf(size_t cgfIndex);
+};
+} // namespace ext::oneapi::experimental
+----
+
+Dynamic command-groups can be added as nodes to a graph. They provide a mechanism that
+allows updating the command-group function of a node after the graph is finalized.
+There is always one command-group function in the dynamic command-group that is set
+as active. When a dynamic command-group node is executed, the kernel of the active
+command-group function will be run and all the other command-group functions in
+`cgfList` will be ignored.
+
+See <<executable-graph-update, Executable Graph Update>> for more information
+about updating command-groups.
+
+===== Limitations
+
+Dynamic command-groups can only be used to update kernels. Trying to update a command-group
+function that contains other operations will result in an error.
+
+All the command-group functions in a dynamic command-group must have identical dependencies.
+It is not allowed for a dynamic command-group to have command-group functions that would
+result in a change to the graph topology when set to active. In practice, this means that
+any calls to `handler.depends_on()` must be identical for all the command-group functions
+in a dynamic command-group.
+
+Table {counter: tableNumber}. Member functions of the `dynamic_command_group` class.
+[cols="2a,a"]
+|===
+|Member Function|Description
+
+|
+[source,c++]
+----
+dynamic_command_group(
+    command_graph<graph_state::modifiable> graph,
+    const std::vector<std::function<void(handler &)>>& cgfList);
+----
+
+|Constructs a dynamic command-group object that can be added as a node to a `command_graph`.
+
+Parameters:
+
+* `graph` - Graph to be associated with this `dynamic_command_group`.
+* `cgfList` - The list of command-group functions that can be activated for this dynamic command-group.
+              The command-group function at index 0 will be active by default.
+
+Exceptions:
+
+* Throws synchronously with error code `invalid` if the graph was not created with
+  the `property::graph::assume_buffer_outlives_graph` property and the `dynamic_command_group`
+  is created with command-group functions that use buffers. See the
+  <<assume-buffer-outlives-graph-property, Assume-Buffer-Outlives-Graph>>
+  property for more information.
+
+* Throws with error code `invalid` if the `dynamic_command_group` is created with
+  command-group functions that are not kernel executions.
+
+* Throws with error code `invalid` if the command-group functions in `cgfList` have
+  event dependencies that are incompatible with each other and would result in
+  different graph topologies when set to active.
+
+|
+[source,c++]
+----
+size_t get_active_cgf();
+----
+|Returns the index of the currently active command-group function in this
+`dynamic_command_group`.
+
+|
+[source,c++]
+----
+void set_active_cgf(size_t cgfIndex);
+----
+| Sets the command-group function with index `cgfIndex` as active. The index of the
+command-group function in a `dynamic_command_group` is identical to its index in the
+`cgfList` vector when it was passed to the `dynamic_command_group` constructor.
+
+This change will be reflected immediately in the modifiable graph which contains this
+`dynamic_command_group`. The new value will not be reflected in any executable graphs
+created from that modifiable graph until `command_graph::update()` is called, passing
+the modified nodes, or a new executable graph is finalized from the modifiable graph.
+
+Setting `cgfIndex` to the index of the currently active command-group function is
+a no-op.
+
+Parameters:
+
+* `cgfIndex` - The index of the command-group function that should be set as active.
+
+Exceptions:
+
+* Throws with error code `invalid` if `cgfIndex` is not a valid index.
+
+|===
+
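The index semantics in the table above can be illustrated with a minimal plain C++ model. This is only a sketch and not the SYCL class: `ActiveCgfModel`, `run()`, the use of `int`-returning callables in place of command-group functions, and the standard exception types are all hypothetical stand-ins chosen so the example compiles without a SYCL implementation. The rejection of an empty list is an assumption, since the text above does not say how an empty `cgfList` is handled.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for dynamic_command_group (NOT the SYCL class).
// Each "command-group function" is modelled as a callable returning an int.
class ActiveCgfModel {
public:
  explicit ActiveCgfModel(std::vector<std::function<int()>> cgfList)
      : cgfs_(std::move(cgfList)) {
    // Assumption: an empty list is rejected; the specification text does
    // not state what happens in this case.
    if (cgfs_.empty())
      throw std::invalid_argument("cgfList must not be empty");
  }

  // The command-group function at index 0 is active by default.
  std::size_t get_active_cgf() const { return active_; }

  // Indices match positions in the list passed to the constructor.
  // Re-selecting the active index leaves the state unchanged.
  void set_active_cgf(std::size_t cgfIndex) {
    if (cgfIndex >= cgfs_.size())
      throw std::out_of_range("cgfIndex is not a valid index");
    active_ = cgfIndex;
  }

  // Only the active function runs; the others are ignored.
  int run() const { return cgfs_[active_](); }

private:
  std::vector<std::function<int()>> cgfs_;
  std::size_t active_ = 0;
};
```

In this model, constructing the object with two callables and calling `run()` executes the first one; after `set_active_cgf(1)`, `run()` executes the second. An out-of-range index throws and leaves the active index unchanged, which mirrors the `invalid` error described above.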
 ==== Depends-On Property
 
 [source,c++]
@@ -631,6 +739,8 @@ public:
   template<typename T>
   node add(T cgf, const property_list& propList = {});
 
+  node add(dynamic_command_group& dynamicCG, const property_list& propList = {});
+
   void make_edge(node& src, node& dest);
 
   void print_graph(std::string path, bool verbose = false) const;
@@ -711,21 +821,39 @@ Updates to a graph will be scheduled after any in-flight executions of the same
 graph and will not affect previous submissions of the same graph. The user is
 not required to wait on any previous submissions of a graph before updating it.
 
-The only type of nodes that are currently able to be updated in a graph are
-kernel execution nodes.
-
-The aspects of a kernel execution node that can be configured during update are:
-
-* Parameters to the kernel.
-* Execution ND-Range of the kernel.
-
 To update an executable graph, the `property::graph::updatable` property must
 have been set when the graph was created during finalization. Otherwise, an
 exception will be thrown if a user tries to update an executable graph. This
 guarantee allows the backend to provide a more optimized implementation, if
 possible.
 
-===== Individual Node Update
+===== Supported Features
+
+The only types of nodes that are currently able to be updated in a graph are
+kernel execution nodes.
+
+There are two different APIs that can be used to update a graph:
+
+* <<individual-node-update, Individual Node Update>> which allows updating
+individual nodes of a command-graph.
+* <<whole-graph-update, Whole Graph Update>> which allows updating the
+entirety of the graph simultaneously by using another graph as a
+reference.
+
+The aspects of a kernel execution node that can be changed during update are
+different depending on the API used to perform the update:
+
+* For the <<individual-node-update, Individual Node Update>> API it's possible to update
+the kernel function, the parameters to the kernel, and the ND-Range.
+* For the <<whole-graph-update, Whole Graph Update>> API, only the parameters of the kernel
+and the ND-Range can be updated.
+
+===== Individual Node Update [[individual-node-update]]
+
+Individual nodes of an executable graph can be updated directly. Depending on the attribute
+of the node that requires updating, different APIs should be used:
+
+====== Parameter Updates
 
 Parameters to individual nodes in a graph in the `executable` state can be
 updated between graph executions using dynamic parameters. A `dynamic_parameter`
@@ -739,14 +867,6 @@ Parameter updates are performed using a `dynamic_parameter` instance by calling
 not registered, even if they use the same parameter value as a
 `dynamic_parameter`.
 
-The other node configuration that can be updated is the execution range of the
-kernel, this can be set through `node::update_nd_range()` or
-`node::update_range()` but does not require any prior registration.
-
-The executable graph can then be updated by passing the updated nodes to
-`command_graph<graph_state::executable>::update(node& node)` or
-`command_graph<graph_state::executable>::update(const std::vector<node>& nodes)`.
-
 Since the structure of the graph became fixed when finalizing, updating
 parameters on a node will not change the already defined dependencies between
 nodes. This is important to note when updating buffer parameters to a node,
@@ -762,6 +882,41 @@ dynamic parameter for the buffer can be registered with all the nodes which
 use the buffer as a parameter. Then a single `dynamic_parameter::update()` call
 will maintain the graphs data dependencies.
 
+====== Execution Range Updates
+
+Another configuration that can be updated is the execution range of the
+kernel. This can be set through `node::update_nd_range()` or
+`node::update_range()` and does not require any prior registration.
+
+An alternative way to update the execution range of a node is to do so while
+updating command groups as described in the next section.
+
+====== Command Group Updates
+
+The command-groups of a kernel node can be updated using dynamic command-groups.
+Dynamic command-groups allow replacing the command-group function of a kernel
+node with a different one. This effectively allows updating the kernel function
+and/or the kernel execution range.
+
+Command-group updates are performed by creating an instance of the
+`dynamic_command_group` class. A dynamic command-group is created with a graph in the
+modifiable state and a list of possible command-group functions. Command-group functions
+within a dynamic command-group can then be set to active by using the member function
+`dynamic_command_group::set_active_cgf()`.
+
+Dynamic command-groups are compatible with dynamic parameters. This means that
+dynamic parameters can be used in command-group functions that are part of
+dynamic command-groups. Updates to such dynamic parameters will be reflected
+in the command-group functions once they are activated.
+
+====== Committing Updates
+
+Updating a node using the methods mentioned above will take effect immediately
+for nodes in modifiable command-graphs. However, for graphs that are in the executable
+state, in order to commit the update, the updated nodes must be passed to
+`command_graph<graph_state::executable>::update(node& node)` or
+`command_graph<graph_state::executable>::update(const std::vector<node>& nodes)`.
+
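The difference between immediate effect in the modifiable graph and committed effect in the executable graph can be sketched with a small plain C++ model. It is not the SYCL API: `ModifiableNodeModel`, `ExecutableGraphModel`, and `active_cgf()` are hypothetical names, and the model tracks only the active command-group index of a single node.

```cpp
#include <cstddef>

// Hypothetical model of a node in a modifiable graph (NOT the SYCL class).
// Changes to activeCgf are visible here immediately.
struct ModifiableNodeModel {
  std::size_t activeCgf = 0;
};

// Hypothetical model of an executable graph holding its own copy of the
// node state, taken at finalization and refreshed only by update().
class ExecutableGraphModel {
public:
  // Models finalization: the current node state is copied.
  explicit ExecutableGraphModel(const ModifiableNodeModel &node)
      : committed_(node.activeCgf) {}

  // Models update(node): commits the node's current state.
  void update(const ModifiableNodeModel &node) { committed_ = node.activeCgf; }

  // The state that a submission of the executable graph would use.
  std::size_t active_cgf() const { return committed_; }

private:
  std::size_t committed_;
};
```

In this model, changing `activeCgf` on the node after constructing an `ExecutableGraphModel` leaves `active_cgf()` at its old value until `update()` is called with that node.
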
 ===== Whole Graph Update [[whole-graph-update]]
 
 A graph in the executable state can have all of its nodes updated using the
@@ -1042,6 +1197,42 @@ Exceptions:
 |
 [source,c++]
 ----
+node add(dynamic_command_group& dynamicCG, const property_list& propList = {});
+----
+
+| Adds the dynamic command-group `dynamicCG` as a node to the graph and sets the
+currently active command-group function in `dynamicCG` as the executable for future
+executions of this graph node.
+
+The currently active command-group function in `dynamicCG` will be executed asynchronously
+when the graph is submitted to a queue. The requisites of this command-group
+function will be used to identify any dependent nodes in the graph
+to form edges with. The other command-group functions in `dynamicCG` will be captured
+into the graph but will not be executed in a graph submission unless they are
+set to active.
+
+Constraints:
+
+* This member function is only available when the `command_graph` state is
+  `graph_state::modifiable`.
+
+Parameters:
+
+* `dynamicCG` - Dynamic command-group object to be added as a node.
+
+* `propList` - Zero or more properties can be provided to the constructed node
+  via an instance of `property_list`. The `property::node::depends_on` property
+  can be passed here with a list of nodes to create dependency edges on.
+
+Returns: The dynamic command-group object node which has been added to the graph.
+
+Exceptions:
+
+* Throws synchronously with error code `invalid` if a queue is recording
+  commands to the graph.
+|
+[source,c++]
+----
 void make_edge(node& src, node& dest);
 ----
 
@@ -1157,8 +1348,9 @@ void update(node& node);
 ----
 
 | Updates an executable graph node that corresponds to `node`. `node` must be a
-kernel execution node. Kernel arguments and the ND-range of the node will be
-updated inside the executable graph to reflect the current values in `node`.
+kernel execution node. The command-group function of the node will be updated,
+inside the executable graph, to reflect the current values in `node`. This
+includes the kernel function, the kernel ND-range and the kernel parameters.
 
 Updating these values will not change the structure of the graph.
 
@@ -1190,9 +1382,10 @@ void update(const std::vector<node>& nodes);
 ----
 
 | Updates all executable graph nodes that corresponds to the nodes contained in
-`nodes`. All nodes must be kernel nodes. Kernel arguments and the ND-range of
-each node will be updated inside the executable graph to reflect the current
-values in each node in `nodes`.
+`nodes`. All nodes must be kernel nodes. The command-group function of each node
+will be updated, inside the executable graph, to reflect the current values in
+`nodes`. This includes the kernel function, the kernel ND-range and the kernel
+parameters.
 
 Updating these values will not change the structure of the graph.
 
@@ -1712,6 +1905,10 @@ the call to `queue::submit()` or `command_graph::add()` along with the calls to
 handler functions and this will not be reflected on future executions of the
 graph.
 
+Similarly, any command-group function inside a `dynamic_command_group` will be
+evaluated once, in index order, when submitted to the graph using
+`command_graph::add()`.
+
 Any code like this should be moved to a separate host-task and added to the
 graph via the recording or explicit APIs in order to be compatible with this
 extension.
@@ -2243,6 +2440,50 @@ node nodeA = myGraph.add([&](handler& cgh) {
 dynParamAccessor.update(bufferB.get_access());
 ----
 
+=== Dynamic Command Groups
+
+Example showing how a graph with a dynamic command group node can be updated.
+
+[source,c++]
+----
+queue Queue{};
+exp_ext::command_graph Graph{Queue.get_context(), Queue.get_device()};
+
+int *PtrA = malloc_device<int>(1024, Queue);
+int *PtrB = malloc_device<int>(1024, Queue);
+
+auto CgfA = [&](handler &cgh) {
+  cgh.parallel_for(1024, [=](item<1> Item) {
+    PtrA[Item.get_id()] = 1;
+  });
+};
+
+auto CgfB = [&](handler &cgh) {
+  cgh.parallel_for(512, [=](item<1> Item) {
+    PtrB[Item.get_id()] = 2;
+  });
+};
+
+// Construct a dynamic command-group with CgfA as the active cgf (index 0).
+auto DynamicCG = exp_ext::dynamic_command_group(Graph, {CgfA, CgfB});
+
+// Create a dynamic command-group graph node.
+auto DynamicCGNode = Graph.add(DynamicCG);
+
+auto ExecGraph = Graph.finalize(exp_ext::property::graph::updatable{});
+
+// The graph will execute CgfA.
+Queue.ext_oneapi_graph(ExecGraph).wait();
+
+// Sets CgfB as active in the dynamic command-group (index 1).
+DynamicCG.set_active_cgf(1);
+
+// Calls update to update the executable graph node with the changes to DynamicCG.
+ExecGraph.update(DynamicCGNode);
+
+// The graph will execute CgfB.
+Queue.ext_oneapi_graph(ExecGraph).wait();
+----
+
 === Whole Graph Update
 
 Example that shows recording and updating several nodes with different
@@ -2444,6 +2685,16 @@ to ensure this is desired and makes sense to users.
 
 **UNRESOLVED** Needs more discussion.
 
+=== Updatable command-groups in the Record & Replay API
+
+Currently the only way to update command-groups in a graph is to use the
+Explicit API. There is a limitation in some backends that requires all
+the command-groups used for updating to be specified before the graph
+is finalized. This restriction makes it hard to implement the
+Record & Replay API in a performant manner.
+
+**UNRESOLVED** Needs more discussion.
+
 === Multi Device Graph
 
 Allow an executable graph to contain nodes targeting different devices.