Lesson_Materials/Lesson-2-SYCL-Topology-Discovery-and-Queue-Creation/index.html

<!DOCTYPE html>

<html>
	<head>
	    <meta charset="utf-8">
		<link rel="stylesheet" href="../common-revealjs/css/reveal.css">
		<link rel="stylesheet" href="../common-revealjs/css/theme/white.css">
		<link rel="stylesheet" href="../common-revealjs/css/custom.css">
		<script>
			// This is needed when printing the slides to pdf
			var link = document.createElement( 'link' );
			link.rel = 'stylesheet';
			link.type = 'text/css';
			link.href = window.location.search.match( /print-pdf/gi ) ? 'css/print/pdf.css' : 'css/print/paper.css';
			document.getElementsByTagName( 'head' )[0].appendChild( link );
		</script>
		<script>
		    // This is used to display the static images on each slide,
			// See global-images in this html file and custom.css
			(function() {
				if(window.addEventListener) {
					window.addEventListener('load', () => {
						let slides = document.getElementsByClassName("slide-background");

						if (slides.length === 0) {
							slides = document.getElementsByClassName("pdf-page")
						}

						// Insert global images on each slide
						for(let i = 0, max = slides.length; i < max; i++) {
							let cln = document.getElementById("global-images").cloneNode(true);
							cln.removeAttribute("id");
							slides[i].appendChild(cln);
						}

						// Remove top level global images
						let elem = document.getElementById("global-images");
						elem.parentElement.removeChild(elem);
					}, false);
				}
			})();
		</script>
		
	</head>
	<body>
		<div class="reveal">
			<div class="slides">
				<div id="global-images" class="global-images">
					<img src="../common-revealjs/images/sycl_academy.png" />
					<img src="../common-revealjs/images/sycl_logo.png" />
					<img src="../common-revealjs/images/trademarks.png" />
					<img src="../common-revealjs/images/codeplay.png" />
				</div>
				<!--Slide 1-->
				<section class="hbox">
					<div class="hbox" data-markdown>
						## SYCL Topology Discovery and Queue Creation
					</div>
				</section>
				<!--Slide 2-->
				<section class="hbox" data-markdown>
					## Learning Objectives
					
					* Learn about the SYCL system topology and how to traverse it
					* Learn how to query information about a platform or device
					* Learn how to select a device; both manually and using device selectors
					* Learn how to configure a queue
					* Learn about the SYCL scheduler and how to enqueue work
                </section>
				<!--Slide 3-->
				<section class="hbox" data-markdown>
				    ## SYCL System Topology
				</section>
				<!--Slide 4-->
				<section>
					<div class="hbox" data-markdown>
						* A SYCL application can execute work across a range of different heterogeneous devices
						* The devices that are available in any given system are determined at runtime through topology discovery
					</div>
				</section>
				<!--Slide 5-->
				<section>
					<div class="container" data-markdown>
						* The SYCL runtime will discover a set of platforms that are available in the system
						  * Each platform represents a backend implementation such as Intel OpenCL and Nvidia OpenCL
						* The SYCL runtime will also discover all the devices available for each of those platforms
						  * CPU, GPU, FPGA, and other kinds of accelerators
					</div>
					<div class="col" data-markdown>
						![SYCL](../common-revealjs/images/devices-1.png "SYCL-Devices")
					</div>
				</section>
				<!--Slide 6-->
				<section>
					<div class="container" data-markdown>
						* In SYCL there is also a host device which executes SYCL kernels as native C++
						  * The host device emulates the execution and memory model of a SYCL device
						* This is very useful for debugging SYCL kernels
						* There is only ever one host device and that device is associated with a host platform
						  * This is generally a CPU implementation
					</div>
					<div class="container">
						<div class="col" data-markdown>
							![SYCL](../common-revealjs/images/devices-2.png "SYCL-Devices")
						</div>
					</div>
				</section>
				<!--Slide 7-->
				<section>
					<div class="container" data-markdown>
						* Platforms and devices are represented by the **platform** and **device** classes respectively
						* A default constructed platform object represents the host platform
						* A default constructed device object represents the host device
					</div>
					<div class="container">
						<div class="col" data-markdown>
							![SYCL](../common-revealjs/images/devices-3.png "SYCL-Devices")
						</div>
					</div>
				</section>
				<!--Slide 8-->
				<section class="hbox">
					<div class="hbox" data-markdown>
						* In SYCL there are two ways to query a system’s topology 
						  * The topology can be manually queried and iterated over via APIs of the platform and device classes 
						  * The topology can be automatically queried and iterated over using a use  specified heuristic by a device selector object
					</div>
                </section>
				<!--Slide 9-->
				<section class="hbox" data-markdown>
				    ## Querying the topology manually
				</section>
				<!--Slide 10-->
				<section>
					<div class="container">
						<div class="col-left-1">
							<code><pre>
auto platforms = platform::get_platforms();

							</code></pre>
						</div>
						<div class="col-right-1" data-markdown>
						    ![SYCL](../common-revealjs/images/devices-5.png "SYCL-Devices")
						</div>
					</div>
					<div class="smaller-size-font" data-markdown>
							* The platform class provides the static function **get_platforms**
							  * It retrieves a vector of all available platforms in the system
							* This includes the host platform
					</div>
				</section>
				<!--Slide 11-->
				<section>
					<div class="container">
						<div class="col-left-1">
							<code><pre>
auto intelDevices = intelPlatform.get_devices();

							</code></pre>
						</div>
						<div class="col-right-1" data-markdown>
						    ![SYCL](../common-revealjs/images/devices-6.png "SYCL-Devices")
						</div>
					</div>
					<div class="smaller-size-font" data-markdown>
							* The platform class provides the member function **get_devices** that
							  * It returns a vector of all devices associated with that platform
							* This includes the host device if the platform object represents a host platform
					</div>
				</section>
				<!--Slide 12-->
				<section>
					<div class="container">
						<div class="col-left-1">
							<code><pre>
auto devices = device::get_devices();

							</code></pre>
						</div>
						<div class="col-right-1" data-markdown>
						    ![SYCL](../common-revealjs/images/devices-7.png "SYCL-Devices")
						</div>
					</div>
					<div class="smaller-size-font" data-markdown>
							* The device class also provides the static function **get_devices** 
							  * It retrieves a vector of all available devices in the system
							* This includes the host device
					</div>
				</section>
				<!--Slide 13-->
				<section class="hbox" data-markdown>
				    ## Querying the topology using a device selector
				</section>
				<!--Slide 14-->
				<section>
					<div class="container">
						<div class="col-left-1" data-markdown>
							![Device Topology](../common-revealjs/images/topology-1.png "Device-Topology")
						</div>
						<div class="col-right-1" data-markdown>
							* To simplify the process of traversing the system topology SYCL provides device selectors
							* A device selector is is a C++ function object, derived from the **device_selector** class, which defines a heuristic for scoring devices
							* SYCL provides a number of standard device selectors, e.g. **default_selector**, **gpu_selector**, etc
							* Users can also create their own device selectors
						</div>
					</div>
				</section>
				<!--Slide 15-->
				<section>
					<div class="container">
						<div class="col-left-1">
							<code><pre>
auto gpuSelector = gpu_selector{};
auto gpuDevice = gpuSelector.select_device();
							</code></pre>
						</div>
						<div class="col-right-1" data-markdown>
						    ![Device Topology](../common-revealjs/images/topology-1.png "Device-Topology")
						</div>
					</div>
					<div class="container" data-markdown>
							* The **device_selector** class provides the member function **select_device**
							  * Queries all devices and returns the one with the highest "score"
							* A device with a negative score will never be chosen
					</div>
				</section>
				<!--Slide 16-->
				<section>
					<div class="container">
						<div class="col-left-1">
							<code><pre>
auto defSelector = default_selector{};
auto chosenDevice = defSelector.select_device();
							</code></pre>
						</div>
						<div class="col-right-1" data-markdown>
						    ![Device Topology](../common-revealjs/images/topology-1.png "Device-Topology")
						</div>
					</div>
					<div class="container" data-markdown>
							* The **default_selector** is a standard device selector type
							  * Chooses a device based on an implementation defined heuristic
					</div>
				</section>
				<!--Slide 17-->
				<section>
					<div class="hbox" data-markdown>
						## Custom Device Selectors
					</div>
				</section>
				<!--Slide 18-->
				<section>
					<div class="container">
						<code><pre>
#include &ltCL/sycl.hpp&gt
using namespace cl::sycl;

struct gpu_selector : <mark>public device_selector</mark> {

  <mark>int operator()(const device& dev) const override {</mark>

  <mark>}</mark>
};

int main(int argc, char *argv[]) {


}
						</code></pre>
					</div>
					<div class="hbox" data-markdown>
						* A device selector must inherit from the **device_selector** class
						* A device selector must have a function call operator which takes a reference to a device
					</div>
				</section>
				<!--Slide 19-->
				<section>
					<div class="container">
						<code><pre>
#include &ltCL/sycl.hpp&gt 
using namespace cl::sycl;

struct gpu_selector : public device_selector {

  int operator()(const device& dev) const override {
    <mark>if (dev.is_gpu()){</mark>
      <mark>return 1;</mark>
    <mark>}</mark>
    <mark>else {</mark>
      <mark>return -1;</mark>
    <mark>}</mark>
  }
};

int main(int argc, char *argv[]) {


}
						</code></pre>
					</div>
					<div class="hbox" data-markdown>
						* The body of the function call operator defines the heuristic for selecting devices
						* This is where you write the logic for scoring each device
					</div>
				</section>
				<!--Slide 20-->
				<section>
					<div class="container">
						<code><pre>
#include &ltCL/sycl.hpp&gt 
using namespace cl::sycl;

struct gpu_selector : public device_selector {

  int operator()(const device& dev) const override {
    if (dev.is_gpu()){ 
	  return 1; 
	} 
	else { 
	  return -1; 
	}


  }
};

int main(int argc, char *argv[]) {
  <mark>auto gpuQueue = queue{gpu_selector{}};</mark>

}
						</code></pre>
					</div>
					<div class="hbox" data-markdown>
						* Now that there is a device selector that chooses a specific device we can use that to construct a queue
					</div>
				</section>
				<!--Slide 21-->
				<section class="hbox" data-markdown>
				    ## Selecting A SYCL Device
				</section>
				<!--Slide 22-->
				<section>
					<div class="container">
						<div class="col-left-1">
							<code><pre>
auto plt = dev.get_platform();
auto platformName = dev.get_info&ltinfo::device::name&gt();

							</code></pre>
						</div>
						<div class="col-right-2" data-markdown>
						    ![SYCL](../common-revealjs/images/topology-2.png "SYCL-Devices")
						</div>
					</div>
					<div class="container" data-markdown>
							* Information about platforms and devices can be queried using the template member function **get_info**
							* The info that you are querying is specified by the template parameter
							* You can also query a device for its associated platform with the **get_platform** member function
					</div>
				</section>
				<!--Slide 23-->
				<section class="hbox" data-markdown>
					## What is a SYCL Queue
				</section>
				<!--Slide 24-->
				<section>
					<div class="container">
						<div class="col-left-1" data-markdown>
							![Queue](../common-revealjs/images/queue-1.png "SYCL-Queue")
						</div>
						<div class="col-right-1" data-markdown>
							* In SYCL the underlying execution and memory resources of a platform and its devices is managed by creating a context
							* A context represents one or more devices, but all devices must be associated with the same platform
						</div>
					</div>
				</section>
				<!--Slide 25-->
				<section>
					<div class="container">
						<div class="col-left-1" data-markdown>
							![Queue](../common-revealjs/images/queue-2.png "SYCL-Queue")
						</div>
						<div class="col-right-1" data-markdown>
							* In SYCL the object which is used to submit work is the queue
							* A queue processes command groups and submits commands to the SYCL scheduler for a particular context and device
							
						</div>
					</div>
				</section>
				<!--Slide 26-->
				<section>
					<div class="container">
						<div class="col-left-1" data-markdown>
							![Queue](../common-revealjs/images/queue-3.png "SYCL-Queue")
						</div>
						<div class="col-right-1" data-markdown>
							* A single SYCL application will often want to target multiple different devices
							* This can be useful for task level parallelism and load balancing
						</div>
					</div>
				</section>
				<!--Slide 27-->
				<section class="hbox" data-markdown>
				    ## Creating a Queue
				</section>
				<!--Slide 28-->
				<section>
					<div class="container">
						<div class="col-left-1">
							<code><pre>
auto defaultQueue = queue{};

							</code></pre>
						</div>
						<div class="col-right-1" data-markdown>
						    ![SYCL](../common-revealjs/images/queue-4.png "SYCL-Queue")
						</div>
					</div>
					<div class="smaller-size-font" data-markdown>
							* A default constructed queue object will use the **default_selector** to choose a device and create an implicit context
					</div>
				</section>
				<!--Slide 29-->
				<section>
					<div class="container">
						<div class="col-left-1">
							<code><pre>
auto intelGPUSelector = intel_gpu_selector{};
auto intelGPUQeuue = queue{intelGPUSelector};

							</code></pre>
						</div>
						<div class="col-right-1" data-markdown>
						    ![SYCL](../common-revealjs/images/queue-4.png "SYCL-Queue")
						</div>
					</div>
					<div class="smaller-size-font" data-markdown>
							* A queue object can be constructed from a device selector which is used to choose a device and create an implicit context
					</div>
				</section>
				<!--Slide 30-->
				<section class="hbox" data-markdown>
				    ## Submitting Work to a Queue
				</section>
				<!--Slide 31-->
				<section>
					<div class="container">
						<div class="col-left-1" data-markdown>
							![SYCL](../common-revealjs/images/queue-6.png "SYCL-Queue")
						</div>
					</div>
					<div class="smaller-size-font" data-markdown>
							* In SYCL work is submitted via a queue object
							* This is done using the **submit** member function
							* This will process the command group and submit commands to the SYCL scheduler for the context and device associated with the queue
					</div>
				</section>
				<!--Slide 32-->
				<section>
					<div class="container">
						<div class="col-left-1" data-markdown>
							![SYCL](../common-revealjs/images/queue-7.png "SYCL-Queue")
						</div>
					</div>
					<div class="smaller-size-font" data-markdown>
							* The same scheduler is used for all queues in order to share dependency information
					</div>
				</section>
				<!--Slide 33-->
				<section>
					<div class="container">
						<div class="col-left-1">
							<code><pre>
#include &ltCL/sycl.hpp&gt using namespace cl::sycl;
int main(int argc, char *argv[]) {
  queue gpuQueue(gpu_selector{});
  <mark>gpuQueue.submit([&](handler &cgh){</mark>
  
    // Command group
  
  <mark>});</mark>
  
  
}

							</code></pre>
						</div>
						<div class="col-right-1" data-markdown>
							* The **submit** member function takes a C++ function object, which takes a reference to a **handler** object
							* The function object can be a lambda or a class with a function call operator
							* The body of the function object represents the command group that is being submitted
							* The handler object is created by the SYCL runtime and is used to link commands and requirements declared inside the command group
						</div>
					</div>
				</section>
				<!--Slide 34-->
				<section>
					<div class="container">
						<div class="col-left-1">
							<code><pre>
#include &ltCL/sycl.hpp&gt using namespace cl::sycl;
int main(int argc, char *argv[]) {
  queue gpuQueue(gpu_selector{});
  <mark>gpuQueue.submit([&](handler &cgh){</mark>
  
    // Command group
  
  <mark>});</mark>
  
  
}

							</code></pre>
						</div>
						<div class="col-right-1" data-markdown>
							* The command group is processed exactly once when **submit** is called
							* At this point all the commands and requirements declared inside the command group are collected together, processed and passed on to the scheduler
							* The work is then enqueued to the device asynchronously by the SYCL scheduler, potentially in another thread
						</div>
					</div>
				</section>
				<!--Slide 35-->
				<section>
					<div class="container">
						<div class="col-left-1">
							<code><pre>
#include &ltCL/sycl.hpp&gt using namespace cl::sycl;
int main(int argc, char *argv[]) {
  queue gpuQueue(gpu_selector{});
  gpuQueue.submit([&](handler &cgh){
  
    // Command group
  
  });
  
  <mark>gpuQueue.wait();</mark>
  
}

							</code></pre>
						</div>
						<div class="col-right-1" data-markdown>
							* The queue object will not wait for work to complete on destruction
							* There are other ways to wait for work to complete if you have data dependencies
							* But it's often useful to be able to explicitly wait on a queue to complete any outstanding work
						</div>
					</div>
				</section>
				<!--Slide 36-->
				<section>
					<div class="hbox" data-markdown>
						## Handling Errors in SYCL
					</div>
				</section>
				<!--Slide 37-->
				<section class="hbox" data-markdown>
					* In SYCL errors are handled by throwing exceptions
					  *  It is crucial that these errors are handled otherwise your application may silently fail
					* In SYCL there are two kinds of error
					  * Synchronous errors (thrown in user thread) 
					  * Asynchronous errors (thrown by the SYCL runtime)
				</section>
				<!--Slide 38-->
				<section class="hbox">
					<div class="hbox" data-markdown>
						![SYCL](../common-revealjs/images/sycl-exceptions.png "SYCL")
					</div>
				</section>
				<!--Slide 39-->
				<section>
					<div class="hbox" >
						<code class="code-60pc"><pre>
#include &ltCL/sycl.hpp&gt
using namespace cl::sycl;
class add;

int main(int argc, char *argv[]) {
  std::vector&ltfloat&gt dA{ 7, 5, 16, 8 }, dB{ 8, 16, 5, 7 }, dO{ 0, 0, 0, 0 };
  queue gpuQueue(gpu_selector{});

  buffer&ltfloat, 1&gt bufA(dA.data(), range&lt1&gt(dA.size())); 
  buffer&ltfloat, 1&gt bufB(dB.data(), range&lt1&gt(dB.size())); 
  buffer&ltfloat, 1&gt bufO(dO.data(), range&lt1&gt(dO.size()));

  gpuQueue.submit([&](handler &cgh){
    auto inA = bufA.get_access&ltaccess::mode::read&gt(cgh); 
    auto inB = bufB.get_access&ltaccess::mode::read&gt(cgh); 
    auto out = bufO.get_access&ltaccess::mode::write&gt(cgh);

    cgh.parallel_for&ltadd&gt(range&lt1&gt(dA.size()), [=](id&lt1&gt i){
      out[i] = inA[i] + inB[i];
    }); 
  }); 
  gpuQueue.wait();
}
						</code></pre>
					</div>
					<div class="bottom-bullets" data-markdown>
							* If errors are not handled, the application can fail silently
					</div>
				</section>
				<!--Slide 40-->
				<section>
					<div class="hbox">
						<code class="code-60pc"><pre>
int main(int argc, char *argv[]) {
  std::vector&ltfloat&gt dA{ 7, 5, 16, 8 }, dB{ 8, 16, 5, 7 }, dO{ 0, 0, 0, 0 };
  <mark>try{</mark>
    queue gpuQueue(gpu_selector{});

    buffer&ltfloat, 1&gt bufA(dA.data(), range&lt1&gt(dA.size())); 
    buffer&ltfloat, 1&gt bufB(dB.data(), range&lt1&gt(dB.size())); 
    buffer&ltfloat, 1&gt bufO(dO.data(), range&lt1&gt(dO.size()));

    gpuQueue.submit([&](handler &cgh){
      auto inA = bufA.get_access&ltaccess::mode::read&gt(cgh); 
      auto inB = bufB.get_access&ltaccess::mode::read&gt(cgh); 
      auto out = bufO.get_access&ltaccess::mode::write&gt(cgh);

      cgh.parallel_for&ltadd&gt(range&lt1&gt(dA.size()), [=](id&lt1&gt i){
        out[i] = inA[i] + inB[i];
      }); 
    });
    gpuQueue.wait();
  <mark>} catch (...) { /* handle errors */ }</mark>
}
						</code></pre>
					</div>
					<div class="bottom-bullets" data-markdown>
							* Synchronous errors are typically thrown by SYCL API functions
							* In order to handle all SYCL errors you must wrap everything in a try-catch block
					</div>
				</section>
				<!--Slide 41-->
				<section>
					<div class="hbox">
						<code class="code-60pc"><pre>
int main(int argc, char *argv[]) {
  std::vector&ltfloat&gt dA{ 7, 5, 16, 8 }, dB{ 8, 16, 5, 7 }, dO{ 0, 0, 0, 0 };
  try{
    queue gpuQueue(gpu_selector{}, <mark>async_handler{}</mark>);

    buffer&ltfloat, 1&gt bufA(dA.data(), range&lt1&gt(dA.size())); 
    buffer&ltfloat, 1&gt bufB(dB.data(), range&lt1&gt(dB.size())); 
    buffer&ltfloat, 1&gt bufO(dO.data(), range&lt1&gt(dO.size()));

    gpuQueue.submit([&](handler &cgh){
      auto inA = bufA.get_access&ltaccess::mode::read&gt(cgh); 
      auto inB = bufB.get_access&ltaccess::mode::read&gt(cgh); 
      auto out = bufO.get_access&ltaccess::mode::write&gt(cgh);

      cgh.parallel_for&ltadd&gt(range&lt1&gt(dA.size()), [=](id&lt1&gt i){
        out[i] = inA[i] + inB[i];
      }); 
    });
    <mark>gpuQueue.wait_and_throw();</mark>
  } catch (...) { /* handle errors */
}
						</code></pre>
					</div>
					<div class="bottom-bullets" data-markdown>
							* Asynchronous errors errors that may have occurred will be thrown after a command group has been submitted to a queue 
							  * To handle these errors you must provide an async handler when constructing the queue object
							* Then you must also call the **throw_asynchronous or wait_and_throw** member functions of the queue class
							* This will pass the exceptions to the async handler in the user thread so they can be thrown
					</div>
				</section>
				<!--Slide 42-->
				<section>
					<div class="hbox">
						<code class="code-60pc"><pre>
int main(int argc, char *argv[]) {
  std::vector&ltfloat&gt dA{ 7, 5, 16, 8 }, dB{ 8, 16, 5, 7 }, dO{ 0, 0, 0, 0 };
  try{
    queue gpuQueue(gpu_selector{}, <mark>[=](sycl::exception_list eL) {</mark>
        <mark>for (auto e : eL) { std::rethrow_exception(e); }</mark>
    <mark>}</mark>);

    buffer&ltfloat, 1&gt bufA(dA.data(), range&lt1&gt(dA.size())); 
    buffer&ltfloat, 1&gt bufB(dB.data(), range&lt1&gt(dB.size())); 
    buffer&ltfloat, 1&gt bufO(dO.data(), range&lt1&gt(dO.size()));

    gpuQueue.submit([&](handler &cgh){ // Command group submitted to queue
      auto inA = bufA.get_access&ltaccess::mode::read&gt(cgh); 
      auto inB = bufB.get_access&ltaccess::mode::read&gt(cgh); 
      auto out = bufO.get_access&ltaccess::mode::write&gt(cgh);

      cgh.parallel_for&ltadd&gt(range&lt1&gt(dA.size()), [=](id&lt1&gt i){
        out[i] = inA[i] + inB[i];
      }); 
    });	
    gpuQueue.wait_and_throw(); } catch (...) { /* handle errors */ }
}
						</code></pre>
					</div>
					<div class="bottom-bullets" data-markdown>
							* The async handler is a C++ lambda or function object that takes as a parameter an **exception_list**
							* The exception_list class is a wrapper around a list of **exception_ptrs** which can be iterated over
							* The exception_ptrs can be rethrown by passing them to **std::rethrow_exception**
					</div>
				</section>
				<!--Slide 43-->
				<section>
						<div class="hbox">
							<code class="code-60pc"><pre>
int main(int argc, char *argv[]) {
  std::vector&ltfloat&gt dA{ 7, 5, 16, 8 }, dB{ 8, 16, 5, 7 }, dO{ 0, 0, 0, 0 };
  try { 
    queue gpuQueue(gpu_selector{}, [=](sycl::exception_list eL) {
      for (auto e : eL) { std::rethrow_exception(e); } 
    });
  ...
    gpuQueue.wait_and_throw(); 
  } catch (<mark>std::exception e</mark>) { 
    <mark>std::cout << “Exception caught: ” << e.what()</mark>
      <mark><< std::endl;</mark>
  }
}
							</code></pre>
						</div>
						<div class="bottom-bullets" data-markdown>
							* Once caught, a SYCL exception can provide information about the error
							* The **what** member function will return a string with more details
						</div>
				</section>
				<!--Slide 44-->
				<section>
						<div class="hbox">
							<code class="code-60pc"><pre>
int main(int argc, char *argv[]) {
  std::vector&ltfloat&gt dA{ 7, 5, 16, 8 }, dB{ 8, 16, 5, 7 }, dO{ 0, 0, 0, 0 };
  try { 
    queue gpuQueue(gpu_selector{}, [=](sycl::exception_list eL) {
      for (auto e : eL) { std::rethrow_exception(e); } 
    });
  ...
    gpuQueue.wait_and_throw(); 
  } catch (std::exception e) { 
    std::cout << “Exception caught: ” << e.what();
    <mark>std:: cout << “ With OpenCL error code: ”</mark> 
     <mark><< e.get_cl_code() << std::endl;</mark>
  }
}
							</code></pre>
						</div>
						<div class="bottom-bullets" data-markdown>
							* If the exception has an OpenCL error code associated with it this can be retrieved by calling the <mark>get_cl_code</mark> member function
							* If there is no OpenCL error code this will return <mark>CL_SUCCESS</mark>
						</div>
				</section>
				<!--Slide 45-->
				<section>
						<div class="hbox">
							<code class="code-60pc"><pre>
int main(int argc, char *argv[]) {
  std::vector&ltfloat&gt dA{ 7, 5, 16, 8 }, dB{ 8, 16, 5, 7 }, dO{ 0, 0, 0, 0 };
  try { 
    queue gpuQueue(gpu_selector{}, [=](sycl::exception_list eL) {
      for (auto e : eL) { std::rethrow_exception(e); } 
    });
  ...
    gpuQueue.wait_and_throw(); 
  } catch (std::exception e) { 
    <mark>if (e.has_context()) {</mark>
      <mark>if (e.get_context() == gpuContext) {</mark>
        <mark>/* handle error */</mark>
      <mark>}</mark>
    <mark>}</mark>
  }
}
							</code></pre>
						</div>
						<div class="bottom-bullets" data-markdown>
							* The **has_context** member function will tell you if there is a SYCL context associated with the error
							* If that returns true then the **get_context** member function will return the associated SYCL context object
						</div>
				</section>
				<!--Slide 46-->
				<section>
					<div class="hbox" data-markdown>
						## Debugging SYCL Kernel Functions
					</div>
				</section>
				<!--Slide 47-->
				<section>
					<div class="hbox" data-markdown>
						*  Every SYCL implementation is required to provide a host device
						  * This device executes native C++ code but is guaranteed to emulate the SYCL execution and memory model
						* This means you can debug a SYCL kernel function by switching to the host device and using a standard C++ debugger
						  * For example gdb
					</div>
				</section>
				<!--Slide 48-->
				<section>
						<div class="hbox">
							<code class="code-60pc"><pre>
int main(int argc, char *argv[]) {
  std::vector&ltfloat&gt dA{ 7, 5, 16, 8 }, dB{ 8, 16, 5, 7 }, dO{ 0, 0, 0, 0 };
  try{
    <mark>queue hostQueue(host_selector{}, async_handler{});</mark>
    buffer&ltfloat, 1&gt bufA(dA.data(), range&lt1&gt(dA.size())); 
    buffer&ltfloat, 1&gt bufB(dB.data(), range&lt1&gt(dB.size())); 
    buffer&ltfloat, 1&gt bufO(dO.data(), range&lt1&gt(dO.size()));

    <mark>hostQueue</mark>.submit([&](handler &cgh){
      auto inA = bufA.get_access&ltaccess::mode::read&gt(cgh); 
      auto inB = bufB.get_access&ltaccess::mode::read&gt(cgh); 
      auto out = bufO.get_access&ltaccess::mode::write&gt(cgh);

      cgh.parallel_for&ltadd&gt(range&lt1&gt(dA.size()), 
       [=](id&lt1&gt i){out[i] = inA[i] + inB[i];}); 
    });
    gpuQueue.wait_and_throw();
  } catch (...) { /* handle errors */ }
}
							</code></pre>
						</div>
						<div class="bottom-bullets" data-markdown>
							* Any SYCL application can be debugged on the host device by switching the queue for a host queue
							* By replacing the device selector for the host_selector will ensure that the queue submits all work to the host device
						</div>
				</section>
				<!--Slide 49-->
				<section data-markdown>
					## Questions
				</section>
			</div>
		</div>
		<script src="../common-revealjs/js/reveal.js"></script>
		<script src="../common-revealjs/plugin/markdown/marked.js"></script>
		<script src="../common-revealjs/plugin/markdown/markdown.js"></script>
		<script src="../common-revealjs/plugin/notes/notes.js"></script>
		<script>
			Reveal.initialize({mouseWheel: true, defaultNotes: true});
			Reveal.configure({ slideNumber: true });
		</script>
	</body>
</html>