Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What does "Exception-handling cannot be used inside a device function" actually mean? #543

Open
tahonermann opened this issue Mar 8, 2024 · 22 comments

Comments

@tahonermann
Copy link

tahonermann commented Mar 8, 2024

Section 5.4 (Language restrictions for device functions) of revision 8 of the SYCL 2020 specification states:

Exception-handling cannot be used inside a device function. noexcept is allowed.

What does "Exception-handling cannot be used" mean? (and why is that hyphen there?)

  • Does it mean that use of exception handling language features makes the program ill-formed?
    • For all exception handling language features? Or just throw expressions? Should try and catch be allowed given that they are allowed in constexpr evaluation in C++20 courtesy of adoption of P1002R1 (Try-catch blocks in constexpr functions)?
    • For standard library functions like std::current_exception()?
  • Or perhaps it means that exception handling language features don't make the program ill-formed, but that evaluating a throw expression perhaps results in some unspecified or undefined behavior?

Would an implementation that allows use of exception handling language features in device code such that evaluation of a throw expression results in an immediate call to std::terminate() or std::abort() be a conforming implementation? Such a translation mode would allow more C++ code, including C++ standard library features, to be usable in device code.

@gmlueck
Copy link
Contributor

gmlueck commented Mar 11, 2024

This is a good point. It does seem like we should allow try / catch blocks in device code for the same reason they are allowed in C++ constexpr functions. If there is no way to throw an exception in device code, the catch blocks are just no-ops, so why disallow them? Therefore, I would be in favor of changing this restriction to say:

A device function must not contain a throw statement.

  • For standard library functions like std::current_exception()?

This is already disallowed in device code by the following bullet point that appears later in this same section of the SYCL spec:

The rules for kernels apply to both the kernel function objects themselves and all functions, operators, member functions, constructors and destructors called by the kernel. This means that kernels can only use library functions that have been adapted to work with SYCL. Implementations are not required to support any library routines in kernels beyond those explicitly mentioned as usable in kernels in this spec.

Would an implementation that allows use of exception handling language features in device code such that evaluation of a throw expression results in an immediate call to std::terminate() or std::abort() be a conforming implementation? Such a translation mode would allow more C++ code, including C++ standard library features, to be usable in device code.

Answering narrowly ... such an implementation would be conformant unless we say that violation of the rules in section 5.4 make a program "ill-formed". (My understanding is that the core C++ spec defines "ill-formed" as a requirement that the implementation must diagnose an error in this case, right?) As the wording is now, we don't say that violations of these requirements make a program ill-formed, so I think it is up to the quality of the implementation to decide whether a diagnostic is emitted.

Answering more broadly ... this doesn't seem like a good implementation to me. Implementing all throw statements as though they were std::abort() seems like it would lead to confusion. If an implementation wants to allow a standard C++ feature in device code, I think it should have the same semantics as in standard C++.

@AlexeySachkov
Copy link
Contributor

Answering more broadly ... this doesn't seem like a good implementation to me. Implementing all throw statements as though they were std::abort() seems like it would lead to confusion. If an implementation wants to allow a standard C++ feature in device code, I think it should have the same semantics as in standard C++.

We could require any SYCL kernel or device function to be noexcept, then each throw is expected to be replaced with std::terminate (well, the exact details are more complicated than that, but this is what happens in most cases by default). At least that is a direction which could be explored. Note that it will require implementations to support some early exits or abnormal terminations of kernels, which is not something that is implemented by every vendor. This would definitely raise an entry bar and produce questions like which memory operations are made visible to host with respect to throw and which are not, i.e. whether or not some "speculative" execution after throw is allowed.

@nliber
Copy link
Collaborator

nliber commented Mar 11, 2024

@AlexeySachkov When you say you want to require that and SYCL kernel/device function be noexcept, do you mean

  • They cannot call anything that is not marked noexcept?
  • They are implicitly marked noexcept?

Both are problematic.

Most functions, even though they don't throw, aren't marked noexcept. Forcing users to write that everywhere is just not practical, makes it next to impossible to use third party libraries, etc.

As of C++17, noexcept is part of the type system, so implicitly declaring kernel/device functions as noexcept can lead to ODR violations.

It also doesn't solve the problem. This is perfectly valid C++ code which doesn't involve functions being declared noexcept:

void foo() {
    try {
        throw 0;
    }
    catch (int) {
        DoSomething();
    }
}

(A non-library only implementation of SYCL could obviously make the above work inside a kernel, but it isn't terribly useful.)

I agree with others in that throw being executed inside the kernel is the issue, and that is what we should focus upon.

@tahonermann
Copy link
Author

The motivation for filing this issue was a developer complaining that they cannot use the value() member of std::optional in kernel code even when there is a check for has_value() that ensures an exception won't be thrown. For example, the following code is rejected: https://godbolt.org/z/59483xcb3.

#include <optional>
#include <sycl.hpp>
void sink(int) {}
int main() {
  sycl::queue q(sycl::cpu_selector_v);
  q.submit(
    [=](sycl::handler &cgh) {
      int i = 42;
      cgh.single_task<class K>(
        [i] {
          std::optional<int> oi;
          if (i) oi = i;
          if (oi.has_value())
            sink(oi.value());
        }
      );
    }
  );
  q.wait();
}

@tahonermann
Copy link
Author

Answering narrowly ... such an implementation would be conformant unless we say that violation of the rules in section 5.4 make a program "ill-formed". (My understanding is that the core C++ spec defines "ill-formed" as a requirement that the implementation must diagnose an error in this case, right?) As the wording is now, we don't say that violations of these requirements make a program ill-formed, so I think it is up to the quality of the implementation to decide whether a diagnostic is emitted.

My understanding of ill-formed matches yours.

Answering more broadly ... this doesn't seem like a good implementation to me. Implementing all throw statements as though they were std::abort() seems like it would lead to confusion. If an implementation wants to allow a standard C++ feature in device code, I think it should have the same semantics as in standard C++.

Ideally, I agree. I don't know the history of why exceptions are not allowed in device code though I can imagine questions of what happens when a worker spawned for a sycl::parallel_for() call throws having non-obvious answers. I imagine treating the host/device divide as a noexcept barrier would be relatively easy to specify and implement though.

Specifying that throw expressions result in a call to std::terminate() (either immediately for generated code or delayed for SYCL library-only implementations or device compilers that support exceptions) would be a stepping stone towards eventually adding full exception support in device code. However, it has the significant downside that unintended use of exceptions today is caught at compile-time while a change to allow throw expressions in device code would postpone catching such issues until run-time.

@tahonermann
Copy link
Author

I agree with others in that throw being executed inside the kernel is the issue, and that is what we should focus upon.

Agreed. That is consistent with use of throw expressions in constexpr functions.

constexpr int f(bool b) {
  try {
    if (b) throw 0;
  } catch (int) {}
  return 0;
}
int main() {
  constexpr int i1 = f(false); // Ok.
  constexpr int i2 = f(true);  // Ill-formed.
}

@gmlueck
Copy link
Contributor

gmlueck commented Mar 11, 2024

The motivation for filing this issue was a developer complaining that they cannot use the value() member of std::optional in kernel code even when there is a check for has_value() that ensures an exception won't be thrown. For example, the following code is rejected: https://godbolt.org/z/59483xcb3.

I think this illustrates a common problem that we will have if we want to allow applications to call (some of the) std functions from device code. Any function that throws an exception under certain circumstances will have this problem. The list of restrictions in section 5.4 and the optional kernel features listed in section 5.7 apply when these features are present in device functions, regardless of whether the code using the features is actually executed. I think we want to maintain this wording, otherwise it will be impossible to diagnose cases when kernels inadvertently use features that are incompatible with device code.

Ideally, I agree. I don't know the history of why exceptions are not allowed in device code though I can imagine questions of what happens when a worker spawned for a sycl::parallel_for() call throws having non-obvious answers. I imagine treating the host/device divide as a noexcept barrier would be relatively easy to specify and implement though.

This last part isn't true. It's actually quite difficult to implement std::abort on some devices. This is the main reason we don't have assert in device code.

This is also the reason we prohibit the features listed in section 5.7 "Optional kernel features" from appearing in device code (vs. being used in device code). We can diagnose an error at the point when a kernel is submitted to a device if the kernel contains a feature that the device doesn't support. The logic for doing this is all on the host. By contrast, if we allowed a kernel to contain an unsupported feature and tried to diagnose an error only if the feature is actually used, then we would somehow need to implement assert-like functionality in the device.

@tahonermann
Copy link
Author

This last part isn't true. It's actually quite difficult to implement std::abort on some devices. This is the main reason we don't have assert in device code.

Thank you, I wasn't aware of the motivation for these limitations.

@keryell
Copy link
Member

keryell commented Mar 21, 2024

I don't know the history of why exceptions are not allowed in device code

Because it is not handled well by others APIs SYCL was built on top of at the first place.

though I can imagine questions of what happens when a worker spawned for a sycl::parallel_for() call throws having non-obvious answers.

Actually we speculatively followed some optimistic C++17 std::exception_list proposals and agreed on handling sycl::exception_list to have you covered. std::exception_list was abandoned but sycl::exception_list is still here, even if it is not used for this case anymore.

@keryell
Copy link
Member

keryell commented Mar 21, 2024

Should try and catch be allowed given that they are allowed in constexpr evaluation in C++20 courtesy of adoption of P1002R1 (Try-catch blocks in constexpr functions)?

This will be handled by #388

@gmlueck
Copy link
Contributor

gmlueck commented Mar 21, 2024

Should try and catch be allowed given that they are allowed in constexpr evaluation in C++20 courtesy of adoption of P1002R1 (Try-catch blocks in constexpr functions)?

This will be handled by #388

I think there might be some confusion here. I believe @tahonermann is suggesting that try and catch should be allowed in any device code, not just in constant expressions. The change proposed in #388 would only allow them in device code that is a constant expression.

@tahonermann
Copy link
Author

Thank you for that correction, @gmlueck. Yes, you understood my intent correctly. I read #388, but it wasn't clear (to me) from its description how relevant it is to this request.

The suggestion in this issue is to allow a function like the following in device code unconditionally.

void f1(bool b) {
  try {
    f2(b);
  }
  catch (...) {
  }
}

And to perhaps allow a function like the following with a restriction that calling it with a true argument for b results in some kind of error at run-time (regardless of whether the call is made in a try context as in f1() above).

void f2(bool b) {
  if (b) {
    throw 42;
  }
}

Alternatively, a function that contains a throw expression could remain ill-formed, but that would fall short of the goal to allow use of std::optional in device code.

@illuhad
Copy link
Contributor

illuhad commented Mar 25, 2024

Just tried, @tahonermann's godbolt example compiles and runs with AdaptiveCpp generic SSCP compiler on CPU/Intel GPU/NVIDIA GPU/AMD GPU. But I'm not sure how reliable it is.

Here it just generates an empty kernel, so it's quite possible that in this simple example it could just optimize anything away that could cause a problem.

I wouldn't be surprised if some backends choked if some instructions around try/catch actually made it to backend code generation.

@nliber
Copy link
Collaborator

nliber commented Mar 25, 2024

Instead of trying to support all of something like optional, would it be sufficient to just support the freestanding parts? I do t think any of those require exceptions.

@nliber
Copy link
Collaborator

nliber commented Mar 25, 2024

Instead of trying to support all of something like optional, would it be sufficient to just support the freestanding parts? I don't think any of those require exceptions.

@gmlueck
Copy link
Contributor

gmlueck commented Mar 25, 2024

I don't know if the freestanding definition helps us much here. My knowledge of "freestanding" is limited to what cppreference says, but it seems like <optional> isn't one of the headers that is guaranteed to be present in a freestanding implementation. The cppreference page does say this, though:

GCC 13 provides more headers, such as <optional>, <span>, <array>, and <bitset>, for freestanding, although these headers may not be portable or provide the same capabilities as a hosted implementation. It is better to avoid using them in a freestanding environment, even if the toolchain provides them.

In addition, I see that <exception> is one of the headers that's guaranteed to be supported in a freestanding implementation, so that makes me think that a freestanding implementation is expected to implement exceptions.

@tahonermann
Copy link
Author

Instead of trying to support all of something like optional, would it be sufficient to just support the freestanding parts? I do t think any of those require exceptions.

Perhaps. It looks like the only parts of std::optional that are excluded for freestanding are the value() overloads ([optional.optional.general]). The code that motivated filing this issue was calling value(). My proposed workaround to the author of that code to use value_or() instead was met with complaints of the workaround being ugly and that CUDA doesn't require such workarounds. The reason (nvcc) CUDA doesn't require such a workaround is because NVIDIA provides a custom implementation of std::optional for use in device code.

@gmlueck, freestanding requires full C++ language support ([intro.compliance.general]p7) and support for the <exception> and <optional> headers ([compliance]p2).

@gmlueck
Copy link
Contributor

gmlueck commented Mar 25, 2024

Does the CUDA implementation actually throw an exception if the requested value does not exist? If so, can you catch and handle the exception?

@TApplencourt
Copy link
Contributor

Sound unlikely (at least in pure cuda):

https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#exception-handling

Exception handling is only supported in host code, but not in device code.
Exception specification is not supported for __global__ functions.

@tahonermann
Copy link
Author

Does the CUDA implementation actually throw an exception if the requested value does not exist? If so, can you catch and handle the exception?

The value() members are conditionalized to throw an exception if exceptions are enabled and to call an abort() function otherwise. The nvcc compiler rejects use of throw expressions in device code. https://godbolt.org/z/jdYTndWbo.

@illuhad
Copy link
Contributor

illuhad commented Mar 28, 2024

NVC++ works with throw and optional
https://godbolt.org/z/n91EYjTsK
https://godbolt.org/z/vdefxses5

@TApplencourt
Copy link
Contributor

Can even take the "std::optional" as a parameter (https://godbolt.org/z/no94G1sK6)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants