nvidia app closing russian roulette #3672
Comments
What we know so far:
Coredump stack trace sample (signal handling / cleanup code disabled). Notable threads are: 1, 29, and 14.
@RAOF Can you check out the stack trace and see if there are any suspicious threads/codepaths?
Combing through more stack traces, I accidentally stumbled upon the destructor of Now, leaking resources is obviously not the solution. So I'll try looking around for a bit to see if I can get things to work on my own.
The stream is created using
Maybe delegate Edit: Didn't work. Still haven't checked out whether it breaks in the same places or not.
patch
diff --git a/src/platforms/eglstream-kms/server/buffer_allocator.cpp b/src/platforms/eglstream-kms/server/buffer_allocator.cpp
index 919dfe4ae5..7188dfbc5f 100644
--- a/src/platforms/eglstream-kms/server/buffer_allocator.cpp
+++ b/src/platforms/eglstream-kms/server/buffer_allocator.cpp
@@ -106,11 +106,12 @@ GLuint gen_texture_handle()
struct EGLStreamTextureConsumer
{
- EGLStreamTextureConsumer(std::shared_ptr<mir::renderer::gl::Context> ctx, EGLStreamKHR&& stream)
+ EGLStreamTextureConsumer(std::shared_ptr<mir::renderer::gl::Context> ctx, EGLStreamKHR&& stream, std::shared_ptr<mgc::EGLContextExecutor> egl_delegate)
: dpy{eglGetCurrentDisplay()},
stream{std::move(stream)},
texture{gen_texture_handle()},
- ctx{std::move(ctx)}
+ ctx{std::move(ctx)},
+ egl_delegate{egl_delegate}
{
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_EXTERNAL_OES, texture);
@@ -129,23 +130,19 @@ struct EGLStreamTextureConsumer
~EGLStreamTextureConsumer()
{
- bool const need_context = eglGetCurrentContext() == EGL_NO_CONTEXT;
- if (need_context)
- {
- ctx->make_current();
- }
- /* eglDestroyStreamKHR(dpy, stream); */
- glDeleteTextures(1, &texture);
- if (need_context)
- {
- ctx->release_current();
- }
+ egl_delegate->spawn(
+ [dpy = this->dpy, stream = std::move(this->stream), texture = texture]
+ {
+ eglDestroyStreamKHR(dpy, stream);
+ glDeleteTextures(1, &texture);
+ });
}
EGLDisplay const dpy;
EGLStreamKHR const stream;
GLuint const texture;
std::shared_ptr<mir::renderer::gl::Context> const ctx;
+ std::shared_ptr<mgc::EGLContextExecutor> egl_delegate;
};
struct Sync
@@ -174,7 +171,7 @@ struct Sync
struct BoundEGLStream
{
- static void associate_stream(wl_resource* buffer, std::shared_ptr<mir::renderer::gl::Context> ctx, EGLStreamKHR stream)
+ static void associate_stream(wl_resource* buffer, std::shared_ptr<mir::renderer::gl::Context> ctx, EGLStreamKHR stream, std::shared_ptr<mgc::EGLContextExecutor> egl_delegate)
{
BoundEGLStream* me;
if (auto notifier = wl_resource_get_destroy_listener(buffer, &on_buffer_destroyed))
@@ -192,7 +189,7 @@ struct BoundEGLStream
wl_resource_add_destroy_listener(buffer, &me->destruction_listener);
}
- me->producer = std::make_shared<EGLStreamTextureConsumer>(std::move(ctx), std::move(stream));
+ me->producer = std::make_shared<EGLStreamTextureConsumer>(std::move(ctx), std::move(stream), egl_delegate);
}
class TextureHandle
@@ -308,7 +305,7 @@ try
BOOST_THROW_EXCEPTION((mg::egl_error("Failed to create EGLStream from Wayland buffer")));
}
- BoundEGLStream::associate_stream(buffer, allocator->wayland_ctx, stream);
+ BoundEGLStream::associate_stream(buffer, allocator->wayland_ctx, stream, allocator->egl_delegate);
allocator->wayland_ctx->release_current();
}
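The idea behind the patch above is to stop touching EGL/GL state from whichever thread happens to run the destructor and instead hand the cleanup to a thread that owns a context. A minimal, self-contained sketch of that pattern follows. It is not Mir's code: `ContextExecutor`, `StreamResource`, and `CleanupRecord` are hypothetical stand-ins, only the `spawn()` name mirrors the call used in the patch, and a log entry takes the place of the real `eglDestroyStreamKHR` / `glDeleteTextures` calls.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for an executor that owns a GL/EGL context.
// It runs queued work on one dedicated thread.
class ContextExecutor
{
public:
    ContextExecutor() : worker{[this] { run(); }} {}

    ~ContextExecutor()
    {
        {
            std::lock_guard<std::mutex> lock{mutex};
            stopping = true;
        }
        cv.notify_one();
        worker.join();  // the worker drains the queue before it exits
    }

    void spawn(std::function<void()> work)
    {
        assert(work && "spawn() needs a callable");
        {
            std::lock_guard<std::mutex> lock{mutex};
            queue.push_back(std::move(work));
        }
        cv.notify_one();
    }

    std::thread::id thread_id() const { return worker.get_id(); }

private:
    void run()
    {
        // A real executor would make its context current on this thread here.
        std::unique_lock<std::mutex> lock{mutex};
        for (;;)
        {
            cv.wait(lock, [this] { return stopping || !queue.empty(); });
            if (queue.empty())
                return;  // only reachable once stopping is set
            auto work = std::move(queue.front());
            queue.pop_front();
            lock.unlock();
            work();  // run without holding the lock
            lock.lock();
        }
    }

    std::mutex mutex;
    std::condition_variable cv;
    std::deque<std::function<void()>> queue;
    bool stopping{false};
    std::thread worker;  // declared last so the members above exist first
};

// Records which handle was cleaned up, and on which thread.
struct CleanupRecord
{
    unsigned handle;
    std::thread::id thread;
};

// Hypothetical resource wrapper: the destructor copies the handles it needs
// and defers the actual cleanup to the executor thread.
class StreamResource
{
public:
    StreamResource(
        unsigned handle,
        std::shared_ptr<ContextExecutor> executor,
        std::shared_ptr<std::vector<CleanupRecord>> log)
        : handle{handle}, executor{std::move(executor)}, log{std::move(log)}
    {
    }

    ~StreamResource()
    {
        // Capture by value: `this` is gone by the time the work runs.
        executor->spawn([handle = handle, log = log]
            {
                // Real code would destroy the stream and texture here.
                log->push_back({handle, std::this_thread::get_id()});
            });
    }

private:
    unsigned const handle;
    std::shared_ptr<ContextExecutor> const executor;
    std::shared_ptr<std::vector<CleanupRecord>> const log;
};
```

Note that the deferred lambda deliberately does not keep the executor alive. If the last reference to the executor were dropped on its own worker thread, the destructor would try to join itself.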
The workaround of leaking resources "works", but also crashes/freezes (refuses to open any new windows) once VRAM is full.
Barring any misuse of the eglstream API, we're suspecting that this might be a driver bug.
This seems to happen with multiple applications, but the most common crashes occur with firefox (unsnapped) and kgx.
With an external monitor hooked up to my nvidia GPU, there's a random chance that Mir will freeze (not crash) upon closing a window of one of the aforementioned applications (among others, but those two are the ones I open and close windows with the most). @AlanGriffiths confirmed that this doesn't happen on his machine, which doesn't use nvidia graphics.
I tried both inspecting a coredump and running Mir under a debugger, with and without the signal handling / cleanup code disabled. It appears that no signal is raised when the bug is triggered: gdb doesn't stop on its own, so I have to interrupt it manually with ctrl + c. The stack trace doesn't contain any immediately suspicious functions. Logs also don't show any errors.
How I disabled signal handling / cleanup code
This is just an educated guess, so it might be wrong.