Investigate more optimal way to implement CoreGraphics backend #83
http://russbishop.net/cross-process-rendering describes how it is possible to create an
|
Or possibly we could just use Edit: See https://developer.apple.com/documentation/coregraphics/cgdataproviderreleasedatacallback:
|
So comparing these:
For performance concerns, benchmarking is best. But we'd need a representative benchmark, an implementation of both, and multiple types of hardware. |
Oh, I forgot about buffer stride. Testing this (#95), it looks like we can't just set the stride to always match the width, so to use This would probably also be needed for #42. Or if we wanted to use dmabufs instead of shm on wayland, etc. |
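To illustrate the stride point above: when rows are padded, the pixel at (x, y) sits at index `y * stride + x`, not `y * width + x`. The following is a minimal sketch in plain Rust, not softbuffer's actual API. The function names and the 16-byte alignment used in the usage note are hypothetical, chosen only for illustration.

```rust
/// Fill the visible `width` x `height` region of a row-padded pixel buffer.
/// `stride` is the number of u32 pixels per row in memory (hypothetical
/// parameter; it must be at least `width`). Padding pixels are left untouched.
fn fill_with_stride(buf: &mut [u32], width: usize, height: usize, stride: usize, color: u32) {
    assert!(stride > 0 && stride >= width, "stride must be non-zero and at least the width");
    assert!(buf.len() >= stride * height, "buffer too small for stride * height");
    // Walk the buffer one padded row at a time and write only the visible part.
    for row in buf.chunks_mut(stride).take(height) {
        row[..width].fill(color);
    }
}

/// Round a row length in bytes up to a multiple of `align`.
/// `align` must be a power of two; the real value is platform-dependent.
fn aligned_bytes_per_row(width: usize, bytes_per_pixel: usize, align: usize) -> usize {
    assert!(align.is_power_of_two(), "alignment must be a power of two");
    (width * bytes_per_pixel + align - 1) & !(align - 1)
}
```

For example, with an assumed 16-byte alignment, a 5-pixel-wide row of 4-byte pixels needs 32 bytes per row, i.e. a stride of 8 pixels, so the caller has to be told the stride rather than infer it from the width.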
Could be me; I have a Mac right here with an AMD dGPU, as long as IOSurface exists on macOS 10.14. |
https://developer.apple.com/documentation/iosurface says it was introduced in macOS 10.6 (sorry PowerMac G5 users), so that much shouldn't be an issue. |
Great, I could proceed forward with:
And as a bonus, implementing it all the way back on macOS 10.14 would ensure that I should be free to do any of those in around an hour :) |
Reading up it looks like you're talking about having to expose a stride, let me introduce: |
#95 has an implementation using IOSurface. Which requires an API change to expose stride. And it updates the I wonder if there's a good way to automate benchmarking of softbuffer performance. |
I'll check them out. I don't have an M1 to test with, but if you do, that should cover everything. My benchmark method typically tends to be instrumentation using Once I have some thoughts I'll leave them on the relevant PR, or here if they affect both or are in general. |
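A minimal sketch of what instrumentation-style timing can look like, assuming `std::time::Instant` as the timer (the thread does not confirm which tool was actually used). The helper names `time_it` and `average` are hypothetical.

```rust
use std::time::{Duration, Instant};

/// Run `f` once and return its result together with the elapsed wall-clock time.
fn time_it<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}

/// Average a set of timing samples; returns `None` when there are no samples.
fn average(samples: &[Duration]) -> Option<Duration> {
    if samples.is_empty() {
        return None;
    }
    let total: Duration = samples.iter().sum();
    Some(total / samples.len() as u32)
}
```

Wrapping each stage of a frame (buffer fill, present) in `time_it` and averaging over many frames gives per-stage numbers that can be compared across branches. Single samples are noisy, so an average over many frames is more telling than any one reading.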
Alright, so based on my testing, for total render times:
I think the 16ms might be a fluke here, it makes you think it might be vsync but it's consistently lower than 16ms for small windows and consistently higher than 16ms for larger-than-screen windows. In fullscreen, it doesn't seem to ever take longer than 18ms or so, but this is still beat by Also, Here are some more detailed breakdowns per-branch:
Now Wait Just A Minute, there's something fishy here. Let's see:
This makes me wonder if IOSurface is somehow magical! The memory backing it seems to somehow be more expensive than normal memory, perhaps it's some sort of MMIO or something. Anyway, this prompted me to do some more testing. My method of filling buffers quickly is to use
Much better? As far as I can tell, |
Ideally these are also tested on Apple silicon, to see how they behave with the unified GPU memory. |
Of course, I was assuming that @ids1024 (or someone else) would get back to me with comparisons on ASi to see if iosurface-wip really is the best choice for both, but it seems like that hasn't happened yet. |
what's the easiest way to repro your test? |
instrument the code with some |
On the |
Apparently macOS (and iOS, #43) has a framework called IOSurface for exchanging framebuffers and textures between processes, which sounds similar to the idea behind dmabufs on Linux. I think we should use IOSurfaces for a front and back buffer, and use IOSurfaceGetBaseAddress to get a pointer to write into for no-copy presentation (#65)? Assuming it can work with the right pixel format.
Or are there issues with this, or a better way?