Benchmarking the performance of JupyterWgpuCanvas
#378
Comments
Thanks for taking the time to do these kinds of benchmarks!
The Lines 178 to 182 in 2ffacd9
No, because we already have

To explain a bit about

If the server just sends every frame as soon as it can, then from the point of view of the server things look fast, but it will put a lot of strain on the connection, which causes lag at the client, can even reduce the fps at the client, and eventually clog the connection. So what jupyter_rfb does is that for each frame that the client receives, the client sends a confirmation to the server. The

That dip with

To benchmark this stuff effectively, I think we'd need three measures:
I've never done a thorough benchmark myself. I've only tried several values on different types of connections. I suppose the FPS limit is the major knob to turn. I expect reasonable values for
I created pygfx/rendercanvas#40. I also included a list of steps with some details / options.
Let's continue the discussion there (even though it may require changes in the canvas context here).
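The confirmation-based throttling described in the comment above can be illustrated with a toy simulation. This is only a sketch of the idea, not jupyter_rfb's implementation: `FrameSender`, `simulate`, and the tick-based timing are hypothetical names and assumptions made for illustration. Only `max_buffered_frames` echoes a name from this thread.

```python
from collections import deque


class FrameSender:
    """Toy model of confirmation-based flow control (not jupyter_rfb's code)."""

    def __init__(self, max_buffered_frames=2):
        self.max_buffered_frames = max_buffered_frames
        self.sent = 0
        self.confirmed = 0

    @property
    def in_flight(self):
        # Frames that were sent but not yet confirmed by the client.
        return self.sent - self.confirmed

    def send_frame(self):
        # Refuse to send while too many frames are still unconfirmed.
        if self.in_flight >= self.max_buffered_frames:
            return False
        self.sent += 1
        return True

    def on_confirmation(self):
        if self.confirmed < self.sent:
            self.confirmed += 1


def simulate(max_buffered_frames, ticks, round_trip_ticks):
    """Count confirmed frames when the server tries to send once per tick."""
    sender = FrameSender(max_buffered_frames)
    pending = deque()  # ticks at which confirmations arrive
    for now in range(ticks):
        while pending and pending[0] <= now:
            pending.popleft()
            sender.on_confirmation()
        if sender.send_frame():
            pending.append(now + round_trip_ticks)
    return sender.confirmed


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, simulate(n, ticks=100, round_trip_ticks=4))
```

In this model the throughput grows with the number of frames allowed in flight until the window covers the round trip, after which a larger buffer no longer helps. Real connections add jitter and bandwidth limits that the sketch ignores.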
I went down a rabbit hole of testing out vispy/jupyter_rfb#76, and figuring out why the `JupyterWgpuCanvas` could never seem to exceed 30 fps. `simplejpeg` is very fast: encoding is in the range of a few milliseconds, or less than 1 ms for encoding `astronaut.png`. Increasing `widget.max_buffered_frames` worked for basic `RemoteFrameBuffer`, but made no difference with the `WgpuCanvas`. So I found this: `wgpu-py/wgpu/gui/jupyter.py`, lines 86 to 89 in 2ffacd9
Dividing the `draw_wait_time` on L89 by 2 seems to increase performance! The framerate is double with small canvases around 512x512, and significantly higher for larger canvases.

I added a `JupyterWgpuCanvas.delay_divisor` which divides the `draw_wait_time` in the `call_later` call to test things.

Ran these on a Radeon RX 570 (old GPU), will test on a more modern GPU later today.
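As a rough illustration of the idea (not the actual change made to wgpu-py), a divisor applied to the wait time before scheduling the next draw could look like the sketch below. `ThrottledCanvas`, its `max_fps` argument, and the body of `_get_draw_wait_time()` are assumptions for illustration. Only the names `delay_divisor`, `_get_draw_wait_time`, and `call_later` come from this thread, and `call_later` here is asyncio's `loop.call_later`.

```python
import asyncio


class ThrottledCanvas:
    """Hypothetical stand-in for a canvas that schedules its own draws."""

    def __init__(self, max_fps=30.0, delay_divisor=1.0):
        self._max_fps = float(max_fps)
        self.delay_divisor = delay_divisor  # goes through the setter below
        self.draw_count = 0

    @property
    def delay_divisor(self):
        return self._delay_divisor

    @delay_divisor.setter
    def delay_divisor(self, value):
        value = float(value)
        if value <= 0:
            raise ValueError("delay_divisor must be positive")
        self._delay_divisor = value

    def _get_draw_wait_time(self):
        # Assumption: one frame period of the fps cap. The real method may
        # also account for the time elapsed since the last draw.
        return 1.0 / self._max_fps

    def _draw(self):
        self.draw_count += 1

    def request_draw(self, loop):
        # Shorten the scheduled wait by the divisor before handing it to
        # the event loop.
        wait = self._get_draw_wait_time() / self._delay_divisor
        loop.call_later(wait, self._draw)
        return wait
```

With `max_fps=30` and `delay_divisor=2`, the scheduled wait in this sketch drops from about 33 ms to about 17 ms per draw request.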
This gives me 54fps:
With `delay_divisor = 2`, the lag is barely perceptible during interaction; if it's increased to 4 the lag becomes very noticeable and the framerate barely increases. The `pygfx` controller also has damping, which probably helps to reduce the effects of lag (if present) with `delay_divisor = 2`.

`delay_divisor = 1`: d1.mp4
`delay_divisor = 2`: d2.mp4
`delay_divisor = 4`, lag is very obvious: d4.mp4
I benchmarked this with a larger canvas at 1700x900 and got this, which seems to suggest that dividing the delay by 2 and buffering 10-20 frames gives the best performance. If there's a way to measure input lag that would be nice to factor in as well!
Benchmarking code:
Side note: the drop in fps with ~7 buffered frames is really odd.
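For anyone wanting to run a similar sweep, here is a minimal sketch of one way to measure frame rate. It is not the benchmarking code used for the numbers above. `FpsMeter` and its `window` and `clock` parameters are hypothetical names, and it assumes `tick()` is called once per frame that actually arrives.

```python
import time
from collections import deque


class FpsMeter:
    """Rolling frames-per-second estimate over the last `window` frames."""

    def __init__(self, window=120, clock=time.perf_counter):
        self._stamps = deque(maxlen=window)
        self._clock = clock

    def tick(self):
        # Record the arrival time of one frame.
        self._stamps.append(self._clock())

    def fps(self):
        # N timestamps span N - 1 frame intervals.
        if len(self._stamps) < 2:
            return 0.0
        elapsed = self._stamps[-1] - self._stamps[0]
        if elapsed <= 0:
            return 0.0
        return (len(self._stamps) - 1) / elapsed
```

Calling `tick()` where frames are received rather than where they are rendered matters here, since the server can look fast while the client falls behind.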
Questions before doing a PR:
- Is `JupyterWgpuCanvas._get_draw_wait_time()` from a round-trip measure, which is why dividing it by 2 increases the performance?
- Would it be OK to add `delay_divisor` as a `@property` to `JupyterWgpuCanvas`? Or is there some other caveat? The only drawback I found is input lag if the `delay_divisor` is large, but the benchmarks seem to show that increasing it beyond 2 doesn't increase performance anyway.

Now this makes me wonder what the real bottleneck is when the canvas is very large, like near 4k. `simplejpeg` starts to slow down at these resolutions, so I might look into nvjpeg.