Supervisor: Retry on "Address already in use", and/or implement JEP 66 #5799
Labels:
area:kallichore (Issues related to the new kernel supervisor)
area: kernels (Issues related to Jupyter kernels and LSP servers)
There is a race condition inherent in the Jupyter protocol; because there is a gap between the time ZeroMQ socket ports are selected by the client and the time they are bound by the server, another process can bind to the ports before the server does. The result is a ZeroMQ error that looks like this when the server tries to bind to the port:
Here's a screenshot of this error happening in Positron:
This race condition should be rare, and we have already taken steps to mitigate it -- for example, the kernel supervisor already keeps track of ports that are "reserved" by kernels that have not started yet. However, there is still a small chance that any kernel startup will result in this error, and the chance is higher during automated tests since a lot of startups happen quickly.
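For illustration, the gap between port selection and binding can be reproduced with plain TCP sockets. This is a minimal sketch using only Python's standard library rather than ZeroMQ or the supervisor's actual code. The names `pick_free_port` and `demonstrate_race` are hypothetical, and the "intruder" socket stands in for an unrelated process that grabs the port first.

```python
import errno
import socket


def pick_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a free port, then release it (the client-side selection step)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind((host, 0))  # port 0 lets the OS choose an unused port
        return probe.getsockname()[1]
    # The port is free again once the probe socket is closed.


def demonstrate_race(host: str = "127.0.0.1") -> int:
    """Return the errno raised when the 'server' binds after an intruder took the port."""
    port = pick_free_port(host)

    # Stand-in for another process that binds the port inside the gap.
    intruder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    intruder.bind((host, port))
    intruder.listen(1)

    # Stand-in for the kernel, which binds only after the ports were chosen.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        server.bind((host, port))
    except OSError as err:
        return err.errno
    finally:
        server.close()
        intruder.close()
    return 0


if __name__ == "__main__":
    code = demonstrate_race()
    print("bind failed with EADDRINUSE:", code == errno.EADDRINUSE)
```

Tracking "reserved" ports inside the supervisor cannot close this window, because the competing bind can come from any process on the machine.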
To address this, we could retry when the bind fails with "Address already in use", and/or implement JEP 66.
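The retry option from the issue title could be sketched roughly as follows. This is an assumption-laden illustration, not the supervisor's implementation: it uses standard-library TCP sockets instead of ZeroMQ, and `bind_with_retry`, `pick_free_port`, and `max_attempts` are hypothetical names. In the real system the bind happens in the kernel process, so a retry would presumably mean selecting new ports and starting the kernel again. The sketch collapses both sides into one process to show only the control flow.

```python
import errno
import socket
from typing import Callable, Optional, Tuple


def pick_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused port and release it immediately."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind((host, 0))
        return probe.getsockname()[1]


def bind_with_retry(
    pick_port: Callable[[], int] = pick_free_port,
    max_attempts: int = 5,
    host: str = "127.0.0.1",
) -> Tuple[socket.socket, int]:
    """Bind to a selected port; on EADDRINUSE, select a fresh port and try again."""
    last_error: Optional[OSError] = None
    for _ in range(max_attempts):
        port = pick_port()
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except OSError as err:
            sock.close()
            if err.errno != errno.EADDRINUSE:
                raise  # only the "address already in use" case is retried
            last_error = err
            continue
        return sock, port
    raise RuntimeError(f"no port could be bound after {max_attempts} attempts") from last_error


if __name__ == "__main__":
    sock, port = bind_with_retry()
    print("bound to port", port)
    sock.close()
```

Retrying only on `EADDRINUSE` keeps other bind failures, such as permission errors, visible instead of masking them behind repeated attempts.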