[Bug]: Race condition during DKG #929
Labels: bug (Something isn't working), signer communication (Communication across sBTC bootstrap signers), signer coordination (The actions executed by the signer coordinator)
Milestone
Bug - Race condition during DKG
1. Description
During distributed key generation (DKG), a race condition can leave some signers having finished DKG successfully while others have not. This can happen in network topologies where some signers are much farther from the coordinator than others.
1.1 Context & Purpose
We've been assuming that in scenarios where DKG doesn't finish successfully, the signers would just try again when the next bitcoin block arrived. Unfortunately, there seem to be cases where DKG finishes successfully for some signers but not for everyone.
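As a rough illustration of the failure mode, here is a toy model in Python. It is not the signer or WSTS code. It assumes each signer waits up to a fixed timeout for the last DKG message and that delay depends only on the signer's distance from the coordinator. The names `dkg_outcomes`, `is_split`, `latency_ms`, and `timeout_ms` are made up for this sketch.

```python
def dkg_outcomes(latency_ms, timeout_ms):
    """Return {signer: finished?} for one DKG attempt in a toy model.

    Hypothetical model: a signer finishes only if the final round message
    reaches it before its local timeout expires.
    """
    return {signer: delay <= timeout_ms for signer, delay in latency_ms.items()}


def is_split(outcomes):
    """True when some signers finished DKG and others did not."""
    return set(outcomes.values()) == {True, False}


# Two signers close to the coordinator, one far away (illustrative numbers).
latency_ms = {"signer-a": 40, "signer-b": 55, "signer-c": 900}
outcomes = dkg_outcomes(latency_ms, timeout_ms=500)
```

With these numbers `is_split(outcomes)` is true: the two nearby signers finish and the distant one does not, which is the state that "retry on the next bitcoin block" does not obviously recover from.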
2. Technical Details:
One proposed fix, suggested by @xoloki, was to coordinate DKG in the same way that signing rounds are handled. During signing rounds, such race conditions are not possible because of the signing protocol structure. So, structuring DKG similarly could solve this issue.
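One possible reading of this idea, sketched in Python under loud assumptions: the coordinator collects shares from every expected signer and only then releases a single bundle, and signers finish DKG only from that bundle. `Coordinator`, `on_shares`, `bundle`, and `finalize` are hypothetical names for illustration and do not correspond to the WSTS API or to the draft PRs.

```python
from dataclasses import dataclass, field


@dataclass
class Coordinator:
    """Hypothetical coordinator that gates DKG on hearing from every signer."""

    expected: frozenset
    received: dict = field(default_factory=dict)

    def on_shares(self, signer_id, shares):
        # Record one signer's shares; ignore signers outside the expected set.
        if signer_id in self.expected:
            self.received[signer_id] = shares

    def bundle(self):
        # Release the bundle only once every expected signer has reported.
        if set(self.received) != set(self.expected):
            return None
        return dict(self.received)


def finalize(bundle):
    """Signer-side step: finish DKG only from the coordinator's bundle."""
    if bundle is None:
        return None  # no bundle yet, so no signer finishes this attempt
    # Stand-in for deriving key material; every signer sees identical input.
    return tuple(sorted(bundle.items()))


coordinator = Coordinator(expected=frozenset({"a", "b", "c"}))
coordinator.on_shares("a", "shares-a")
coordinator.on_shares("b", "shares-b")
partial = finalize(coordinator.bundle())  # "c" has not reported yet
coordinator.on_shares("c", "shares-c")
complete = finalize(coordinator.bundle())
```

Here `partial` is `None` and `complete` is the same value for every signer. The sketch only models the gating step; a real design would still need to say what happens if the bundle reaches only some signers.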
2.1 Acceptance Criteria:
3. Related Issues and Pull Requests (optional):
There are two draft PRs for fixing this, one in the signers and another in WSTS. They are:
4. Addendum
My main "concern" between the current WSTS flow and the one proposed in the above draft PRs is the unknowns. I've looked through the WSTS code, but I do not have a full picture of how the pieces fit together, so I don't know what will happen if some assumption isn't upheld.
Also, @xoloki pointed out that some of the new message sizes are quite large and grow quickly. Their size is `O(n * k)`, where `n` is the number of signers and `k` is the number of keys. But `k` is generally linear in `n`, so the size is effectively quadratic in `n` and can be quite large.
After looking through the proposed changes, they seem simple enough. I'd summarize them as "have the coordinator re-distribute everything after receiving it from everyone else". The coordinator was already a single point of failure before, so those changes don't change that assumption. I'm still a little nervous, so at the very least I'd like to see some protocol diagrams for how things are supposed to work going forward. Let's use excalidraw for creating these diagrams. I believe that we should have these anyway for such a critical part of the sBTC protocol.
We briefly spoke about some alternatives to the draft PRs, let's flesh some of them out as well. It doesn't need to be in code, but we should at least have a description of them.
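On the message-size concern above, a back-of-the-envelope sketch in Python may help when comparing alternatives. It assumes every signer contributes one item per key and that each signer holds a fixed number of keys, which is one way to model "k is linear in n". The function name `redistributed_items` and the parameter `keys_per_signer` are hypothetical, and the counts are items, not bytes.

```python
def redistributed_items(num_signers, keys_per_signer):
    """Count items in a coordinator bundle under a toy O(n * k) model.

    Assumption: every signer contributes one item per key, so with
    k = num_signers * keys_per_signer keys the bundle holds n * k items.
    """
    num_keys = num_signers * keys_per_signer
    return num_signers * num_keys


sizes = {n: redistributed_items(n, keys_per_signer=2) for n in (5, 10, 20)}
# sizes == {5: 50, 10: 200, 20: 800}
```

Under this model, doubling the number of signers quadruples the bundle size.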