Description
Hi,
In JGroups-RAFT, many of the synchronous methods internally call their async counterparts in a blocking way. This is fine in general — for example, if the leader or quorum isn’t available, a TimeoutException can be thrown, and it’s appropriate for the application to handle that.
Problem
One place where this becomes problematic is in RaftSyncCounter.set(). This method internally calls asyncSet(), but blocks using CompletableFutures.join(), which wraps any underlying exceptions into a RuntimeException without exposing the root cause directly.
This differs from other methods in JGroups-RAFT that are allowed to declare throws Exception and thus let callers handle expected distributed failures (like leader unavailability).
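To illustrate the wrapping, here is a minimal sketch using the plain JDK `CompletableFuture` rather than the JGroups `CompletableFutures` helper (I'm assuming the helper behaves similarly for this purpose). `failingAsyncSet()` is a hypothetical stand-in for an async counter update that fails because no leader is available; it is not JGroups-RAFT code.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeoutException;

class JoinWrappingDemo {

    // Hypothetical stand-in for an async update that fails with a checked exception.
    static CompletableFuture<Void> failingAsyncSet() {
        CompletableFuture<Void> f = new CompletableFuture<>();
        f.completeExceptionally(new TimeoutException("no leader available"));
        return f;
    }

    // Blocks with join(), as a sync wrapper might, and returns what the caller would see.
    static RuntimeException blockingSet() {
        try {
            failingAsyncSet().join();
            return null; // not reached in this sketch
        } catch (RuntimeException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        RuntimeException seen = blockingSet();
        // The caller gets an unchecked wrapper; the TimeoutException is only reachable via getCause().
        System.out.println("caller sees: " + seen.getClass().getName());
        System.out.println("root cause:  " + seen.getCause().getClass().getName());
    }
}
```

With the JDK future, the caller sees an unchecked `CompletionException`, and the original `TimeoutException` is only available through `getCause()`.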
The issue is that RaftSyncCounter implements the SyncCounter interface, and SyncCounter.set() does not declare any checked exceptions. As a result:
- RaftSyncCounter.set() cannot throw a TimeoutException or similar checked exceptions
- All failures get wrapped in a RuntimeException, making transient failures look catastrophic
- Applications can’t easily distinguish between a temporary leader election and a genuine bug
This is especially concerning because leader elections or temporary unavailability are normal in RAFT, and shouldn’t cause the app to crash or misinterpret the failure.
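As things stand, a caller who wants to tell the two cases apart has to inspect the cause chain by hand. A rough sketch of such a workaround follows. `FailureClassifier` and `isTransient` are hypothetical names, and treating `TimeoutException` as the only transient marker is an assumption made for illustration; a real application might need to check other exception types as well.

```java
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeoutException;

class FailureClassifier {

    // Walks the cause chain and reports whether a TimeoutException is buried in it.
    // Assumption: a TimeoutException signals a transient condition such as a leader election.
    static boolean isTransient(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof TimeoutException) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        RuntimeException wrappedTimeout = new CompletionException(new TimeoutException("no leader"));
        RuntimeException wrappedBug = new CompletionException(new IllegalStateException("bug"));

        System.out.println("timeout transient? " + isTransient(wrappedTimeout));
        System.out.println("bug transient?     " + isTransient(wrappedBug));
    }
}
```

This works, but every caller has to know that the unchecked wrapper may hide an expected distributed failure, which is the burden this issue is about.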
I’m not sure whether join() was chosen here on the assumption that updating a counter should always succeed while the cluster is up (in the JGroups context). In the RAFT context, however, temporary unavailability is expected and shouldn’t be treated as fatal.
Solution
Possible solutions could include:
- Stop implementing SyncCounter in RaftSyncCounter, allowing its set() to declare "throws Exception" like other methods in JGroups-RAFT.
- Modify SyncCounter to declare checked exceptions on set() (but this is outside the scope of JGroups-RAFT).
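To make the first option more concrete, here is a rough sketch of the shape it could take. All names (`CheckedSyncCounter`, `BlockingRaftCounter`, the `asyncSet` function, the timeout parameter) are hypothetical and are not the actual JGroups-RAFT API. The idea is only that the blocking method waits with `get(timeout, unit)` and rethrows the original cause instead of an unchecked wrapper.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

// Hypothetical interface: like a sync counter, but set() may throw checked exceptions.
interface CheckedSyncCounter {
    void set(long value) throws Exception;
}

// Hypothetical blocking wrapper around an async set operation.
class BlockingRaftCounter implements CheckedSyncCounter {
    private final Function<Long, CompletableFuture<Void>> asyncSet;
    private final long timeoutMs;

    BlockingRaftCounter(Function<Long, CompletableFuture<Void>> asyncSet, long timeoutMs) {
        this.asyncSet = asyncSet;
        this.timeoutMs = timeoutMs;
    }

    @Override
    public void set(long value) throws Exception {
        try {
            // get() throws TimeoutException itself if the future does not complete in time.
            asyncSet.apply(value).get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (ExecutionException e) {
            // Surface the original failure instead of the wrapper where possible.
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        }
    }
}

class CheckedCounterDemo {
    public static void main(String[] args) {
        // Simulate "no leader": the async operation never completes.
        CheckedSyncCounter counter =
            new BlockingRaftCounter(v -> new CompletableFuture<Void>(), 50);
        try {
            counter.set(42L);
        } catch (TimeoutException e) {
            System.out.println("transient failure, can retry");
        } catch (Exception e) {
            System.out.println("unexpected failure: " + e);
        }
    }
}
```

With this shape, callers can catch `TimeoutException` directly and decide whether to retry, without digging through a wrapper's cause chain.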
Thanks for considering this issue. I’d be interested in your thoughts on whether this behavior was intentional, and what direction you’d prefer for a fix. I’m happy to implement the change if a solution is agreed upon.