Faster reconnect on handshake required response (#4772)
When the orchestrator restarts, compute nodes wait for 5 failed heartbeats
(~75s) before attempting to reconnect, even though the orchestrator
immediately returns "Handshake required" errors.
Modify compute nodes to detect this specific error and trigger immediate
reconnection, rather than waiting for the heartbeat failure threshold.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **New Features**
  - Enhanced error handling for heartbeat operations, specifically addressing handshake requirements.
  - New boolean field `HandshakeRequired` added to track handshake necessity.
- **Bug Fixes**
  - Improved robustness of connection health monitoring by incorporating handshake checks.
- **Tests**
  - Added tests for new handshake handling scenarios in both `ControlPlaneTestSuite` and `ConnectionManagerTestSuite`.
  - Enhanced coverage for `HealthTracker` functionality regarding handshake states.
- **Documentation**
  - Updated comments in connection health checks for clarity on new criteria.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->