: remove controller health monitoring #2058
Open
+271
−326
Summary:
the entire controller health monitoring system is now dead code. in `monarch_hyperactor/src/proc.rs`, the supervision check loop was already stubbed out to just log and do nothing. the struct fields `controller_id`, `last_controller_status_check`, `controller_error_sender`, and `controller_error_receiver` were initialized but never meaningfully used. the `set_controller()` method was only called from the deprecated `ClientActor::attach()` code path. the `ControllerError` enum was never actually raised since the only code path that could raise it was inside the stubbed-out supervision check.

in `monarch_extension/src/client.rs`, the error handling for `ControllerError` was unreachable code since that error was never raised. the `set_controller()` call in `ClientActor::attach()` was part of this deprecated flow.

this diff deletes all of this supervision code. the `ensure_detached_and_alive()` method is simplified to only check signals. all controller-related struct fields are removed along with the `ControllerError` enum definition.

Differential Revision: D88441603
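for reviewers who want a picture of the end state, here is a minimal sketch of what a signals-only `ensure_detached_and_alive()` could look like. it is not the actual diff: the `Client` struct, the `signal_pending` flag, the `check_signals()` helper, and the `CheckError` type are all hypothetical stand-ins, and only the method name comes from the summary above.

```rust
// Hypothetical sketch only: every name here except
// `ensure_detached_and_alive` is an illustrative assumption.

#[derive(Debug, PartialEq)]
enum CheckError {
    // Stand-in for "a signal (e.g. an interrupt) is pending".
    Interrupted,
}

struct Client {
    // Stand-in for whatever the real signal check consults.
    // Note there are no controller-related fields left.
    signal_pending: bool,
}

impl Client {
    // Hypothetical helper: report a pending signal as an error.
    fn check_signals(&self) -> Result<(), CheckError> {
        if self.signal_pending {
            Err(CheckError::Interrupted)
        } else {
            Ok(())
        }
    }

    // With the supervision code gone, the method reduces to the
    // signal check; no controller status is polled or reported.
    fn ensure_detached_and_alive(&self) -> Result<(), CheckError> {
        self.check_signals()
    }
}
```

in this sketch the method has a single failure mode (a pending signal), which mirrors the summary's point that the `ControllerError` path could never fire.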