
@shayne-fletcher

Summary:
the entire controller health monitoring system is now dead code. in monarch_hyperactor/src/proc.rs, the supervision check loop was already stubbed out to just log and do nothing. the struct fields `controller_id`, `last_controller_status_check`, `controller_error_sender`, and `controller_error_receiver` were initialized but never meaningfully used. the `set_controller()` method was only called from the deprecated `ClientActor::attach()` code path. the `ControllerError` enum was never actually raised since the only code path that could raise it was inside the stubbed-out supervision check.

in monarch_extension/src/client.rs, the error handling for `ControllerError` was unreachable code since that error was never raised. the `set_controller()` call in `ClientActor::attach()` was part of this deprecated flow.

this diff deletes all of this supervision code. the `ensure_detached_and_alive()` method is simplified to only check signals. all controller-related struct fields are removed along with the `ControllerError` enum definition.
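
to illustrate why the removal should not change behavior, here is a minimal sketch of the pattern, not the actual monarch source. the types `ClientBefore`, `ClientAfter`, `BeforeError`, and `Interrupted`, and the `signal_pending` flag, are hypothetical stand-ins. only the field and method names `controller_error_sender`, `controller_error_receiver`, and `ensure_detached_and_alive` come from the description above, and their real types and signatures are assumed here.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

// Hypothetical error type for the "before" shape. The Controller variant
// stands in for the removed `ControllerError`.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum BeforeError {
    Interrupted,
    Controller(String),
}

// Hypothetical "before" shape: the channel is created, but nothing ever
// sends on it because the supervision loop is a stub.
struct ClientBefore {
    #[allow(dead_code)]
    controller_error_sender: Sender<String>,
    controller_error_receiver: Receiver<String>,
}

impl ClientBefore {
    fn new() -> Self {
        let (tx, rx) = channel();
        ClientBefore {
            controller_error_sender: tx,
            controller_error_receiver: rx,
        }
    }

    // `signal_pending` is an assumed stand-in for the real signal check.
    fn ensure_detached_and_alive(&self, signal_pending: bool) -> Result<(), BeforeError> {
        if signal_pending {
            return Err(BeforeError::Interrupted);
        }
        // No sender ever fires, so try_recv() never yields Ok here:
        // this branch is the unreachable controller-error handling.
        if let Ok(msg) = self.controller_error_receiver.try_recv() {
            return Err(BeforeError::Controller(msg));
        }
        Ok(())
    }
}

// Hypothetical "after" shape: no controller fields, signals only.
#[derive(Debug, PartialEq)]
struct Interrupted;

struct ClientAfter;

impl ClientAfter {
    fn ensure_detached_and_alive(&self, signal_pending: bool) -> Result<(), Interrupted> {
        if signal_pending {
            Err(Interrupted)
        } else {
            Ok(())
        }
    }
}
```

under these assumptions both shapes return `Ok(())` when no signal is pending and an interruption error otherwise, so dropping the channel and the controller error variant leaves the observable result unchanged.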

Differential Revision: D88441603

Summary:

decouple monarch_hyperactor from hyperactor_multiprocess.

move the hyperactor_multiprocess-dependent proc `bootstrap` and `world status` helpers out of monarch_hyperactor into monarch_extension.

introduce a new `monarch_extension::proc` module (exported as `monarch._rust_bindings.proc`) that provides `init_proc` and `world_status` using hyperactor_multiprocess. update rust_backend_mesh.py to import those functions from the new module.

as a result monarch_hyperactor drops its hyperactor_multiprocess dependency, pickled client actors expose `instance_arc()` for the new `world_status` helper, and legacy controller supervision polling is removed from `monarch_hyperactor::proc` since it depended on the system actor and is only used by the old rust_backend_mesh path.

Reviewed By: vidhyav, dulinriley

Differential Revision: D88417111
@meta-cla (bot) added the CLA Signed label on Dec 5, 2025
meta-codesync bot commented Dec 5, 2025

@shayne-fletcher has exported this pull request. If you are a Meta employee, you can view the originating Diff in D88441603.


Labels

CLA Signed, fb-exported, meta-exported
