DeepFlow-research/manager_agent_gym

Manager Agent Gym

A research platform for developing and evaluating autonomous Manager Agents that orchestrate complex workflows involving both human and AI collaborators

License: MIT | Python 3.11+

📚 Online Docs: deepflow-research.github.io/manager_agent_gym

🎯 Overview

This repository contains the research codebase and reference implementation for autonomous Manager Agents that orchestrate complex workflows with human and AI collaborators, as described in our paper "Orchestrating Human-AI Teams: The Manager Agent as a Unifying Research Challenge", published at DAI 2025. For complete documentation, see the online docs linked above.

🏁 Run the Benchmark

The quickest way to run the benchmark suite across workflow scenarios is the CLI.

# Activate uv virtualenv (create it first if needed: `uv venv`)
source .venv/bin/activate

# From repo root, launch the interactive runner
python -m examples.cli

# Tip: non-interactive example
# python -m examples.cli --non-interactive --manager-mode cot --model-name o3 --max-timesteps 50

Outputs are written under directories like simulation_outputs_cot_rerun/, simulation_outputs_random_rerun/, etc., grouped by model.

The CLI entrypoint lives at examples/cli.py.
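
For scripted runs, the non-interactive invocation shown above can be assembled programmatically. The following is a minimal sketch, not part of the repository: `build_cli_command` is a hypothetical helper, and it uses only the flags that appear in the tip above (`--non-interactive`, `--manager-mode`, `--model-name`, `--max-timesteps`). It builds the argument list without launching anything.

```python
import shlex
import sys


def build_cli_command(manager_mode: str, model_name: str, max_timesteps: int) -> list[str]:
    """Build the argv for one non-interactive run of examples.cli.

    Hypothetical helper: only the flags shown in the README tip are used.
    """
    if max_timesteps <= 0:
        raise ValueError("max_timesteps must be positive")
    return [
        sys.executable, "-m", "examples.cli",
        "--non-interactive",
        "--manager-mode", manager_mode,
        "--model-name", model_name,
        "--max-timesteps", str(max_timesteps),
    ]


if __name__ == "__main__":
    cmd = build_cli_command("cot", "o3", 50)
    # Print the shell-quoted command for inspection.
    print(shlex.join(cmd))
    # To launch it from the repo root (with the virtualenv active):
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing the command as a list to `subprocess.run` avoids shell quoting issues if a model name contains unusual characters.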

🧩 Key Concepts

  • worker: A workflow-executing agent that performs tasks and produces resources. In code these implement AgentInterface (see manager_agent_gym/core/workflow_agents/interface.py). Workers can represent simulated humans or tool-using AIs.
  • manager: The decision-making agent that observes the workflow each timestep and issues actions (e.g., assign, split, refine, message). See manager actions in manager_agent_gym/schemas/execution/manager_actions.py and manager agents under manager_agent_gym/core/manager_agent/.
  • task: An atomic or composite unit of work with dependencies and inputs/outputs, modeled by Task (manager_agent_gym/schemas/core/tasks.py).
  • resource: A digital artifact produced/consumed by tasks (documents, datasets, code), modeled by Resource (manager_agent_gym/schemas/core/resources.py).
  • workflow: The container holding tasks, agents, resources, constraints, and messages; evolves over discrete timesteps. Modeled by Workflow (manager_agent_gym/schemas/core/workflow.py).
  • stakeholder: The persona owning preferences and providing feedback/approvals; exposed to the manager via a public profile. See StakeholderBase/StakeholderConfig (manager_agent_gym/core/workflow_agents/interface.py, manager_agent_gym/schemas/workflow_agents/stakeholder.py).
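
To make the relationship between these concepts concrete, here is a toy, self-contained sketch of a workflow evolving over discrete timesteps. It does not use the library: `ToyTask`, `ToyWorkflow`, `manager_step`, `worker_step`, and `run` are hypothetical names invented for illustration, and the real `Task`, `Workflow`, and manager action classes in the paths above are richer than this. The assumed policy is deliberately simple: each timestep the manager assigns ready tasks to idle workers, and each worker finishes its task and produces one resource.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyTask:
    """Hypothetical stand-in for a task: a unit of work with dependencies."""
    name: str
    depends_on: list[str] = field(default_factory=list)
    assigned_to: str | None = None
    done: bool = False


@dataclass
class ToyWorkflow:
    """Hypothetical container for tasks, workers, and produced resources."""
    tasks: dict[str, ToyTask]
    workers: list[str]
    resources: list[str] = field(default_factory=list)

    def ready_tasks(self) -> list[ToyTask]:
        """Unfinished, unassigned tasks whose dependencies are all done."""
        return [
            t for t in self.tasks.values()
            if not t.done and t.assigned_to is None
            and all(self.tasks[d].done for d in t.depends_on)
        ]


def manager_step(wf: ToyWorkflow) -> None:
    """Toy manager policy (an assumption): assign ready tasks to idle workers."""
    busy = {t.assigned_to for t in wf.tasks.values() if t.assigned_to and not t.done}
    idle = [w for w in wf.workers if w not in busy]
    for task, worker in zip(wf.ready_tasks(), idle):
        task.assigned_to = worker


def worker_step(wf: ToyWorkflow) -> None:
    """Toy workers: finish every assigned task and emit one resource each."""
    for t in wf.tasks.values():
        if t.assigned_to and not t.done:
            t.done = True
            wf.resources.append(f"{t.name}.out")


def run(wf: ToyWorkflow, max_timesteps: int) -> int:
    """Alternate manager and worker steps; return the number of timesteps used."""
    for step in range(1, max_timesteps + 1):
        manager_step(wf)
        worker_step(wf)
        if all(t.done for t in wf.tasks.values()):
            return step
    return max_timesteps


if __name__ == "__main__":
    wf = ToyWorkflow(
        tasks={
            "draft": ToyTask("draft"),
            "review": ToyTask("review", depends_on=["draft"]),
            "publish": ToyTask("publish", depends_on=["review"]),
        },
        workers=["simulated_human", "ai_tool_user"],
    )
    print(run(wf, max_timesteps=10), wf.resources)
```

Because the three example tasks form a chain, only one task is ready per timestep regardless of how many workers are idle; a manager that could split or refine tasks, as the real action set allows, is what this toy policy leaves out.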

🚀 Your First Manager Agent

The easiest way to launch a working manager agent is the hello_manager_agent.py example. Run:

python examples/getting_started/hello_manager_agent.py

That script builds an ICAAP workflow, registers the agents, and executes a ChainOfThoughtManagerAgent using your configured model (default gpt-4o-mini).

📚 Documentation & Resources

🧪 Examples & Workflows

📝 License

MIT License — see LICENSE.

📖 Citation

If you use Manager Agent Gym in your work, please cite the accompanying paper:

Charlie Masters, Advaith Vellanki, Jiangbo Shangguan, Bart Kultys, Alastair Moore, Stefano V. Albrecht. "Orchestrating Human-AI Teams: The Manager Agent as a Unifying Research Challenge." In Proceedings of the International Conference on Distributed Artificial Intelligence (DAI 2025), London, United Kingdom. Available at https://www.arxiv.org/abs/2510.02557.

@inproceedings{manager_agent_gym_2025,
  title     = {Orchestrating Human-AI Teams: The Manager Agent as a Unifying Research Challenge},
  author    = {Masters, Charlie and Vellanki, Advaith and Shangguan, Jiangbo and Kultys, Bart and Moore, Alastair and Albrecht, Stefano V.},
  booktitle = {Proceedings of the International Conference on Distributed Artificial Intelligence (DAI 2025)},
  year      = {2025},
  address   = {London, United Kingdom},
  note      = {Manager Agent Gym},
  url       = {https://www.arxiv.org/abs/2510.02557}
}
