Manager Agent Gym

A research platform for developing and evaluating autonomous Manager Agents that orchestrate complex workflows involving both human and AI collaborators.

📚 Online Docs: deepflow-research.github.io/manager_agent_gym

🎯 Overview

This repository contains the research codebase and reference implementation for autonomous Manager Agents that orchestrate complex workflows with human and AI collaborators. The approach is described in our paper "Orchestrating Human-AI Teams: The Manager Agent as a Unifying Research Challenge", published at DAI 2025. For complete documentation, see Documentation & Resources below.

🏁 Run the Benchmark

The quickest way to run the benchmark suite across workflow scenarios is the CLI:

# Activate uv virtualenv (create it first if needed: `uv venv`)
source .venv/bin/activate

# From repo root, launch the interactive runner
python -m examples.cli

# Tip: non-interactive example
# python -m examples.cli --non-interactive --manager-mode cot --model-name o3 --max-timesteps 50

Outputs are written under directories like simulation_outputs_cot_rerun/, simulation_outputs_random_rerun/, etc., grouped by model.

The CLI entrypoint lives at examples/cli.py.
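
To script several non-interactive runs, the flags from the tip above can be assembled programmatically. The following is a minimal sketch and not part of the repository: `build_command` and `run_modes` are hypothetical helper names, and the `random` mode value is an assumption inferred from the output directory names. Only the flags shown in the tip are used.

```python
import subprocess
import sys


def build_command(manager_mode: str, model_name: str = "o3", max_timesteps: int = 50) -> list[str]:
    """Build the argv for one non-interactive benchmark run (hypothetical helper)."""
    return [
        sys.executable, "-m", "examples.cli",
        "--non-interactive",
        "--manager-mode", manager_mode,
        "--model-name", model_name,
        "--max-timesteps", str(max_timesteps),
    ]


def run_modes(modes: list[str], dry_run: bool = True) -> list[list[str]]:
    """Run (or, with dry_run, only print) one benchmark invocation per manager mode."""
    commands = [build_command(mode) for mode in modes]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # Must be executed from the repo root with the virtualenv active.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # "cot" appears in the tip above; "random" is an assumed mode name.
    run_modes(["cot", "random"], dry_run=True)
```

With `dry_run=True` the script only prints the commands, which makes it easy to check them before starting a long run.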

🧩 Key Concepts

worker: A workflow-executing agent that performs tasks and produces resources. In code these implement AgentInterface (see manager_agent_gym/core/workflow_agents/interface.py). Workers can represent simulated humans or tool-using AIs.
manager: The decision-making agent that observes the workflow each timestep and issues actions (e.g., assign, split, refine, message). See manager actions in manager_agent_gym/schemas/execution/manager_actions.py and manager agents under manager_agent_gym/core/manager_agent/.
task: An atomic or composite unit of work with dependencies and inputs/outputs, modeled by Task (manager_agent_gym/schemas/core/tasks.py).
resource: A digital artifact produced/consumed by tasks (documents, datasets, code), modeled by Resource (manager_agent_gym/schemas/core/resources.py).
workflow: The container holding tasks, agents, resources, constraints, and messages; evolves over discrete timesteps. Modeled by Workflow (manager_agent_gym/schemas/core/workflow.py).
stakeholder: The persona owning preferences and providing feedback/approvals; exposed to the manager via a public profile. See StakeholderBase/StakeholderConfig (manager_agent_gym/core/workflow_agents/interface.py, manager_agent_gym/schemas/workflow_agents/stakeholder.py).
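
The relationships between these concepts can be illustrated with a small toy model. It is a sketch only and does not import the library: `ToyTask`, `ToyWorkflow`, `assign_ready_tasks`, `execute_assigned`, and `run` are hypothetical names, and the real `Task`, `Resource`, and `Workflow` schemas in manager_agent_gym are richer than what is shown here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyTask:
    """A unit of work with dependencies (toy stand-in, not the library's Task)."""
    name: str
    depends_on: list[str] = field(default_factory=list)
    assigned_to: Optional[str] = None
    done: bool = False


@dataclass
class ToyWorkflow:
    """Container for tasks, worker names, and produced resources."""
    tasks: dict[str, ToyTask] = field(default_factory=dict)
    workers: list[str] = field(default_factory=list)
    resources: list[str] = field(default_factory=list)

    def ready_tasks(self) -> list[ToyTask]:
        """Unassigned, unfinished tasks whose dependencies are all done."""
        return [
            t for t in self.tasks.values()
            if not t.done and t.assigned_to is None
            and all(self.tasks[d].done for d in t.depends_on)
        ]


def assign_ready_tasks(wf: ToyWorkflow) -> list[tuple[str, str]]:
    """Manager step: assign each ready task to a worker, round-robin."""
    assignments = []
    for i, task in enumerate(wf.ready_tasks()):
        task.assigned_to = wf.workers[i % len(wf.workers)]
        assignments.append((task.name, task.assigned_to))
    return assignments


def execute_assigned(wf: ToyWorkflow) -> None:
    """Worker step: finish every assigned task and record one output resource."""
    for task in wf.tasks.values():
        if task.assigned_to is not None and not task.done:
            task.done = True
            wf.resources.append(f"{task.name}.output")


def run(wf: ToyWorkflow, max_timesteps: int = 10) -> int:
    """Advance in discrete timesteps; return how many steps were executed."""
    for timestep in range(max_timesteps):
        if all(t.done for t in wf.tasks.values()):
            return timestep
        assign_ready_tasks(wf)
        execute_assigned(wf)
    return max_timesteps


if __name__ == "__main__":
    wf = ToyWorkflow(
        tasks={
            "draft": ToyTask("draft"),
            "review": ToyTask("review", depends_on=["draft"]),
            "publish": ToyTask("publish", depends_on=["review"]),
        },
        workers=["analyst", "reviewer"],
    )
    print(run(wf), wf.resources)
```

The manager here is a fixed round-robin rule. In the actual platform the manager is an agent that observes the workflow and chooses among actions such as assign, split, refine, and message.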

🚀 Your First Manager Agent

The easiest way to launch a working manager agent is the hello_manager_agent.py example. Run:

python examples/getting_started/hello_manager_agent.py

That script builds an ICAAP workflow, registers the agents, and executes a ChainOfThoughtManagerAgent using your configured model (default gpt-4o-mini).

📚 Documentation & Resources

Online Docs: https://deepflow-research.github.io/manager_agent_gym
Repository Home: This repository
Docs Home: docs/index.md
Quick Start: Quick Start Guide (QUICK_START_GUIDE.md)
Library Guide: Library Documentation (LIBRARY_DOCUMENTATION.md)
Technical Architecture: Technical Architecture (TECHNICAL_ARCHITECTURE.md)
Research Paper (PDF): Orchestrating Human-AI Teams (PDF)

🧪 Examples & Workflows

Browse examples: examples/
Getting started walkthrough: examples/getting_started/README.md
End-to-end demos: examples/end_to_end_examples/

📝 License

MIT License — see LICENSE.

📖 Citation

If you use Manager Agent Gym in your work, please cite the accompanying paper:

Charlie Masters, Advaith Vellanki, Jiangbo Shangguan, Bart Kultys, Alastair Moore, Stefano V. Albrecht. "Orchestrating Human-AI Teams: The Manager Agent as a Unifying Research Challenge." In Proceedings of the International Conference on Distributed Artificial Intelligence (DAI 2025), London, United Kingdom. Available at https://www.arxiv.org/abs/2510.02557.

@inproceedings{manager_agent_gym_2025,
  title     = {Orchestrating Human-AI Teams: The Manager Agent as a Unifying Research Challenge},
  author    = {Masters, Charlie and Vellanki, Advaith and Shangguan, Jiangbo and Kultys, Bart and Moore, Alastair and Albrecht, Stefano V.},
  booktitle = {Proceedings of the International Conference on Distributed Artificial Intelligence (DAI 2025)},
  year      = {2025},
  address   = {London, United Kingdom},
  note      = {Manager Agent Gym},
  url       = {https://www.arxiv.org/abs/2510.02557}
}
