This repository contains progressive coding examples demonstrating how manager agents become essential as multi-agent systems grow in complexity.
Part of the tutorial *LLM-based Multi-Agent Systems: Foundations and Practice*, given by the DeepFlow team at the Distributed AI Conference (DAI) in London, November 2025.
Five progressive examples showing the evolution from a simple single-agent system to a sophisticated multi-agent system with manager coordination:
- Example 0: Base case - single agent audits one record
- Example 1: Hierarchical decomposition - manager coordinates multiple worker agents (3-5 workers)
- Example 2: Ad hoc teaming - pharmacist specialist joins mid-audit
- Example 3: Multi-objective preferences - event-driven priority adaptation with crisis response
- Example 4: Safety & governance - human-in-the-loop for dangerous actions (⚠️ includes a dangerous tool demonstration)
The examples use a hospital medication administration audit system that evolves as real-world constraints reveal failures requiring increasingly complex management.
- Python 3.13+
- uv package manager (recommended) or pip
- Install dependencies:

  ```bash
  uv sync
  # or
  pip install -e .
  ```

- Set up environment variables:

  ```bash
  export ANTHROPIC_API_KEY=your_api_key_here
  # or set LITELLM_API_KEY if using a LiteLLM proxy
  ```

Each example can be run independently:

```bash
# Example 0: Base case
python -m src.examples.example_0.main
# Example 1: Hierarchical decomposition
python -m src.examples.example_1.main
# Example 2: Ad hoc teaming
python -m src.examples.example_2.main
# Example 3: Multi-objective preferences
python -m src.examples.example_3.main
# Example 4: Safety & governance with human-in-the-loop ⚠️
python -m src.examples.example_4.main
```

```
coding_demo_examples/
├── src/
│   ├── examples/                    # Example implementations
│   │   ├── example_0/               # Base case
│   │   │   ├── README.md            # Detailed explanation
│   │   │   ├── agents.py            # Agent definitions
│   │   │   ├── consts.py            # Constants (TITLE, TASK, etc.)
│   │   │   └── main.py              # Execution logic
│   │   ├── example_1/               # Hierarchical decomposition
│   │   ├── example_2/               # Ad hoc teaming
│   │   ├── example_3/               # Multi-objective preferences
│   │   │   ├── tools/               # Example-3 specific tools
│   │   │   │   ├── crisis_wrapper.py  # Crisis detection wrapper
│   │   │   │   └── planning.py      # Crisis-aware planning tools
│   │   │   └── resources/           # Example-3 specific resources
│   │   │       └── audit_context.py # Shared AuditContext
│   │   └── example_4/               # Safety & governance (human-in-the-loop)
│   │       └── data/                # Example-4 specific mock data
│   │           ├── example_4_medication_records.json  # Safety-critical scenarios
│   │           ├── example_4_prescriptions.py         # Prescription discrepancies
│   │           └── example_4_patients.py              # Patient clinical context
│   └── core/                        # Shared utilities
│       ├── agent_utils/             # Agent creation utilities
│       │   ├── base.py              # create_agent, create_manager_agent
│       │   ├── roles.py             # Role-based tool assignment
│       │   └── streaming.py         # Streaming output utilities
│       ├── tools/                   # Shared tool implementations
│       │   ├── planning.py          # Core planning tools
│       │   ├── medication_orders.py # ⚠️ Dangerous tool: propose medication changes
│       │   ├── medication_records.py  # Medication record access
│       │   ├── patient_data.py      # Patient information
│       │   ├── prescriptions.py     # Prescription verification
│       │   ├── administration.py    # Administration timing
│       │   ├── inventory.py         # Medication inventory
│       │   ├── lab_results.py       # Lab results access
│       │   ├── compliance_rules.py  # Compliance checking
│       │   ├── audit_reporting.py   # Audit reporting
│       │   └── red_herring/         # Irrelevant tools (for testing)
│       │       ├── scheduling.py    # Staff scheduling
│       │       ├── billing.py       # Billing information
│       │       └── ward_management.py  # Ward capacity
│       └── resources/               # Shared resources
│           └── events.py            # Event simulations
├── pyproject.toml                   # Project dependencies
└── README.md                        # This file
```
See `src/examples/README.md` for detailed documentation of each example.
- Framework: OpenAI Agents SDK (`openai-agents>=0.5.1`); see the sketch after this list
- Model: Claude 4.5 Haiku via LiteLLM (`LitellmModel`)
- Language: Python 3.13+
- Type Safety: Pydantic models for all inputs/outputs
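A minimal sketch of how these pieces fit together, assuming the documented LiteLLM extension of the `openai-agents` SDK (installed via the `openai-agents[litellm]` extra). The agent name, instructions, and model string below are illustrative; the real definitions live in each example's `agents.py`:

```python
import os

from agents import Agent, Runner
from agents.extensions.models.litellm_model import LitellmModel

# Illustrative single-agent setup (not the repository's actual code).
audit_agent = Agent(
    name="Audit Agent",
    instructions="Audit one medication administration record for discrepancies.",
    model=LitellmModel(
        model="anthropic/claude-haiku-4-5",  # assumed LiteLLM model string
        api_key=os.environ["ANTHROPIC_API_KEY"],
    ),
)

result = Runner.run_sync(audit_agent, "Audit record MAR-001.")
print(result.final_output)
```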
Each example builds on the previous, revealing limitations that drive the progression:
- Example 0 → 1: Scale failure - a single agent can't handle the volume (5 records); see the sketch after this list
- Example 1 → 2: Static team failure - can't integrate new specialists mid-execution
- Example 2 → 3: Preference failure - can't balance competing objectives or adapt to crises
- Example 3 → 4: Safety failure - agents want to take dangerous actions requiring human oversight
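To make the Example 0 → 1 step concrete: the openai-agents SDK can expose a worker agent to a manager as a callable tool via `as_tool`. The sketch below is illustrative only; the agent names, instructions, and prompt are invented, and the repository's real manager is built by `create_manager_agent` in `src/core/agent_utils/base.py`:

```python
from agents import Agent, Runner

# Hypothetical worker that audits a single record.
record_auditor = Agent(
    name="Record Auditor",
    instructions="Audit one medication administration record and report discrepancies.",
)

# Hypothetical manager that decomposes the batch and delegates record by record.
audit_manager = Agent(
    name="Audit Manager",
    instructions=(
        "Split the audit into one task per record, call audit_record for each, "
        "then merge the findings into a single report."
    ),
    tools=[
        record_auditor.as_tool(
            tool_name="audit_record",
            tool_description="Audit one medication administration record.",
        )
    ],
)

result = Runner.run_sync(audit_manager, "Audit records MAR-001 through MAR-005.")
print(result.final_output)
```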
Example 4 includes a dangerous tool demonstration showing why AI agents need human oversight in high-stakes domains:
- The Tool: `submit_medication_change_order` lets agents propose medication changes
- Why Dangerous: direct patient safety impact; the AI lacks full clinical context
- The Safety: ALL orders are blocked pending human approval, with a complete audit trail (the blocking pattern is sketched below)
- The Lesson: AI plus human oversight is safer than either alone
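The blocking pattern itself is compact. Below is a hedged sketch of the idea only: the order fields, receipt type, and in-memory trail are hypothetical, and the real implementation is in `src/core/tools/medication_orders.py`:

```python
from agents import function_tool
from pydantic import BaseModel

class MedicationChangeOrder(BaseModel):
    """Hypothetical order schema; field names are illustrative."""
    patient_id: str
    medication: str
    proposed_change: str
    rationale: str

class OrderReceipt(BaseModel):
    status: str
    order_index: int

# Hypothetical in-memory audit trail of everything the agent tried to do.
PENDING_ORDERS: list[MedicationChangeOrder] = []

@function_tool
def submit_medication_change_order(order: MedicationChangeOrder) -> OrderReceipt:
    """Propose a medication change. The change is NEVER executed here:
    every order is recorded and blocked pending human clinician approval."""
    PENDING_ORDERS.append(order)
    return OrderReceipt(
        status="BLOCKED_PENDING_HUMAN_APPROVAL",
        order_index=len(PENDING_ORDERS) - 1,
    )
```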
See `src/examples/example_4/README.md` for the full scenario and `src/core/tools/medication_orders.py` for the implementation.
See `src/examples/README.md` for the detailed progression and learnings.
- All functions use Pydantic models for type safety (no dict returns; a minimal illustration follows this list)
- Mock data is used for demonstration purposes
- Examples are designed for tutorial presentation with clear progression
- Each example includes comments explaining the scenario and limitations
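A minimal illustration of the first note (typed returns instead of raw dicts); `AuditFinding` and its fields are hypothetical, not the repository's actual models:

```python
from pydantic import BaseModel

class AuditFinding(BaseModel):
    """Hypothetical typed result; a raw dict would hide this schema."""
    record_id: str
    severity: str
    description: str

def check_administration_timing(record_id: str) -> AuditFinding:
    # Illustrative stub: returns a validated model rather than an untyped dict.
    return AuditFinding(
        record_id=record_id,
        severity="low",
        description="Dose administered 20 minutes after the scheduled time.",
    )
```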
See `RUNTIME_ISSUES_AND_FIXES.md` for documentation of runtime issues encountered during development, their fixes, and recommendations for production use.
This code is for educational purposes as part of the DAI 2025 Tutorial.