Transform epidemic modeling research papers into interactive public health simulators. Powered by Claude Opus 4.6 — 1M context, extended thinking, 128K output.

EpiSim

Research papers in, interactive simulators out.

Drop a PDF or paste an arxiv ID. EpiSim reads the paper, extracts the complete mathematical model, generates an interactive simulator with parameter sliders, validates it against the paper's own results, and produces a downloadable standalone script. Six AI agents. One pipeline. Zero code written by the user.

Built for epidemiology. Works on any ODE paper.

Figure: SEIQR Quarantine Model, generated by EpiSim. Example output: SEIQR quarantine model extracted from a research paper and simulated automatically.

Built with Claude Opus 4.6 | Python 3.10+ | License: MIT | 76 Tests Passing

Problem Statement 2: Break the Barriers -- Mathematical modeling is locked behind programming expertise. EpiSim puts it in everyone's hands.


The Problem

During COVID-19, thousands of epidemic modeling papers were published. Each contained mathematical models that could inform public health decisions -- if anyone could run them. But reproducing a paper's model requires reading dense mathematics, implementing ODE systems in Python, calibrating parameters, and debugging numerical solvers. This takes a trained computational scientist days of work per paper.

The same barrier exists across science. Ecology papers describe predator-prey dynamics no one simulates. Pharmacology papers model drug interactions no clinician explores interactively. The knowledge exists in papers. The barrier is implementation.

EpiSim removes that barrier entirely. Upload the paper, get an interactive simulator and downloadable code.


What It Does

Input: A research paper (PDF upload or arxiv ID like 2003.09861)

Output: Three things, automatically:

  1. Plain-English Summary -- Paper digest with key findings, methodology, limitations, and public health implications. Written for non-specialists.

  2. Interactive Simulator -- Real-time dynamic curves with parameter sliders. Drag a slider, watch the curves change instantly. See peak timing, peak magnitude, and key metrics update live.

  3. Standalone Script -- A clean, downloadable Python file (numpy/scipy/matplotlib only) that reproduces the paper's model. Runnable with python script.py, no dependencies beyond standard scientific Python.
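For a sense of what such a standalone script might look like, here is a minimal sketch for a plain SIR model. It is not output from EpiSim: the parameter values, the population size, and the function names `sir_rhs` and `simulate` are illustrative assumptions, and plotting is left out to keep it short.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative values only; a generated script would use the paper's parameters.
BETA = 0.3      # transmission rate (per day)
GAMMA = 0.1     # recovery rate (per day)
N = 1_000_000   # total population


def sir_rhs(t, y, beta, gamma, n):
    """Right-hand side of the SIR system: dS/dt, dI/dt, dR/dt."""
    s, i, r = y
    new_infections = beta * s * i / n
    recoveries = gamma * i
    return [-new_infections, new_infections - recoveries, recoveries]


def simulate(beta=BETA, gamma=GAMMA, n=N, i0=10.0, days=200):
    """Integrate the model and return (t, S, I, R) on a daily grid."""
    t_eval = np.arange(0, days + 1)
    sol = solve_ivp(
        sir_rhs, (0, days), [n - i0, i0, 0.0],
        args=(beta, gamma, n), t_eval=t_eval, rtol=1e-8, atol=1e-8,
    )
    return sol.t, sol.y[0], sol.y[1], sol.y[2]


if __name__ == "__main__":
    t, s, i, r = simulate()
    peak_day = int(t[np.argmax(i)])
    print(f"Peak infections: {i.max():,.0f} on day {peak_day}")
```

Keeping the right-hand side in its own function separates the model from the solver call, so changing a parameter only means passing a different argument to `simulate`.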

End-to-End Pipeline

1. Paste arxiv ID: 2003.09861
2. AI reads the 30-page paper (extended thinking, ~60 seconds)
3. Extracts: 8 compartments, 16 parameters, full ODE system
4. Three agents run IN PARALLEL: Summarizer + Builder + Coder
5. Validator checks results against the paper's reported metrics
6. If any metric deviates >5%: Debugger patches and retries (up to 3x)
7. Three-tab result: Summary | Simulation | Code
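Steps 5 and 6 amount to a bounded validate-and-repair loop. The sketch below shows one way that loop could be structured. `run_simulation` and `patch_code` are hypothetical callables standing in for the Validator's subprocess run and the Debugger agent, and the relative-deviation check is an assumption about how "deviates >5%" is measured.

```python
from typing import Callable, Dict, Tuple

TOLERANCE = 0.05   # 5% relative deviation, per the pipeline description
MAX_RETRIES = 3    # Debugger attempts, per the pipeline description


def failing_metrics(expected: Dict[str, float], actual: Dict[str, float],
                    tol: float = TOLERANCE) -> Dict[str, float]:
    """Return {metric: relative deviation} for metrics outside tolerance."""
    failures = {}
    for name, target in expected.items():
        if name not in actual:
            failures[name] = float("inf")  # metric missing from simulator output
            continue
        scale = abs(target) if target != 0 else 1.0  # avoid dividing by zero
        deviation = abs(actual[name] - target) / scale
        if deviation > tol:
            failures[name] = deviation
    return failures


def validate_with_retries(
    code: str,
    expected: Dict[str, float],
    run_simulation: Callable[[str], Dict[str, float]],    # hypothetical
    patch_code: Callable[[str, Dict[str, float]], str],   # hypothetical
) -> Tuple[str, bool, int]:
    """Run, compare, and patch until metrics pass or retries run out.

    Returns (final_code, passed, debugger_calls).
    """
    for attempt in range(MAX_RETRIES + 1):
        failures = failing_metrics(expected, run_simulation(code))
        if not failures:
            return code, True, attempt
        if attempt < MAX_RETRIES:
            code = patch_code(code, failures)
    return code, False, MAX_RETRIES
```

Passing the failing metrics to the patch step gives the repair stage something concrete to reason about, rather than a bare pass/fail flag.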

Tested Papers

| Paper | Model | Domain | Compartments | Status |
|---|---|---|---|---|
| Giordano et al. 2020 | SIDARTHE | COVID-19 | 8 | Working |
| Ghosh & Bhattacharya 2020 | SEIQR | COVID-19 | 5 | Working |
| Hridoy & Mustaquim 2024 | SEIR Seasonal | Dengue | 4 | Working |
| Lotka-Volterra dynamics | Predator-Prey | Ecology | 2 | Working |
| Rachah & Torres 2017 | SEIR | Ebola | 4 | Working |

How We Push Opus 4.6

This section details how EpiSim uses capabilities exclusive to Claude Opus 4.6 -- features no other model offers.

Adaptive Thinking with Per-Agent Effort Levels

Opus 4.6 introduced adaptive thinking -- the model decides how deeply to reason based on task complexity, guided by an effort parameter (low, medium, high, max). EpiSim is the first application to use different effort levels for different agents in the same pipeline:

| Agent | Effort | Why |
|---|---|---|
| Reader | max | Extracting ODEs from a 30-page paper requires deep mathematical reasoning. No shortcuts. |
| Summarizer | medium | Summarization needs clarity, not mathematical depth. Faster response. |
| Builder | high | Code generation benefits from reasoning but doesn't need max depth. |
| Coder | high | Standalone script needs good structure, not exhaustive analysis. |
| Debugger | high | Bug diagnosis needs reasoning about code + math simultaneously. |

This demonstrates fine-grained reasoning control: the same model, tuned per task, running in parallel. The Reader thinks for 60 seconds at max effort while the Summarizer finishes in 15 seconds at medium effort -- same API, same model, different cognitive allocation.

Visible Thinking ("Thinking Out Loud")

The Reader Agent's extended thinking streams into the Streamlit UI in real-time. Users watch Opus 4.6 reason through the paper:

  • Phase detection classifies thinking into 7 stages (Reading Paper, Identifying Compartments, Extracting Parameters, Formulating ODE System, Setting Initial Conditions, Cross-referencing Results, Synthesizing Model)
  • A dark typewriter console shows the current reasoning with phase transitions highlighted
  • During parallel execution, the thinking replays with rotating excerpts so users stay engaged
  • After pipeline completes, full thinking is available in an expandable console on the results page

This builds trust. When the model writes "I see 8 compartments: S, I, D, A, R, T, H, E" and then produces code with exactly those compartments, users can verify the reasoning chain for themselves.
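This README does not say how phases are detected. One simple way to get the behavior described in the list above is keyword matching on each streamed chunk of thinking text, sketched below. The keyword lists and the `classify_phase` helper are illustrative assumptions; only the seven phase names come from the list above.

```python
import re
from typing import Optional

# Phase names from the list above; the keyword lists are illustrative guesses.
PHASE_KEYWORDS = {
    "Reading Paper": ["abstract", "introduction", "the paper"],
    "Identifying Compartments": ["compartment", "susceptible", "infected"],
    "Extracting Parameters": ["parameter", "rate", "table"],
    "Formulating ODE System": ["ode", "equation", "derivative", "dt"],
    "Setting Initial Conditions": ["initial condition", "t = 0", "t=0"],
    "Cross-referencing Results": ["figure", "reported", "matches"],
    "Synthesizing Model": ["final model", "putting it together"],
}


def classify_phase(chunk: str, current: Optional[str] = None) -> Optional[str]:
    """Pick the phase whose keywords occur most often in the chunk.

    Keywords match at word starts only, so "rate" matches "rates" but not
    "accurate". Falls back to the current phase when nothing matches.
    """
    text = chunk.lower()
    best_phase, best_hits = current, 0
    for phase, keywords in PHASE_KEYWORDS.items():
        hits = sum(
            len(re.findall(r"\b" + re.escape(keyword), text))
            for keyword in keywords
        )
        if hits > best_hits:
            best_phase, best_hits = phase, hits
    return best_phase


if __name__ == "__main__":
    phase = None
    for chunk in [
        "I see 8 compartments: S, I, D, A, R, T, H, E",
        "Table 2 lists the transmission rate and recovery rate",
    ]:
        new_phase = classify_phase(chunk, current=phase)
        if new_phase != phase:
            print(f"--- phase: {new_phase} ---")  # transition to highlight
        phase = new_phase
```

Carrying the current phase forward means chunks with no recognizable keywords do not make the display flicker back to an unknown state.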

1M Context Window (No RAG Needed)

The Reader Agent receives the entire paper plus a knowledge base of model formulations, parameter ranges, and ODE solver best practices -- all in a single prompt. No chunking. No retrieval. No information loss.

This matters because mathematical models are defined across multiple sections of a paper. The ODE system is in Section 3, but the parameter values are in Table 2, the initial conditions in the supplementary material, and the validation targets in Figure 5. Chunking would lose these cross-references. The 1M context window holds everything.

128K Output Tokens (One-Shot Code Generation)

The Builder Agent generates a complete Streamlit application -- model.py, solver.py, app.py, config.json, and requirements.txt -- in a single API call. No multi-turn generation, no template stitching.

This produces coherent code where the solver imports from the model, the app imports from the solver, and the config matches both. Fragmented generation produces fragmented code.

Parallel Multi-Agent Execution

After the Reader completes, three agents run simultaneously via ThreadPoolExecutor:

Reader (max) --> [ Summarizer(medium) | Builder(high) | Coder(high) ] --> Validator --> Debugger
                         PARALLEL via ThreadPoolExecutor                    SEQUENTIAL

Each agent creates its own Anthropic client, makes its own API call at its own effort level, and returns independently. The UI shows live agent status badges with effort labels, updating as each completes.
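A minimal sketch of this fan-out, assuming each agent is wrapped in a plain function that takes the extracted model and returns its result. The three stub functions are placeholders rather than EpiSim's actual agents; real versions would create their own Anthropic client and make the API call inside the function. The `on_done` callback is likewise a hypothetical hook for updating a status badge.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


# Placeholder agents (hypothetical); each would call the API at its own effort level.
def summarize(model: dict) -> str:
    return f"Summary of {model['name']}"


def build_simulator(model: dict) -> str:
    return f"Simulator files for {model['name']}"


def write_script(model: dict) -> str:
    return f"Standalone script for {model['name']}"


# Agent name -> (callable, effort label shown in the UI)
AGENTS = {
    "Summarizer": (summarize, "medium"),
    "Builder": (build_simulator, "high"),
    "Coder": (write_script, "high"),
}


def run_parallel(model: dict, on_done=None) -> dict:
    """Run all agents concurrently and collect results by agent name."""
    results = {}
    with ThreadPoolExecutor(max_workers=len(AGENTS)) as pool:
        futures = {
            pool.submit(fn, model): name
            for name, (fn, _effort) in AGENTS.items()
        }
        for future in as_completed(futures):
            name = futures[future]
            results[name] = future.result()  # re-raises any agent exception
            if on_done is not None:
                on_done(name, AGENTS[name][1])  # e.g. update a status badge
    return results


if __name__ == "__main__":
    out = run_parallel({"name": "SIDARTHE"},
                       on_done=lambda n, e: print(f"{n} ({e}) finished"))
    print(sorted(out))
```

Threads suit this workload because each agent spends most of its time waiting on a network response, and `as_completed` lets the UI update in whatever order the agents finish.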

Structured Output via Tool Use

Every agent returns data through Pydantic v2 schemas enforced via tool_use. The EpidemicModel schema is the contract between all agents -- 9 fields, strictly typed, validated on both send and receive. No free-text parsing. No regex extraction. Type-safe inter-agent communication.
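To illustrate the idea, here is a cut-down sketch of what such a schema could look like in Pydantic v2. The class name `EpidemicModelSketch` and its field names are invented for illustration and are not EpiSim's actual nine fields. The point is that the JSON Schema offered to the model as a tool definition and the validation applied to the tool's returned arguments both come from one class.

```python
from typing import Dict, List

from pydantic import BaseModel, Field, ValidationError


class Parameter(BaseModel):
    symbol: str
    value: float
    description: str = ""


class EpidemicModelSketch(BaseModel):
    """Hypothetical subset of an inter-agent contract; not EpiSim's real schema."""
    name: str
    compartments: List[str] = Field(min_length=1)
    parameters: List[Parameter]
    ode_system: Dict[str, str]            # compartment -> right-hand side expression
    initial_conditions: Dict[str, float]


# JSON Schema that could be supplied as a tool's input schema.
TOOL_SCHEMA = EpidemicModelSketch.model_json_schema()

# Example of arguments a tool call might return (made-up values).
raw = {
    "name": "SIR",
    "compartments": ["S", "I", "R"],
    "parameters": [
        {"symbol": "beta", "value": 0.3},
        {"symbol": "gamma", "value": 0.1},
    ],
    "ode_system": {
        "S": "-beta*S*I/N",
        "I": "beta*S*I/N - gamma*I",
        "R": "gamma*I",
    },
    "initial_conditions": {"S": 999990.0, "I": 10.0, "R": 0.0},
}
model = EpidemicModelSketch.model_validate(raw)

# Malformed output is rejected at the boundary instead of propagating downstream.
try:
    EpidemicModelSketch.model_validate({**raw, "compartments": []})
except ValidationError as exc:
    print(f"Rejected: {exc.error_count()} validation error(s)")
```

Validating on receipt means a downstream agent never has to guard against missing keys or wrong types in the model it is handed.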


Architecture

Paper (PDF/arxiv)
    |
    v
Paper Loader --> Context Builder --> Reader Agent (adaptive, max effort)
                                         |
                        +----------------+----------------+
                        |                |                |
                   Summarizer       Builder           Coder
                   (medium)         (high)            (high)
                        |                |                |
                        v                v                v
                   PaperSummary    Simulator Files   Standalone Script
                                         |
                                    Validator
                                         |
                                    Debugger (if needed, high effort)
                                         |
                                    3-Tab Streamlit UI

| Component | File | Opus 4.6 Feature |
|---|---|---|
| Reader | agents/reader.py | Adaptive thinking (max), 1M context, tool use |
| Summarizer | agents/summarizer.py | Adaptive thinking (medium), tool use |
| Builder | agents/builder.py | Adaptive thinking (high), 128K output, tool use |
| Validator | agents/validator.py | Pure Python -- subprocess execution, metric comparison |
| Debugger | agents/debugger.py | Adaptive thinking (high), code analysis |
| Coder | agents/coder.py | Adaptive thinking (high), tool use |
| Thinking Stream | core/thinking_stream.py | Real-time thinking display with phase classification |

Iteration Journey

v1 -- Basic Pipeline (commits 4746ce6 to f0ad239)
Scaffolding, schemas, paper loader, agents wired sequentially. Reader used fixed budget_tokens. No UI. CLI only. Worked for SIR but failed on complex models.

v2 -- Streamlit App + Validation Loop (commits 12c9b4a to fc4e0ba)
Added the Streamlit interface, the Validator + Debugger self-healing loop, and fixed R0 formula handling for complex models (SIDARTHE has 8 compartments -- the simple beta/gamma formula doesn't apply). 76 tests covering SIR, SEIR, pipeline integration, and edge cases.

v3 -- Three-Tab UI + New Agents (commit 8a8e3c1)
Added Summarizer and Coder agents. Rewrote the app with Summary | Simulation | Code tabs. Realized the demo needed more than just charts -- judges want to see AI understanding, not just AI generating.

v4 -- Dark Scientific Theme (commit 1882fd9)
Complete visual redesign. Custom typography (Fraunces + Outfit + JetBrains Mono), dark blue-black palette with amber accents, glass-morphism cards, staggered animations. The UI went from "default Streamlit" to "scientific intelligence platform."

v5 -- Thinking Out Loud + Parallel Agents (commits 4800bba to 51a0ba1)
The breakthrough iteration. Switched from post-hoc thinking display to real-time streaming of thinking blocks into the UI. Added adaptive thinking with per-agent effort levels. Parallelized Summarizer + Builder + Coder for ~40% speed improvement. Added the typewriter console with phase classification, replay mode during parallel execution, and persistent thinking display on results page.

v6 -- Beyond Epidemiology (commit 51a0ba1+)
Discovered the pipeline is domain-agnostic. Tested on a Lotka-Volterra predator-prey ecology paper -- the Reader extracted the ODE system, the Builder generated a working simulator, no code changes needed. The architecture already reads any differential equation paper, not just epidemic ones. EpiSim started as epidemic modeling. It became a universal paper-to-simulator engine.

The key insight: showing the AI's reasoning process isn't just a demo trick -- it's a trust mechanism. When users can watch the model identify variables, extract parameters, and formulate ODEs step by step, they trust the output regardless of the domain.


Beyond Epidemiology

EpiSim's pipeline -- read paper, extract math, generate simulator, validate, self-heal -- is domain-agnostic. Any field that publishes differential equation models in academic papers works today, with zero code changes:

| Domain | Tested Paper | What the Simulator Does |
|---|---|---|
| Epidemiology | SIDARTHE COVID-19 | 8-compartment epidemic curves with intervention parameter sliders |
| Epidemiology | SEIR Dengue | Seasonal dengue dynamics with transmission rate controls |
| Ecology | Predator-Prey dynamics | Lotka-Volterra population cycles with predation rate sliders |

The Reader Agent doesn't know what domain it's reading. It knows it's reading differential equations. The same 1M context window that extracts a COVID-19 SIDARTHE model extracts a predator-prey Lotka-Volterra model -- same pipeline, same agents, same output format.

Opus 4.6's extended thinking at max effort reasons through any dense academic paper -- epidemiology, ecology, pharmacokinetics, climate science, neuroscience. The architecture already supports it. The knowledge base is the only epidemic-specific component, and it's optional context, not a hard dependency.


Quick Start

git clone https://github.com/wpn10/episim.git
cd episim
python3 -m venv .venv && source .venv/bin/activate
pip install -e .
export ANTHROPIC_API_KEY=your_key_here

Web App

streamlit run app.py

Upload a PDF or paste an arxiv ID (e.g. 2003.09861) and click Generate Simulator.

CLI

python -m episim.core.orchestrator --paper 2003.09861

Demo Video

Suggested demo flow:

  1. Paste 2003.09861 (SIDARTHE COVID-19 model) into the sidebar
  2. Watch the thinking console stream the Reader's reasoning in real-time
  3. See parallel agents execute with effort-level badges
  4. Explore the Summary tab (paper digest)
  5. Drag parameter sliders on the Simulation tab, watch curves update
  6. Download the standalone script from the Code tab

Tests

pytest tests/ -v

76 tests across 13 files: schema validation, PDF extraction, SIR/SEIR ODE solvers, pipeline integration, mocked agent tests, edge cases. All passing.


Tech Stack

Python 3.10+ | Anthropic API (Opus 4.6) | scipy | Streamlit | Plotly | PyMuPDF | Pydantic v2


License

MIT

Built for the "Built with Opus 4.6" Claude Code Hackathon, February 2026.
