Drop a PDF or paste an arXiv ID. EpiSim reads the paper, extracts the complete mathematical model, generates an interactive simulator with parameter sliders, validates it against the paper's own results, and produces a downloadable standalone script. Six AI agents. One pipeline. Zero code written by the user.
Built for epidemiology. Works on any ODE paper.
Example output: SEIQR quarantine model extracted from a research paper and simulated automatically.
Problem Statement 2: Break the Barriers -- Mathematical modeling is locked behind programming expertise. EpiSim puts it in everyone's hands.
During COVID-19, thousands of epidemic modeling papers were published. Each contained a mathematical model that could inform public health decisions -- if anyone could run it. But reproducing a paper's model requires reading dense mathematics, implementing ODE systems in Python, calibrating parameters, and debugging numerical solvers. This takes a trained computational scientist days of work per paper.
The same barrier exists across science. Ecology papers describe predator-prey dynamics no one simulates. Pharmacology papers model drug interactions no clinician explores interactively. The knowledge exists in papers. The barrier is implementation.
EpiSim removes that barrier entirely. Upload the paper, get an interactive simulator and downloadable code.
Input: A research paper (PDF upload or arXiv ID like 2003.09861)
Output: Three things, automatically:
- **Plain-English Summary** -- Paper digest with key findings, methodology, limitations, and public health implications. Written for non-specialists.
- **Interactive Simulator** -- Real-time dynamic curves with parameter sliders. Drag a slider, watch the curves change instantly. See peak timing, peak magnitude, and key metrics update live.
- **Standalone Script** -- A clean, downloadable Python file (numpy/scipy/matplotlib only) that reproduces the paper's model. Runnable with `python script.py`; no dependencies beyond standard scientific Python.
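To make the third deliverable concrete, here is a minimal sketch of what a generated standalone script could look like, using a simple SIR model as a stand-in (the parameter values and model are hypothetical, not output from the pipeline):

```python
"""Sketch of a generated standalone script (hypothetical SIR example)."""
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical values; a real generated script embeds the paper's extracted
# parameters and initial conditions.
BETA, GAMMA = 0.3, 0.1          # transmission and recovery rates
Y0 = [0.99, 0.01, 0.0]          # initial S, I, R fractions

def sir(t, y):
    s, i, r = y
    return [-BETA * s * i, BETA * s * i - GAMMA * i, GAMMA * i]

sol = solve_ivp(sir, (0, 160), Y0, dense_output=True)
t = np.linspace(0, 160, 400)
s, i, r = sol.sol(t)
print(f"peak infected fraction: {i.max():.3f} at day {t[i.argmax()]:.0f}")
```

A real generated script additionally plots the curves with matplotlib, but the core is always this shape: parameters, an ODE right-hand side, and a `solve_ivp` call.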
1. Paste arXiv ID: 2003.09861
2. AI reads the 30-page paper (extended thinking, ~60 seconds)
3. Extracts: 8 compartments, 16 parameters, full ODE system
4. Three agents run IN PARALLEL: Summarizer + Builder + Coder
5. Validator checks results against the paper's reported metrics
6. If any metric deviates >5%: Debugger patches and retries (up to 3x)
7. Three-tab result: Summary | Simulation | Code
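Steps 5 and 6 (validate, then patch and retry) can be sketched as a small loop; the function names here are hypothetical stand-ins for the Validator and Debugger agents:

```python
# Sketch of the validate/patch loop (agent calls stubbed; names hypothetical).
TOLERANCE = 0.05    # metrics may deviate at most 5% from the paper's values
MAX_RETRIES = 3

def within_tolerance(simulated: dict, reported: dict) -> bool:
    """True if every simulated metric is within 5% of the paper's value."""
    return all(
        abs(simulated[k] - v) <= TOLERANCE * abs(v)
        for k, v in reported.items()
    )

def validate_and_heal(run_sim, patch, reported: dict) -> bool:
    for attempt in range(MAX_RETRIES + 1):
        if within_tolerance(run_sim(), reported):
            return True             # validated against the paper
        if attempt < MAX_RETRIES:
            patch()                 # Debugger rewrites the failing code
    return False                    # gave up after 3 patch attempts
```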
| Paper | Model | Domain | Compartments | Status |
|---|---|---|---|---|
| Giordano et al. 2020 | SIDARTHE | COVID-19 | 8 | Working |
| Ghosh & Bhattacharya 2020 | SEIQR | COVID-19 | 5 | Working |
| Hridoy & Mustaquim 2024 | SEIR Seasonal | Dengue | 4 | Working |
| Lotka-Volterra dynamics | Predator-Prey | Ecology | 2 | Working |
| Rachah & Torres 2017 | SEIR | Ebola | 4 | Working |
This section details how EpiSim uses capabilities exclusive to Claude Opus 4.6 -- features no other model offers.
Opus 4.6 introduced adaptive thinking -- the model decides how deeply to reason based on task complexity, guided by an effort parameter (low, medium, high, max). EpiSim is the first application to use different effort levels for different agents in the same pipeline:
| Agent | Effort | Why |
|---|---|---|
| Reader | max | Extracting ODEs from a 30-page paper requires deep mathematical reasoning. No shortcuts. |
| Summarizer | medium | Summarization needs clarity, not mathematical depth. Faster response. |
| Builder | high | Code generation benefits from reasoning but doesn't need max depth. |
| Coder | high | Standalone script needs good structure, not exhaustive analysis. |
| Debugger | high | Bug diagnosis needs reasoning about code + math simultaneously. |
This demonstrates fine-grained reasoning control: the same model, tuned per task, running in parallel. The Reader thinks for 60 seconds at max effort while the Summarizer finishes in 15 seconds at medium effort -- same API, same model, different cognitive allocation.
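The per-agent allocation above reduces to a small config. This sketch assumes a request shape with a `thinking` field carrying the effort level; the exact Anthropic SDK field names are an assumption, not the project's verified API usage:

```python
# Per-agent effort levels (values from the table above; the request dict
# shape is an assumption, not the exact Anthropic SDK signature).
AGENT_EFFORT = {
    "reader": "max",
    "summarizer": "medium",
    "builder": "high",
    "coder": "high",
    "debugger": "high",
}

def request_kwargs(agent: str, prompt: str) -> dict:
    """Build hypothetical request kwargs carrying the agent's effort level."""
    return {
        "model": "claude-opus-4-6",
        "thinking": {"type": "adaptive", "effort": AGENT_EFFORT[agent]},
        "messages": [{"role": "user", "content": prompt}],
    }
```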
The Reader Agent's extended thinking streams into the Streamlit UI in real-time. Users watch Opus 4.6 reason through the paper:
- Phase detection classifies thinking into 7 stages (Reading Paper, Identifying Compartments, Extracting Parameters, Formulating ODE System, Setting Initial Conditions, Cross-referencing Results, Synthesizing Model)
- A dark typewriter console shows the current reasoning with phase transitions highlighted
- During parallel execution, the thinking replays with rotating excerpts so users stay engaged
- After pipeline completes, full thinking is available in an expandable console on the results page
This builds trust. When the model writes "I see 8 compartments: S, I, D, A, R, T, H, E" and then produces code with exactly those compartments, users verify the reasoning chain.
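One simple way to implement the phase detection described above is keyword matching on streamed thinking text; this classifier is a hypothetical sketch, not the project's `core/thinking_stream.py` logic:

```python
# Hypothetical keyword-based classifier for streamed thinking chunks.
PHASES = [
    ("Identifying Compartments", ("compartment", "state variable")),
    ("Extracting Parameters", ("parameter", "table 2")),
    ("Formulating ODE System", ("ode", "equation", "d/dt")),
    ("Setting Initial Conditions", ("initial condition", "t=0")),
    ("Cross-referencing Results", ("figure", "reported", "validate")),
    ("Synthesizing Model", ("final model", "putting it together")),
]

def classify_phase(chunk: str) -> str:
    """Map a thinking excerpt to one of the 7 pipeline phases."""
    text = chunk.lower()
    for phase, keywords in PHASES:
        if any(k in text for k in keywords):
            return phase
    return "Reading Paper"   # default first phase
```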
The Reader Agent receives the entire paper plus a knowledge base of model formulations, parameter ranges, and ODE solver best practices -- all in a single prompt. No chunking. No retrieval. No information loss.
This matters because mathematical models are defined across multiple sections of a paper. The ODE system is in Section 3, but the parameter values are in Table 2, the initial conditions in the supplementary material, and the validation targets in Figure 5. Chunking would lose these cross-references. The 1M context window holds everything.
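The single-prompt approach amounts to concatenation rather than retrieval. A sketch of what such a context builder might look like (the section tags and final instruction are illustrative):

```python
# Sketch of single-prompt context assembly: no chunking, no retrieval.
def build_context(paper_text: str, knowledge_base: str) -> str:
    """Concatenate the full paper and the knowledge base into one prompt.
    The 1M-token window makes splitting unnecessary, so cross-references
    between sections, tables, and figures survive intact."""
    return (
        "<paper>\n" + paper_text + "\n</paper>\n\n"
        "<knowledge_base>\n" + knowledge_base + "\n</knowledge_base>\n\n"
        "Extract the complete ODE model as structured output."
    )
```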
The Builder Agent generates a complete Streamlit application -- model.py, solver.py, app.py, config.json, and requirements.txt -- in a single API call. No multi-turn generation, no template stitching.
This produces coherent code where the solver imports from the model, the app imports from the solver, and the config matches both. Fragmented generation produces fragmented code.
After the Reader completes, three agents run simultaneously via ThreadPoolExecutor:
```
Reader (max) --> [ Summarizer (medium) | Builder (high) | Coder (high) ] --> Validator --> Debugger
                 '------- PARALLEL via ThreadPoolExecutor -------'           '--- SEQUENTIAL ---'
```
Each agent creates its own Anthropic client, makes its own API call at its own effort level, and returns independently. The UI shows live agent status badges with effort labels, updating as each completes.
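The fan-out can be sketched with `concurrent.futures`; the three agent functions here are stand-ins for the real API-calling agents:

```python
# Sketch of the parallel fan-out after the Reader completes (agent
# functions are stand-ins; each real agent makes its own API call).
from concurrent.futures import ThreadPoolExecutor

def summarize(model): return f"summary of {model}"
def build(model):     return f"simulator for {model}"
def code(model):      return f"script for {model}"

def run_parallel(model: str) -> dict:
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = {
            "summary": pool.submit(summarize, model),
            "simulator": pool.submit(build, model),
            "script": pool.submit(code, model),
        }
        # .result() blocks until each agent finishes independently
        return {name: f.result() for name, f in futures.items()}
```

Threads (rather than processes) suffice because each agent spends its time waiting on network I/O, not CPU.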
Every agent returns data through Pydantic v2 schemas enforced via tool_use. The EpidemicModel schema is the contract between all agents -- 9 fields, strictly typed, validated on both send and receive. No free-text parsing. No regex extraction. Type-safe inter-agent communication.
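A trimmed sketch of what such a contract looks like in Pydantic v2 (the field names here are illustrative; the real `EpidemicModel` schema has 9 strictly typed fields):

```python
# Trimmed, illustrative version of an inter-agent Pydantic v2 contract.
from pydantic import BaseModel, Field

class EpidemicModel(BaseModel):
    name: str
    compartments: list[str] = Field(min_length=1)   # at least one compartment
    parameters: dict[str, float]
    equations: list[str]
    initial_conditions: dict[str, float]

m = EpidemicModel(
    name="SIR",
    compartments=["S", "I", "R"],
    parameters={"beta": 0.3, "gamma": 0.1},
    equations=["dS/dt = -beta*S*I", "dI/dt = beta*S*I - gamma*I", "dR/dt = gamma*I"],
    initial_conditions={"S": 0.99, "I": 0.01, "R": 0.0},
)
```

Because validation runs on construction, a malformed agent response fails loudly at the schema boundary instead of propagating into generated code.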
```
Paper (PDF/arXiv)
        |
        v
Paper Loader --> Context Builder --> Reader Agent (adaptive, max effort)
        |
        +----------------+----------------+
        |                |                |
   Summarizer         Builder           Coder
    (medium)           (high)           (high)
        |                |                |
        v                v                v
  PaperSummary    Simulator Files   Standalone Script
                         |
                     Validator
                         |
          Debugger (if needed, high effort)
                         |
                 3-Tab Streamlit UI
```
| Component | File | Opus 4.6 Feature |
|---|---|---|
| Reader | `agents/reader.py` | Adaptive thinking (max), 1M context, tool use |
| Summarizer | `agents/summarizer.py` | Adaptive thinking (medium), tool use |
| Builder | `agents/builder.py` | Adaptive thinking (high), 128K output, tool use |
| Validator | `agents/validator.py` | Pure Python -- subprocess execution, metric comparison |
| Debugger | `agents/debugger.py` | Adaptive thinking (high), code analysis |
| Coder | `agents/coder.py` | Adaptive thinking (high), tool use |
| Thinking Stream | `core/thinking_stream.py` | Real-time thinking display with phase classification |
v1 -- Basic Pipeline (commits 4746ce6 to f0ad239)
Scaffolding, schemas, paper loader, agents wired sequentially. Reader used fixed `budget_tokens`. No UI. CLI only. Worked for SIR but failed on complex models.
v2 -- Streamlit App + Validation Loop (commits 12c9b4a to fc4e0ba)
Added the Streamlit interface, the Validator + Debugger self-healing loop, and fixed R0 formula handling for complex models (SIDARTHE has 8 compartments -- the simple beta/gamma formula doesn't apply). 76 tests covering SIR, SEIR, pipeline integration, and edge cases.
v3 -- Three-Tab UI + New Agents (commit 8a8e3c1)
Added Summarizer and Coder agents. Rewrote the app with Summary | Simulation | Code tabs. Realized the demo needed more than just charts -- judges want to see AI understanding, not just AI generating.
v4 -- Dark Scientific Theme (commit 1882fd9)
Complete visual redesign. Custom typography (Fraunces + Outfit + JetBrains Mono), dark blue-black palette with amber accents, glass-morphism cards, staggered animations. The UI went from "default Streamlit" to "scientific intelligence platform."
v5 -- Thinking Out Loud + Parallel Agents (commits 4800bba to 51a0ba1)
The breakthrough iteration. Switched from post-hoc thinking display to real-time streaming of thinking blocks into the UI. Added adaptive thinking with per-agent effort levels. Parallelized Summarizer + Builder + Coder for ~40% speed improvement. Added the typewriter console with phase classification, replay mode during parallel execution, and persistent thinking display on results page.
v6 -- Beyond Epidemiology (commit 51a0ba1+)
Discovered the pipeline is domain-agnostic. Tested on a Lotka-Volterra predator-prey ecology paper -- the Reader extracted the ODE system, the Builder generated a working simulator, no code changes needed. The architecture already reads any differential equation paper, not just epidemic ones. EpiSim started as epidemic modeling. It became a universal paper-to-simulator engine.
The key insight: showing the AI's reasoning process isn't just a demo trick -- it's a trust mechanism. When users can watch the model identify variables, extract parameters, and formulate ODEs step by step, they trust the output regardless of the domain.
EpiSim's pipeline -- read paper, extract math, generate simulator, validate, self-heal -- is domain-agnostic. Any field that publishes differential equation models in academic papers works today, with zero code changes:
| Domain | Tested Paper | What the Simulator Does |
|---|---|---|
| Epidemiology | SIDARTHE COVID-19 | 8-compartment epidemic curves with intervention parameter sliders |
| Epidemiology | SEIR Dengue | Seasonal dengue dynamics with transmission rate controls |
| Ecology | Predator-Prey dynamics | Lotka-Volterra population cycles with predation rate sliders |
The Reader Agent doesn't know what domain it's reading. It knows it's reading differential equations. The same 1M context window that extracts a COVID-19 SIDARTHE model extracts a predator-prey Lotka-Volterra model -- same pipeline, same agents, same output format.
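The Lotka-Volterra system from the ecology row is exactly the kind of ODE model the pipeline extracts. A minimal version, with hypothetical parameter values (not the tested paper's):

```python
# Minimal Lotka-Volterra predator-prey system of the kind the pipeline
# extracts (parameter values hypothetical, not from the tested paper).
import numpy as np
from scipy.integrate import solve_ivp

ALPHA, BETA, DELTA, GAMMA = 1.1, 0.4, 0.1, 0.4   # growth/predation rates

def lotka_volterra(t, y):
    prey, pred = y
    return [ALPHA * prey - BETA * prey * pred,
            DELTA * prey * pred - GAMMA * pred]

sol = solve_ivp(lotka_volterra, (0, 50), [10.0, 5.0],
                t_eval=np.linspace(0, 50, 1000))
prey, pred = sol.y   # oscillating population cycles
```

Structurally this is indistinguishable from an epidemic model: named state variables, named rate parameters, a coupled ODE right-hand side. That is why the Reader handles both without code changes.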
Opus 4.6's extended thinking at max effort reasons through any dense academic paper -- epidemiology, ecology, pharmacokinetics, climate science, neuroscience. The architecture already supports it. The knowledge base is the only epidemic-specific component, and it's optional context, not a hard dependency.
```bash
git clone https://github.com/wpn10/episim.git
cd episim
python3 -m venv .venv && source .venv/bin/activate
pip install -e .
export ANTHROPIC_API_KEY=your_key_here
streamlit run app.py
```

Upload a PDF or paste an arXiv ID (e.g. 2003.09861) and click Generate Simulator.
CLI alternative:

```bash
python -m episim.core.orchestrator --paper 2003.09861
```

Suggested demo flow:
- Paste `2003.09861` (SIDARTHE COVID-19 model) into the sidebar
- Watch the thinking console stream the Reader's reasoning in real time
- See parallel agents execute with effort-level badges
- Explore the Summary tab (paper digest)
- Drag parameter sliders on the Simulation tab, watch curves update
- Download the standalone script from the Code tab
```bash
pytest tests/ -v
```

76 tests across 13 files: schema validation, PDF extraction, SIR/SEIR ODE solvers, pipeline integration, mocked agent tests, edge cases. All passing.
Python 3.10+ | Anthropic API (Opus 4.6) | scipy | Streamlit | Plotly | PyMuPDF | Pydantic v2
MIT
Built for the "Built with Opus 4.6" Claude Code Hackathon, February 2026.