Binder-Lab: Protein Binder Evaluation Suite

A toolkit for designing protein binders with BoltzGen and evaluating the resulting designs with structure predictors (Boltz, AlphaFold3, Chai-1).

Features

  • BoltzGen Integration - Complete wrapper for protein binder design
  • Docker Support - Containerized execution with GPU support
  • uv Package Manager - Lightning-fast installation (~10 seconds)
  • Evaluation Metrics - Structure prediction and quality assessment
  • Modular Architecture - Extensible design for multiple methods
  • YAML Configuration - Flexible, human-readable config files
  • Snakemake Workflow - Reproducible pipeline orchestration

Quick Start

Installation

# Install uv (if not installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone and install
git clone https://github.com/Open-Athena/binder-lab.git
cd binder-lab
uv pip install -e .

Verify Installation

uv run pytest -sv test/

Running BoltzGen

Prerequisites

  1. BoltzGen Docker image - Build or pull the boltzgen image
  2. GPU with Docker access - NVIDIA GPU with container toolkit installed
  3. HuggingFace cache - Models download automatically (~6GB)
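
The prerequisites above can be checked before launching a run. The following is a minimal sketch, not part of binder-lab: the function name `check_prerequisites` is hypothetical, and finding `nvidia-smi` on the PATH is only a rough hint that an NVIDIA driver is installed (it does not confirm that the container toolkit works).

```python
import os
import shutil
from pathlib import Path


def check_prerequisites(cache_dir="~/.cache"):
    """Report which BoltzGen prerequisites appear to be satisfied.

    Hypothetical helper for illustration; it only inspects the local machine.
    """
    cache = Path(os.path.expanduser(cache_dir))
    return {
        # The Docker CLI must be on PATH to launch the container.
        "docker": shutil.which("docker") is not None,
        # Rough hint only: nvidia-smi being present suggests an NVIDIA driver.
        "nvidia_smi": shutil.which("nvidia-smi") is not None,
        # The cache directory is mounted into the container, so it should exist.
        "cache_dir": cache.is_dir(),
    }


for name, ok in check_prerequisites().items():
    print(f"{name}: {'ok' if ok else 'missing'}")
```

A `missing` entry does not necessarily mean the run will fail (for example, Docker may create a missing bind-mount directory), but it is a cheap first thing to look at.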

Option 1: Python API

uv run python examples/run_boltzgen_test.py targets/1UBQ.pdb results/ubiquitin_binder

This runs a complete BoltzGen design pipeline:

  • Target: Ubiquitin (76 residues, chain A)
  • Designs: 10 binders, keeps top 5
  • Binder length: 60-80 residues
  • Output: results/ubiquitin_binder/

Option 2: Direct Docker Command

Run BoltzGen directly via Docker (useful for debugging):

# Create output directory
mkdir -p results/test_output

# Create a design specification YAML
cat > results/design_spec.yaml << 'EOF'
entities:
- file:
    path: /work/1UBQ.pdb
    include:
    - chain:
        id: A
- protein:
    id: B
    sequence: 60..80
EOF

# Run BoltzGen
docker run --rm --gpus all \
  -v "$(pwd)/targets:/work" \
  -v "$(pwd)/results:/output" \
  -v "$HOME/.cache:/cache" \
  -e HF_HOME=/cache \
  boltzgen \
  boltzgen run /output/design_spec.yaml \
    --output /output/test_output \
    --protocol protein-anything \
    --num_designs 2 \
    --budget 1 \
    --filter_biased true \
    --cache /cache
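
The same spec file and Docker invocation can be assembled from Python, which avoids shell-quoting problems when the project path contains spaces. This is a sketch under assumptions: `design_spec_yaml` and `docker_command` are hypothetical helper names, and the flags and mount points are copied from the command above rather than taken from BoltzGen's documentation.

```python
from pathlib import Path


def design_spec_yaml(target_path, chain_id, min_len, max_len):
    """Render the design spec shown above as YAML text."""
    if min_len > max_len:
        raise ValueError("min_len must not exceed max_len")
    return (
        "entities:\n"
        "- file:\n"
        f"    path: {target_path}\n"
        "    include:\n"
        "    - chain:\n"
        f"        id: {chain_id}\n"
        "- protein:\n"
        "    id: B\n"
        f"    sequence: {min_len}..{max_len}\n"
    )


def docker_command(workdir, num_designs=2, budget=1):
    """Build the argv list for the docker call shown above (does not run it)."""
    workdir = Path(workdir)
    cache = Path.home() / ".cache"
    return [
        "docker", "run", "--rm", "--gpus", "all",
        "-v", f"{workdir / 'targets'}:/work",
        "-v", f"{workdir / 'results'}:/output",
        "-v", f"{cache}:/cache",
        "-e", "HF_HOME=/cache",
        "boltzgen",
        "boltzgen", "run", "/output/design_spec.yaml",
        "--output", "/output/test_output",
        "--protocol", "protein-anything",
        "--num_designs", str(num_designs),
        "--budget", str(budget),
        "--filter_biased", "true",
        "--cache", "/cache",
    ]


print(design_spec_yaml("/work/1UBQ.pdb", "A", 60, 80))
```

To execute the command, pass the list to `subprocess.run(docker_command(Path.cwd()), check=True)` after writing the YAML text to `results/design_spec.yaml`.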

Option 3: Full Python Script

from pathlib import Path
from binder_lab.design import BoltzGenDesigner, DesignSpec

config = {
    'container_image': 'boltzgen',
    'container_type': 'docker',
    'protocol': 'protein-anything',
    'num_designs': 10,
    'budget': 5,
    'alpha': 0.5,
    'filter_biased': True,
}

spec = DesignSpec(
    name="ubiquitin_binder",
    target={
        'structure_path': 'targets/1UBQ.pdb',
        'chains': ['A'],
    },
    designed_component={
        'type': 'protein',
        'id': 'B',
        'length': [60, 80],
    }
)

designer = BoltzGenDesigner(config, Path('results/my_designs'))
designs = designer.design(spec)

for design in designs:
    print(f"{design.design_id}: {design.sequences[0]['sequence'][:30]}...")

Evaluate Existing Designs

If you already have designs (CIF + NPZ files from BoltzGen or other tools), ingest them into a designs YAML file first:

# Ingest designs from a BoltzGen output directory
uv run python scripts/ingest_from_boltzgen.py \
    results/ubiquitin_binder/boltzgen_test_binder \
    results/designs.yaml

# Or from a generic design directory with CIF/NPZ files
uv run python scripts/ingest_from_design_dir.py \
    test/data/design_dir_cif_npz/oqo-1 \
    results/designs.yaml
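
Before ingesting a generic design directory, it can help to see which structures actually have both files. The sketch below assumes each design is a CIF file and an NPZ file sharing the same file stem; that pairing rule is an assumption for illustration, not a documented behavior of `ingest_from_design_dir.py`, and `find_design_pairs` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def find_design_pairs(design_dir):
    """Return (cif, npz) path pairs that share a file stem, sorted by name."""
    root = Path(design_dir)
    npz_by_stem = {p.stem: p for p in root.glob("*.npz")}
    pairs = []
    for cif in sorted(root.glob("*.cif")):
        npz = npz_by_stem.get(cif.stem)
        if npz is not None:  # skip structures without a matching NPZ file
            pairs.append((cif, npz))
    return pairs


# Small demonstration on a throwaway directory: "b.cif" has no NPZ partner.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.cif", "a.npz", "b.cif"):
        (Path(tmp) / name).touch()
    demo_pairs = [(c.name, n.name) for c, n in find_design_pairs(tmp)]

print(demo_pairs)  # [('a.cif', 'a.npz')]
```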

Run Snakemake Evaluation Pipeline

# Setup output directory
mkdir -p results/evaluation

# Copy config
cp examples/config1.yaml results/evaluation/config.yaml

# Run pipeline (requires Apptainer)
snakemake --cores all \
    --use-apptainer \
    --apptainer-args="--nv --bind $(pwd)/resources:/resources" \
    --resources gpu=1 \
    --config workdir=results/evaluation

Structure Predictors

Three structure predictors are available through get_predictor, selected by name ('boltz', 'af3', or 'chai'):

from pathlib import Path
from binder_lab.predictors import get_predictor

# Boltz predictor
predictor = get_predictor('boltz', {
    'container_image': 'boltz:latest',
    'recycling_steps': 3,
}, Path('results/predictions'))

# AlphaFold3 predictor
predictor = get_predictor('af3', {
    'container_image': 'alphafold3:latest',
    'model_params': '/path/to/af3_params',
    'database_dir': '/path/to/databases',
}, Path('results/predictions'))

# Chai-1 predictor
predictor = get_predictor('chai', {
    'container_image': 'chai1:latest',
}, Path('results/predictions'))
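
A name-based factory such as `get_predictor` is commonly built on a small registry. The sketch below illustrates that pattern only; it is not binder-lab's implementation, and every name in it (`register`, `BasePredictor`, `get_predictor_sketch`, the stub classes) is hypothetical.

```python
from pathlib import Path

_PREDICTORS = {}  # maps a short name to a predictor class


def register(name):
    """Class decorator that records a predictor class under a short name."""
    def wrap(cls):
        _PREDICTORS[name] = cls
        return cls
    return wrap


class BasePredictor:
    """Stub base class: stores the config and output directory."""

    def __init__(self, config, output_dir):
        self.config = dict(config)
        self.output_dir = Path(output_dir)


@register("boltz")
class BoltzPredictor(BasePredictor):
    pass


@register("chai")
class ChaiPredictor(BasePredictor):
    pass


def get_predictor_sketch(name, config, output_dir):
    """Look up a registered predictor by name and instantiate it."""
    try:
        cls = _PREDICTORS[name]
    except KeyError:
        known = sorted(_PREDICTORS)
        raise ValueError(f"unknown predictor {name!r}; choose from {known}") from None
    return cls(config, output_dir)


predictor = get_predictor_sketch("boltz", {"recycling_steps": 3}, "results/predictions")
print(type(predictor).__name__, predictor.output_dir)
```

With this layout, adding another method means defining one class and decorating it; callers keep using the same factory function.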

Project Structure

binder-lab/
├── binder_lab/           # Main package
│   ├── design/          # Design methods (BoltzGen)
│   ├── predictors/      # Structure predictors (Boltz, AF3, Chai)
│   ├── metrics/         # Evaluation metrics
│   ├── tasks/           # Task orchestration
│   ├── cli/             # Command-line interface
│   └── utils/           # Utilities (containers)
├── examples/            # Example configs and scripts
├── scripts/             # Utility scripts
├── targets/             # Example target structures
│   ├── 1UBQ.pdb        # Ubiquitin (76 residues)
│   └── 6BJ9.cif        # Larger example
├── docker/              # Container definitions
├── resources/           # Model weights
└── Snakefile           # Workflow orchestration

Configuration Reference

BoltzGen Config Options

Option           Default           Description
---------------  ----------------  ------------------------------
container_image  boltzgen:latest   Docker image name
container_type   apptainer         docker or apptainer
protocol         protein-anything  Design protocol
num_designs      100               Number of designs to generate
budget           10                Final designs to keep
alpha            0.5               Quality-diversity tradeoff
filter_biased    True              Filter biased designs
cache_dir        ~/.cache          HuggingFace cache location
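
A partial config such as the one in Option 3 can be read as overrides on top of these defaults. The sketch below shows one way to merge the two and catch misspelled option names; the default values are copied from the table, while `resolve_config` and its rejection of unknown keys are assumptions for illustration, not binder-lab's documented behavior.

```python
# Defaults copied from the options table above.
BOLTZGEN_DEFAULTS = {
    "container_image": "boltzgen:latest",
    "container_type": "apptainer",
    "protocol": "protein-anything",
    "num_designs": 100,
    "budget": 10,
    "alpha": 0.5,
    "filter_biased": True,
    "cache_dir": "~/.cache",
}


def resolve_config(overrides=None):
    """Merge user overrides onto the defaults, rejecting unknown option names."""
    overrides = overrides or {}
    unknown = set(overrides) - set(BOLTZGEN_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown config options: {sorted(unknown)}")
    config = {**BOLTZGEN_DEFAULTS, **overrides}
    if config["container_type"] not in ("docker", "apptainer"):
        raise ValueError("container_type must be 'docker' or 'apptainer'")
    return config


config = resolve_config({"container_type": "docker", "num_designs": 10, "budget": 5})
print(config["num_designs"], config["budget"], config["alpha"])  # 10 5 0.5
```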

Design Spec Options

name: "my_design"
target:
  structure_path: "targets/1UBQ.pdb"
  chains: ["A"]
  residues: [1, 10, 20]  # Optional: specific residues
designed_component:
  type: "protein"        # or "peptide"
  id: "B"               # Chain ID (max 5 chars)
  length: [60, 80]      # [min, max] residues
  cyclic: false         # For peptides only
constraints:
  binding_site:
    residues: [10, 15, 20]
  secondary_structure: "HHHHHLLLLEEEE"
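
The constraints noted in the comments above (component type, chain ID length, length range, cyclic for peptides only) can be checked before a spec is handed to a designer. This is a minimal sketch over a plain dict with the same keys; `validate_design_spec` is a hypothetical helper, and whether `DesignSpec` performs equivalent checks is not stated here.

```python
def validate_design_spec(spec):
    """Return a list of problems found in a design-spec dict (empty if none)."""
    problems = []
    comp = spec.get("designed_component", {})

    if comp.get("type") not in ("protein", "peptide"):
        problems.append("designed_component.type must be 'protein' or 'peptide'")

    chain_id = str(comp.get("id", ""))
    if not 1 <= len(chain_id) <= 5:
        problems.append("designed_component.id must be 1-5 characters")

    length = comp.get("length")
    if (not isinstance(length, (list, tuple)) or len(length) != 2
            or not all(isinstance(n, int) for n in length)
            or not 0 < length[0] <= length[1]):
        problems.append("designed_component.length must be [min, max] with 0 < min <= max")

    if comp.get("cyclic") and comp.get("type") != "peptide":
        problems.append("cyclic is only meaningful for peptides")

    if not spec.get("target", {}).get("structure_path"):
        problems.append("target.structure_path is required")

    return problems


example = {
    "name": "my_design",
    "target": {"structure_path": "targets/1UBQ.pdb", "chains": ["A"]},
    "designed_component": {"type": "protein", "id": "B", "length": [60, 80]},
}
print(validate_design_spec(example))  # []
```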

Testing

# Run all tests
uv run pytest -sv test/

# Run specific test file
uv run pytest -sv test/test_predictors.py

# Run with coverage
uv run pytest --cov=binder_lab test/

Requirements

  • Python >= 3.8 (3.11 recommended)
  • Docker or Apptainer for containerized execution
  • GPU with ~6GB VRAM (recommended)
  • ~6GB of disk space for the model cache

Development

# Install dev dependencies
uv pip install -e ".[dev]"

# Format and lint code
uv run black binder_lab/
uv run ruff check binder_lab/

# Run tests
uv run pytest

License

MIT License

Acknowledgments

  • BoltzGen - Protein binder design pipeline
  • Boltz/AlphaFold3/Chai-1 - Structure prediction
  • ABCFold - Structure prediction integration
  • uv - Fast Python package management

About

This is all WIP and vibe-coded
