A Bittensor subnet for evaluating generalist robotics policies across diverse simulated environments.
This subnet incentivizes the development of generalist robotics policies: AI systems that can control robots across many different tasks and environments. Unlike narrow RL policies trained for a single task, a submitted policy must perform well across a diverse set of simulated robotics environments.
Only policies that generalize across ALL environments earn rewards, using ε-Pareto dominance scoring.
*(Demo video: kinitro-eval-anim.mov)*
Miners receive limited observations to prevent overfitting:
- Proprioceptive: End-effector XYZ position + gripper state (4 values)
- Visual: RGB camera images from corner cameras (84x84)
- Object positions are NOT exposed - miners must learn from visual input
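For illustration, a single evaluation step's inputs might look like this (the numeric values are made up; the camera keys match the policy interface shown later in this README):

```python
import numpy as np

# Proprioceptive state: end-effector XYZ position + gripper state (4 values)
observation = np.array([0.12, 0.55, 0.23, 1.0], dtype=np.float32)

# Visual input: 84x84 RGB images keyed by camera name
images = {
    "corner": np.zeros((84, 84, 3), dtype=np.uint8),
    "gripper": np.zeros((84, 84, 3), dtype=np.uint8),
}
```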
- Procedural task generation: Every evaluation uses fresh, procedurally-generated task instances
- Seed rotation: Seeds change each block - miners can't pre-compute solutions
- Domain randomization: Physics parameters and visual properties are randomized
- Sybil-proof: Copies tie under Pareto dominance, no benefit from multiple identities
- Copy-proof: Must improve on the leader to earn, not just match them
- Specialization-proof: Must dominate on ALL environments, not just one
- Deployment verification: Spot-checks verify Basilica deployments match HuggingFace uploads
Miners are scored using winners-take-all over environment subsets:
- For each subset of environments, find the miner that dominates
- Award points scaled by subset size (larger subsets = more points)
- Convert to weights via softmax
This rewards true generalists over specialists.
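A simplified sketch of this scheme follows; the ε value, per-subset point scaling, and softmax temperature are illustrative assumptions, not the subnet's exact parameters:

```python
from itertools import combinations
import numpy as np

def epsilon_dominates(a, b, eps=0.01):
    """Miner a ε-dominates miner b: at least as good everywhere (within eps)
    and strictly better somewhere (by more than eps)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return bool(np.all(a >= b - eps) and np.any(a > b + eps))

def subset_points(scores, eps=0.01):
    """scores: {miner_id: per-environment score vector}.
    For every non-empty subset of environments, award points (scaled by
    subset size) to the miner that ε-dominates all others on that subset."""
    miners = list(scores)
    n_envs = len(next(iter(scores.values())))
    points = {m: 0.0 for m in miners}
    for k in range(1, n_envs + 1):
        for subset in combinations(range(n_envs), k):
            for m in miners:
                rest = [o for o in miners if o != m]
                if all(epsilon_dominates([scores[m][e] for e in subset],
                                         [scores[o][e] for e in subset], eps)
                       for o in rest):
                    points[m] += k  # larger subsets are worth more points
                    break
    return points

def to_weights(points, temperature=1.0):
    """Convert points to normalized weights via softmax."""
    vals = np.array(list(points.values())) / temperature
    w = np.exp(vals - vals.max())
    return dict(zip(points, w / w.sum()))

# Example usage with made-up per-environment scores
weights = to_weights(subset_points({
    "miner_a": [0.9, 0.7, 0.8],
    "miner_b": [0.6, 0.8, 0.5],
    "miner_c": [0.5, 0.4, 0.6],
}))
```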
The subnet uses a split service architecture with miners deployed on Basilica:
```mermaid
flowchart TB
subgraph Chain["Bittensor Chain"]
Commitments[("Miner Commitments")]
Weights[("Validator Weights")]
end
subgraph API["API Service (kinitro api)"]
RestAPI["REST API"]
TaskPool["Task Pool Manager"]
DB[("PostgreSQL")]
end
subgraph Scheduler["Scheduler Service (kinitro scheduler)"]
TaskGen["Task Generator"]
Scoring["Pareto Scoring"]
end
subgraph Executor["Executor Service(s) (kinitro executor)"]
E1["Executor 1 (GPU)"]
E2["Executor 2 (GPU)"]
En["Executor N (GPU)"]
end
subgraph Validators["Validator(s) (kinitro validate)"]
V1["Validator 1"]
Vn["Validator N"]
end
subgraph Basilica["Basilica (Miner Policy Servers)"]
M1["Miner 1"]
Mn["Miner N"]
end
%% Chain interactions
Commitments -->|"Read miners"| Scheduler
V1 & Vn -->|"Submit weights"| Weights
%% Scheduler flow
TaskGen -->|"Create tasks"| DB
DB -->|"Read scores"| Scoring
Scoring -->|"Save weights"| DB
%% Executor flow
E1 & E2 & En -->|"Fetch tasks"| TaskPool
E1 & E2 & En -->|"Submit results"| TaskPool
TaskPool <-->|"Read/Write"| DB
%% Executor to Miners
E1 & E2 & En -->|"Get actions"| M1 & Mn
%% Validator flow
RestAPI -->|"GET /weights"| V1 & Vn
```
| Service | Command | Purpose | Scaling |
|---|---|---|---|
| API | `kinitro api` | REST API, task pool management | Horizontal (stateless) |
| Scheduler | `kinitro scheduler` | Task generation, scoring, weight computation | Single instance |
| Executor | `kinitro executor` | Run MuJoCo evaluations via Affinetes | Horizontal (GPU machines) |
| Validator | `kinitro validate` | Submit weights to chain | Per validator |
- Miners deploy policy servers to Basilica and commit their endpoint on-chain
- Scheduler reads miner commitments from chain to discover Basilica endpoints
- Scheduler creates evaluation tasks in PostgreSQL (task pool)
- Executor(s) fetch tasks from the API (`POST /v1/tasks/fetch`)
- Executor runs the MuJoCo simulation, calling miner endpoints for actions
- Executor submits results to the API (`POST /v1/tasks/submit`)
- Scheduler computes Pareto scores when a cycle completes and saves weights
- Validators poll `GET /v1/weights/latest` and submit weights to chain
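A rough sketch of the executor side of this loop (the endpoint paths are the ones listed in the API table below; the request and response fields are illustrative assumptions):

```python
import requests

API_URL = "http://localhost:8000"

def run_episode(task: dict) -> dict:
    """Placeholder: run the MuJoCo episode, querying the miner's Basilica
    endpoint (POST /reset once, then POST /act each step), and return a score."""
    raise NotImplementedError

def executor_loop():
    """Minimal fetch -> evaluate -> submit cycle for one executor."""
    while True:
        # Fetch pending evaluation tasks from the task pool
        tasks = requests.post(f"{API_URL}/v1/tasks/fetch", json={"max_tasks": 4}).json()
        for task in tasks:
            result = run_episode(task)
            # Report the episode outcome back to the task pool
            requests.post(f"{API_URL}/v1/tasks/submit", json={
                "task_id": task["task_id"],
                "score": result["score"],
            })
```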
```bash
# Clone and install
git clone https://github.com/threetau/kinitro.git
cd kinitro
# Install with uv (recommended)
uv sync
# Or with pip
pip install -e .
```

See the full Miner Guide for detailed instructions on how to train and deploy a policy.
```bash
# 1. Initialize a policy template
uv run kinitro miner init ./my-policy
cd my-policy
# 2. Implement your policy in policy.py
# 3. Test locally
uvicorn server:app --port 8001
# 4. Upload to HuggingFace
huggingface-cli upload your-username/kinitro-policy .
# 5. Deploy to Basilica
export BASILICA_API_TOKEN="your-api-token"
uv run kinitro miner push \
--repo your-username/kinitro-policy \
--revision YOUR_HF_COMMIT_SHA \
--gpu-count 1 --min-vram 16
# 6. Register on chain
uv run kinitro miner commit \
--repo your-username/kinitro-policy \
--revision YOUR_HF_COMMIT_SHA \
--endpoint YOUR_BASILICA_URL \
--netuid YOUR_NETUID \
--network finney
```

See the full Validator Guide for detailed instructions on setting up a validator (lightweight).
```bash
# Start the validator (submits weights to chain)
uv run kinitro validate \
--backend-url https://api.kinitro.ai \
--netuid 26 \
--network finney \
--wallet-name your-wallet \
--hotkey-name your-hotkey
```

See the full Backend Guide for instructions on how to run the evaluation backend (subnet operators only).
```bash
# 1. Start PostgreSQL
docker run -d --name kinitro-postgres \
-e POSTGRES_USER=kinitro -e POSTGRES_PASSWORD=secret -e POSTGRES_DB=kinitro \
-p 5432:5432 postgres:15
# 2. Build the evaluation environment images
uv run kinitro env build metaworld --tag kinitro/metaworld:v1
uv run kinitro env build procthor --tag kinitro/procthor:v1
# 3. Initialize database
uv run kinitro db init --database-url postgresql://kinitro:secret@localhost/kinitro
# 4. Start the services (split architecture)
# Terminal 1: API Service
uv run kinitro api --database-url postgresql://kinitro:secret@localhost/kinitro
# Terminal 2: Scheduler Service
uv run kinitro scheduler \
--netuid YOUR_NETUID \
--network finney \
--database-url postgresql://kinitro:secret@localhost/kinitro
# Terminal 3+: Executor(s) - can run multiple on different GPU machines
uv run kinitro executor --api-url http://localhost:8000
```

The API service exposes these endpoints:

| Endpoint | Description |
|---|---|
| `GET /health` | Health check |
| `GET /v1/status` | Current backend status |
| `GET /v1/weights/latest` | Latest computed weights |
| `GET /v1/weights/{block}` | Weights for a specific block |
| `GET /v1/scores/latest` | Latest evaluation scores |
| `GET /v1/scores/{cycle_id}` | Scores for a specific cycle |
| `GET /v1/miners` | List evaluated miners |
| `GET /v1/environments` | List environments |
| `POST /v1/tasks/fetch` | Fetch tasks (for executors) |
| `POST /v1/tasks/submit` | Submit results (for executors) |
| `GET /v1/tasks/stats` | Task pool statistics |
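For example, the latest weights can be fetched with a plain HTTP GET; the response shape shown in the comment is an assumption:

```python
import requests

# https://api.kinitro.ai is the backend URL used in the validator quickstart above
resp = requests.get("https://api.kinitro.ai/v1/weights/latest")
resp.raise_for_status()
# Illustrative response shape; actual fields may differ:
# {"block": 123456, "weights": {"<miner_hotkey>": 0.42, ...}}
print(resp.json())
```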
MuJoCo-based robot arm manipulation tasks:
- `metaworld/reach-v3` - Move end-effector to target position
- `metaworld/push-v3` - Push object to goal location
- `metaworld/pick-place-v3` - Pick up object and place at target
- `metaworld/door-open-v3` - Open a door
- `metaworld/drawer-open-v3` - Open a drawer
- `metaworld/drawer-close-v3` - Close a drawer
- `metaworld/button-press-v3` - Press a button from top-down
- `metaworld/peg-insert-v3` - Insert peg into hole
AI2-THOR procedural house environments for embodied AI tasks:
- `procthor/v0` - Procedural house tasks (pickup, place, open, close, toggle)
Use `kinitro env list` to see all available environments.
Miners deploy a FastAPI server with these endpoints:
```python
import numpy as np

# POST /reset - Reset for new episode
async def reset(task_config: dict) -> str:
    """Called at start of each episode. Returns episode_id."""
    pass

# POST /act - Get action for observation
async def act(observation: np.ndarray, images: dict | None) -> np.ndarray:
    """
    Return action for observation. Must respond within 500ms.

    Args:
        observation: Proprioceptive state [ee_x, ee_y, ee_z, gripper_state]
        images: Optional camera images {"corner": (84,84,3), "gripper": (84,84,3)}

    Returns:
        Action as numpy array in [-1, 1] range
    """
    return action
```

See the Miner Guide and `kinitro/miner/template/` for complete examples.
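A minimal policy server sketch built on this interface is shown below. It assumes JSON request bodies with `task_config`, `observation`, and `images` fields and a 4-dimensional action space; the real payload schema comes from the miner template, so treat this only as an illustration:

```python
import uuid
import numpy as np
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ResetRequest(BaseModel):
    task_config: dict

class ActRequest(BaseModel):
    observation: list[float]      # [ee_x, ee_y, ee_z, gripper_state]
    images: dict | None = None    # encoded camera images, if provided

@app.post("/reset")
async def reset(req: ResetRequest) -> dict:
    # Start a new episode; a real policy would reset its internal state here.
    return {"episode_id": str(uuid.uuid4())}

@app.post("/act")
async def act(req: ActRequest) -> dict:
    # A learned policy would condition on the observation (and images) here.
    obs = np.asarray(req.observation, dtype=np.float32)
    action = np.random.uniform(-1.0, 1.0, size=4)  # placeholder: random actions in [-1, 1]
    return {"action": action.tolist()}
```

Run it locally with `uvicorn server:app --port 8001`, as in the miner quickstart above.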
```bash
# Install with dev dependencies
pip install -e ".[dev]"
# Run tests
pytest tests/
# Run tests with MuJoCo
MUJOCO_GL=egl pytest tests/
# Type checking
mypy kinitro/
# Linting
ruff check kinitro/
```

Use environment variables or a `.env` file. See `.env.example` for configuration options.
MIT