feat: add ManiSkill3 environment integration by Shr1ftyy · Pull Request #29 · threetau/kinitro

Shr1ftyy · 2026-02-05T02:14:04Z

Summary

Add ManiSkill3 as a new environment family for GPU-accelerated dexterous manipulation tasks using SAPIEN physics engine.

New Environments

maniskill3/pick-cube-v1 - Pick up a cube
maniskill3/stack-cube-v1 - Stack one cube on another
maniskill3/peg-insertion-v1 - Insert peg into hole
maniskill3/push-cube-v1 - Push cube to target
maniskill3/pull-cube-v1 - Pull cube to target
maniskill3/lift-cube-v1 - Lift cube to height
maniskill3/turn-faucet-v1 - Turn a faucet handle
maniskill3/open-cabinet-door-v1 - Open a cabinet door

Changes

Add ManiSkill3Environment class implementing RoboticsEnvironment with canonical observation/action mapping
Add Docker container (environments/maniskill3/) with SAPIEN/Vulkan dependencies
Register all 8 environments in registry
Add maniskill3 to CLI available families

Implementation Details

Extracts TCP pose from ManiSkill3's state_dict observation format
Converts quaternion from ManiSkill3's wxyz to canonical xyzw format
Handles torch tensor returns (batched data with shape [1, ...])
Supports both GPU (Vulkan) and CPU rendering fallback

Note on Dependencies

ManiSkill3 dependencies are container-only due to gymnasium version conflict:

mani-skill requires gymnasium==0.29.1
kinitro requires gymnasium>=1.1

The container has its own requirements.txt with compatible versions.

Test Plan

Environment initialization and property access
Task generation with procedural randomization
Reset with correct TCP pose extraction
Step execution with action conversion
Camera observation capture (256x256 RGB)
Multiple environments tested (pick-cube, push-cube, stack-cube)
Registry integration verified

Summary by CodeRabbit

New Features
- Added ManiSkill3 robotic manipulation environment support with 8 manipulation tasks (pick, stack, peg-insertion, push, pull, lift, turn-faucet, open-cabinet)
- Introduced multi-worktree development infrastructure with isolated configurations and automatic port management
- Made Docker Compose service ports configurable via environment variables
Documentation
- Added End-to-End testing guide for multi-worktree development setup

Add scripts and documentation for running multiple worktrees in parallel without service collisions. - scripts/worktree-env.sh: generates .env and docker-compose.override.yml with deterministic port offsets based on worktree name - scripts/start-services.sh: manages service lifecycle per worktree - docs/e2e-testing-guide.md: comprehensive E2E testing documentation - .gitignore: exclude docker-compose.override.yml (generated file)

- docker-compose.yml: Use env variables for ports to avoid merge conflicts with override files - worktree-env.sh: Simplify override file to only set container names (ports now come from .env) - start-services.sh: Fix pkill patterns to reliably stop processes

- AGENTS.md: Fix $DB -> $POSTGRES_DB in troubleshooting command - worktree-env.sh: Add macOS compatibility (md5 fallback for md5sum) - start-services.sh: Use bash array for executor command (safer quoting) - start-services.sh: Implement documented --api-only/--scheduler/--executor flags - start-services.sh: Validate WORKTREE_NAME after sourcing .env - start-services.sh: Use _ for unused loop variable

Add ManiSkill3 as a new environment family for GPU-accelerated dexterous manipulation tasks using SAPIEN physics. New environments: - maniskill3/pick-cube-v1 - maniskill3/stack-cube-v1 - maniskill3/peg-insertion-v1 - maniskill3/push-cube-v1 - maniskill3/pull-cube-v1 - maniskill3/lift-cube-v1 - maniskill3/turn-faucet-v1 - maniskill3/open-cabinet-door-v1 Changes: - Add ManiSkill3Environment class with canonical observation/action mapping - Add Docker container with SAPIEN/Vulkan dependencies - Register environments in registry - Add maniskill3 to CLI available families Note: ManiSkill3 dependencies are container-only due to gymnasium version conflict (mani-skill requires 0.29.1, kinitro requires >=1.1)

coderabbitai · 2026-02-05T02:14:23Z

📝 Walkthrough

Walkthrough

Adds a new ManiSkill3 robotics evaluation environment with eight registered tasks, implements per-worktree development isolation via dynamic port configuration and service lifecycle scripts, and updates docker-compose to use environment-driven port bindings while removing legacy mani-skill2 dependencies.

Changes

Cohort / File(s)	Summary
ManiSkill3 Environment Implementation `environments/maniskill3/*`, `kinitro/environments/maniskill3_env.py`	Introduces new ManiSkill3 environment family with Dockerfile, Python Actor for Affinetes integration, dependencies manifest, metadata, and kinitro environment wrapper class supporting task generation, observation canonicalization, action translation, and image capture.
Environment Registry and CLI `kinitro/cli/env/commands.py`, `kinitro/environments/registry.py`	Registers eight new ManiSkill3 task variants (pick-cube, stack-cube, peg-insertion, push-cube, pull-cube, lift-cube, turn-faucet, open-cabinet-door) and adds "maniskill3" to supported environment families.
Per-Worktree Development Infrastructure `scripts/worktree-env.sh`, `scripts/start-services.sh`, `.gitignore`	Introduces deterministic port offset generation per worktree, service lifecycle management (start/stop/status/logs), and gitignore entry for per-worktree docker-compose overrides.
Docker and Configuration `docker-compose.yml`, `pyproject.toml`	Replaces static host ports with environment variable bindings across all services (Postgres, API, Prometheus, Grafana) and removes legacy mani-skill2 optional dependencies with note on maniskill3 containerization.
Documentation `AGENTS.md`	Adds End-to-End Testing sections detailing multi-worktree development, quick start, environment image building, logs, and troubleshooting across two locations.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Add worktree-isolated development tooling #28: Overlaps on worktree-isolation tooling (.gitignore, docker-compose env vars, worktree-env.sh and service-start scripts).
refactor: split eval-env into separate metaworld and procthor environments #7: Follows the same environment family pattern with env.py Actor, Dockerfile, metadata.json, and registry integration for new task environments.

Suggested reviewers

rayokamoto

Poem

🐰 A new ManiSkill to master,
Per-worktree ports move faster,
Robots twist and gripper squeeze,
While services start with ease.
Infrastructure hops along,
Making multi-dev less wrong! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 73.68% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The pull request title accurately summarizes the main change: adding ManiSkill3 as a new environment integration. The title is concise, clear, and directly reflects the primary objective of the changeset.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feat/maniskill-env

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

🤖 Fix all issues with AI agents

In `@environments/maniskill3/Dockerfile`:
- Around line 60-65: The Dockerfile currently forces software rendering via the
ENV LIBGL_ALWAYS_SOFTWARE=1 which conflicts with the GPU-accelerated goal;
remove that hardcoded ENV or make it configurable (e.g., replace the fixed ENV
with a build ARG / optional ENV so the value can be left unset or set to 0 in
GPU builds), keep DISPLAY, MESA_GL_VERSION_OVERRIDE and VK_ICD_FILENAMES as-is
to support GL/Vulkan, and ensure any CI or runtime docs are updated to instruct
how to set LIBGL_ALWAYS_SOFTWARE when a software fallback is explicitly desired.

In `@environments/maniskill3/env.py`:
- Around line 152-153: Replace the direct raise of ValueError for the missing
base_url with the same structured error return used elsewhere: call
_build_error_result with an appropriate error message and status (matching how
invalid env_id is handled) instead of raising; update the base_url validation
branch (the check that currently does "if base_url is None") to return
_build_error_result("base_url (miner endpoint) is required", ...) so callers
receive a consistent error result format.

In `@kinitro/environments/maniskill3_env.py`:
- Around line 70-74: The docstring for camera_names is wrong — it claims the
default is ['base_camera', 'hand_camera'] while the code uses self.CAMERA_NAMES
(which is ['base_camera']). Update the docstring in the Maniskill3Env/__init__
(or surrounding class docstring) to either state the actual default
(['base_camera']) or state that it defaults to self.CAMERA_NAMES, and ensure the
symbol camera_names and self.CAMERA_NAMES are referenced so readers know the
source of the default.
- Around line 249-263: _extract_ee_velocity currently treats joint velocities as
Cartesian EE velocity by scaling qvel slices, which is incorrect; change it to
compute EE linear (and angular if available) velocity from the position-delta
used in _build_observation instead: read the current EE position from obs (e.g.,
obs["ee_pos"] or agent obs field used in _build_observation), subtract a stored
previous EE position (e.g., self._prev_ee_pos or obs["prev_ee_pos"]) and divide
by the environment timestep (e.g., self.control_timestep or self.dt) to produce
ee_lin_vel (and similarly for orientation/ee_ang_vel if previous orientation is
available), default to zeros if previous data or dt is missing, and remove the
qvel[:3]/qvel[3:6] 0.1 scaling; if you prefer higher fidelity later, replace
this with a Jacobian-based conversion when ManiSkill3 exposes the Jacobian API.

🧹 Nitpick comments (7)

environments/maniskill3/env.py (1)

43-43: Logger instantiated but never used.

The structlog logger is created but no logging statements exist in the file. Key events like evaluation start/completion, miner call failures, and timeouts would benefit from structured logging for observability and debugging.
kinitro/cli/env/commands.py (1)
57-67: Consider updating docstring examples to include maniskill3.

The docstring's "Environment families" section and examples only mention metaworld and procthor. Adding maniskill3 would keep documentation consistent with the updated AVAILABLE_ENV_FAMILIES constant.
📝 Proposed docstring update
     Environment families:
       - metaworld: MuJoCo-based manipulation tasks (~400MB image)
       - procthor: AI2-THOR procedural house tasks (~1.5GB image, x86_64 Linux only)
+      - maniskill3: SAPIEN/ManiSkill3 dexterous manipulation (~1GB image, requires Vulkan)

     Examples:
         # Build MetaWorld environment
         kinitro env build metaworld --tag kinitro/metaworld:v1

         # Build ProcTHOR environment
         kinitro env build procthor --tag kinitro/procthor:v1

+        # Build ManiSkill3 environment
+        kinitro env build maniskill3 --tag kinitro/maniskill3:v1
+
         # Build and push to registry
         kinitro env build metaworld --push --registry docker.io/myuser
scripts/worktree-env.sh (1)

41-44: Port collision risk with common services.

The calculated ports (base + 0-999 offset) could collide with other common services:

PostgreSQL: 5432-6431 (overlaps with X11 6000+)

API: 8000-8999 (common dev server range)

Prometheus: 9090-10089 (various monitoring tools)

Grafana: 3000-3999 (common dev tool range)

Consider using a higher base offset or validating port availability before generating the config.
scripts/start-services.sh (1)
71-80: Consider making --no-auth configurable.

The --no-auth flag is hardcoded, which disables API authentication. While acceptable for local development, consider making this configurable via an environment variable for scenarios where auth testing is needed:
Suggested improvement
 start_api() {
     echo "Starting API on port $API_PORT..."
+    local auth_flag=""
+    if [[ "${DISABLE_AUTH:-true}" == "true" ]]; then
+        auth_flag="--no-auth"
+    fi
     nohup uv run kinitro api \
         --database-url "$DATABASE_URL" \
         --port "$API_PORT" \
-        --no-auth \
+        $auth_flag \
         --log-level INFO \
         &> "$LOG_DIR/api.log" &
     echo "API started (PID: $!)"
 }
kinitro/environments/maniskill3_env.py (3)
393-418: Consider extracting tensor-to-scalar conversion helper.

The same tensor conversion pattern is repeated four times for reward, terminated, truncated, and success. A helper would reduce duplication:
Optional refactor
def _to_scalar(self, value: Any, as_bool: bool = False) -> float | bool:
    """Convert ManiSkill3 batched tensor/array to Python scalar."""
    if hasattr(value, "item"):
        result = value.item()
    elif hasattr(value, "cpu"):
        result = value.cpu().numpy().flat[0]
    else:
        result = value
    return bool(result) if as_bool else float(result)
Then usage:
reward = self._to_scalar(reward)
terminated = self._to_scalar(terminated, as_bool=True)
truncated = self._to_scalar(truncated, as_bool=True)
success = self._to_scalar(info.get("success", False), as_bool=True)
313-323: Image serialization may be inefficient.

Converting camera images to nested Python lists via img.tolist() (line 319) for 128×128×3 images creates ~49K element lists per camera. If performance becomes an issue, consider using base64-encoded bytes or keeping numpy arrays until serialization is needed.

238-246: Clarify quaternion format comment.

The comment "ManiSkill3 uses [x, y, z, qw, qx, qy, qz]" describes the full tcp_pose array (position + quaternion), which could be confusing. Consider clarifying:
-                # ManiSkill3 uses [x, y, z, qw, qx, qy, qz], convert to xyzw
+                # tcp_pose format: [pos_x, pos_y, pos_z, qw, qx, qy, qz]
+                # Convert quaternion from wxyz to xyzw order

coderabbitai · 2026-02-05T02:29:44Z

environments/maniskill3/Dockerfile

+ENV DISPLAY=:0
+ENV LIBGL_ALWAYS_SOFTWARE=1
+ENV MESA_GL_VERSION_OVERRIDE=3.3
+
+# For headless Vulkan rendering
+ENV VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/lvp_icd.x86_64.json


⚠️ Potential issue | 🟠 Major

Software rendering enabled contradicts GPU acceleration goal.

LIBGL_ALWAYS_SOFTWARE=1 forces Mesa to use software rendering, which contradicts the PR's goal of "GPU-accelerated dexterous manipulation." This will significantly degrade rendering performance.

Consider making this configurable or removing it to allow GPU rendering when available:

Proposed fix

# Set rendering environment variables # SAPIEN can use Vulkan (preferred) or EGL ENV DISPLAY=:0 -ENV LIBGL_ALWAYS_SOFTWARE=1 +# Note: Set LIBGL_ALWAYS_SOFTWARE=1 at runtime for CPU-only fallback ENV MESA_GL_VERSION_OVERRIDE=3.3

🤖 Prompt for AI Agents

In `@environments/maniskill3/Dockerfile` around lines 60 - 65, The Dockerfile currently forces software rendering via the ENV LIBGL_ALWAYS_SOFTWARE=1 which conflicts with the GPU-accelerated goal; remove that hardcoded ENV or make it configurable (e.g., replace the fixed ENV with a build ARG / optional ENV so the value can be left unset or set to 0 in GPU builds), keep DISPLAY, MESA_GL_VERSION_OVERRIDE and VK_ICD_FILENAMES as-is to support GL/Vulkan, and ensure any CI or runtime docs are updated to instruct how to set LIBGL_ALWAYS_SOFTWARE when a software fallback is explicitly desired.

coderabbitai · 2026-02-05T02:29:44Z

environments/maniskill3/env.py

+        if base_url is None:
+            raise ValueError("base_url (miner endpoint) is required")


⚠️ Potential issue | 🟡 Minor

Inconsistent error handling: ValueError raised instead of returning error result.

Line 153 raises ValueError directly while all other validation errors (like invalid env_id on lines 142-150) return a structured error result via _build_error_result. This inconsistency means callers receive different error types depending on which validation fails.

🛠️ Proposed fix for consistent error handling

if base_url is None: - raise ValueError("base_url (miner endpoint) is required") + return self._build_error_result( + env_id=env_id, + task_id=task_id, + seed=seed or task_id, + start_time=time.time(), + error="base_url (miner endpoint) is required", + )

🤖 Prompt for AI Agents

In `@environments/maniskill3/env.py` around lines 152 - 153, Replace the direct raise of ValueError for the missing base_url with the same structured error return used elsewhere: call _build_error_result with an appropriate error message and status (matching how invalid env_id is handled) instead of raising; update the base_url validation branch (the check that currently does "if base_url is None") to return _build_error_result("base_url (miner endpoint) is required", ...) so callers receive a consistent error result format.

coderabbitai · 2026-02-05T02:29:44Z

kinitro/environments/maniskill3_env.py

+        Args:
+            task_name: Name of the task (e.g., 'pick-cube-v1')
+            use_camera: Whether to include camera observations
+            camera_names: Which cameras to use (default: ['base_camera', 'hand_camera'])
+            image_size: (width, height) for camera images


⚠️ Potential issue | 🟡 Minor

Docstring inconsistent with default value.

The docstring says camera_names defaults to ['base_camera', 'hand_camera'], but the actual default is self.CAMERA_NAMES which only contains ['base_camera'].

Fix docstring

- camera_names: Which cameras to use (default: ['base_camera', 'hand_camera']) + camera_names: Which cameras to use (default: ['base_camera'])

🤖 Prompt for AI Agents

In `@kinitro/environments/maniskill3_env.py` around lines 70 - 74, The docstring for camera_names is wrong — it claims the default is ['base_camera', 'hand_camera'] while the code uses self.CAMERA_NAMES (which is ['base_camera']). Update the docstring in the Maniskill3Env/__init__ (or surrounding class docstring) to either state the actual default (['base_camera']) or state that it defaults to self.CAMERA_NAMES, and ensure the symbol camera_names and self.CAMERA_NAMES are referenced so readers know the source of the default.

coderabbitai · 2026-02-05T02:29:44Z

kinitro/environments/maniskill3_env.py

+    def _extract_ee_velocity(self, obs: dict[str, Any]) -> tuple[np.ndarray, np.ndarray]:
+        """Extract end-effector velocity from observation."""
+        ee_lin_vel = np.zeros(3, dtype=np.float32)
+        ee_ang_vel = np.zeros(3, dtype=np.float32)
+
+        agent_obs = obs.get("agent", {})
+        if isinstance(agent_obs, dict) and "qvel" in agent_obs:
+            qvel = self._to_numpy(agent_obs["qvel"])
+            # Approximate ee velocity from joint velocities
+            # This is a simplification - proper FK would be needed for accuracy
+            if qvel.size >= 6:
+                ee_lin_vel = qvel[:3] * 0.1  # Scale factor approximation
+                ee_ang_vel = qvel[3:6] * 0.1
+
+        return ee_lin_vel, ee_ang_vel


⚠️ Potential issue | 🟡 Minor

Velocity approximation is physically incorrect.

Joint velocities (qvel[:3], qvel[3:6]) are in joint space, not Cartesian end-effector space. Scaling them by 0.1 doesn't produce meaningful EE velocity. The fallback position-delta computation in _build_observation (lines 305-307) is more accurate.

Consider either:

Always using position-delta for velocity (simpler, consistent)

Computing proper Jacobian-based velocity if ManiSkill3 exposes it

Option 1: Always use position delta

def _extract_ee_velocity(self, obs: dict[str, Any]) -> tuple[np.ndarray, np.ndarray]: """Extract end-effector velocity from observation.""" - ee_lin_vel = np.zeros(3, dtype=np.float32) - ee_ang_vel = np.zeros(3, dtype=np.float32) - - agent_obs = obs.get("agent", {}) - if isinstance(agent_obs, dict) and "qvel" in agent_obs: - qvel = self._to_numpy(agent_obs["qvel"]) - # Approximate ee velocity from joint velocities - # This is a simplification - proper FK would be needed for accuracy - if qvel.size >= 6: - ee_lin_vel = qvel[:3] * 0.1 # Scale factor approximation - ee_ang_vel = qvel[3:6] * 0.1 - - return ee_lin_vel, ee_ang_vel + # Return zeros; velocity will be computed from position delta in _build_observation + return np.zeros(3, dtype=np.float32), np.zeros(3, dtype=np.float32)

🤖 Prompt for AI Agents

In `@kinitro/environments/maniskill3_env.py` around lines 249 - 263, _extract_ee_velocity currently treats joint velocities as Cartesian EE velocity by scaling qvel slices, which is incorrect; change it to compute EE linear (and angular if available) velocity from the position-delta used in _build_observation instead: read the current EE position from obs (e.g., obs["ee_pos"] or agent obs field used in _build_observation), subtract a stored previous EE position (e.g., self._prev_ee_pos or obs["prev_ee_pos"]) and divide by the environment timestep (e.g., self.control_timestep or self.dt) to produce ee_lin_vel (and similarly for orientation/ee_ang_vel if previous orientation is available), default to zeros if previous data or dt is missing, and remove the qvel[:3]/qvel[3:6] 0.1 scaling; if you prefer higher fidelity later, replace this with a Jacobian-based conversion when ManiSkill3 exposes the Jacobian API.

Shr1ftyy · 2026-02-05T08:47:50Z

shelving for now. prioritising #30 and already existing envs.

Shr1ftyy added 6 commits February 4, 2026 16:30

docs: add E2E testing section to AGENTS.md

936e8f1

docs: consolidate E2E testing info into AGENTS.md

d39ea1d

coderabbitai bot reviewed Feb 5, 2026

View reviewed changes

Shr1ftyy closed this Feb 5, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add ManiSkill3 environment integration#29

feat: add ManiSkill3 environment integration#29
Shr1ftyy wants to merge 6 commits intomainfrom
feat/maniskill-env

Shr1ftyy commented Feb 5, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Feb 5, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Feb 5, 2026

Uh oh!

coderabbitai bot Feb 5, 2026

Uh oh!

coderabbitai bot Feb 5, 2026

Uh oh!

coderabbitai bot Feb 5, 2026

Uh oh!

Shr1ftyy commented Feb 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		if base_url is None:
		raise ValueError("base_url (miner endpoint) is required")

Conversation

Shr1ftyy commented Feb 5, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

New Environments

Changes

Implementation Details

Note on Dependencies

Test Plan

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Feb 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 5, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 5, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 5, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 5, 2026

Choose a reason for hiding this comment

Uh oh!

Shr1ftyy commented Feb 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Shr1ftyy commented Feb 5, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 5, 2026 •

edited

Loading