Conversation
Add scripts and documentation for running multiple worktrees in parallel without service collisions. - scripts/worktree-env.sh: generates .env and docker-compose.override.yml with deterministic port offsets based on worktree name - scripts/start-services.sh: manages service lifecycle per worktree - docs/e2e-testing-guide.md: comprehensive E2E testing documentation - .gitignore: exclude docker-compose.override.yml (generated file)
- docker-compose.yml: Use env variables for ports to avoid merge conflicts with override files - worktree-env.sh: Simplify override file to only set container names (ports now come from .env) - start-services.sh: Fix pkill patterns to reliably stop processes
- AGENTS.md: Fix $DB -> $POSTGRES_DB in troubleshooting command - worktree-env.sh: Add macOS compatibility (md5 fallback for md5sum) - start-services.sh: Use bash array for executor command (safer quoting) - start-services.sh: Implement documented --api-only/--scheduler/--executor flags - start-services.sh: Validate WORKTREE_NAME after sourcing .env - start-services.sh: Use _ for unused loop variable
Add ManiSkill3 as a new environment family for GPU-accelerated dexterous manipulation tasks using SAPIEN physics. New environments: - maniskill3/pick-cube-v1 - maniskill3/stack-cube-v1 - maniskill3/peg-insertion-v1 - maniskill3/push-cube-v1 - maniskill3/pull-cube-v1 - maniskill3/lift-cube-v1 - maniskill3/turn-faucet-v1 - maniskill3/open-cabinet-door-v1 Changes: - Add ManiSkill3Environment class with canonical observation/action mapping - Add Docker container with SAPIEN/Vulkan dependencies - Register environments in registry - Add maniskill3 to CLI available families Note: ManiSkill3 dependencies are container-only due to gymnasium version conflict (mani-skill requires 0.29.1, kinitro requires >=1.1)
📝 WalkthroughWalkthroughAdds a new ManiSkill3 robotics evaluation environment with eight registered tasks, implements per-worktree development isolation via dynamic port configuration and service lifecycle scripts, and updates docker-compose to use environment-driven port bindings while removing legacy mani-skill2 dependencies. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@environments/maniskill3/Dockerfile`:
- Around line 60-65: The Dockerfile currently forces software rendering via the
ENV LIBGL_ALWAYS_SOFTWARE=1 which conflicts with the GPU-accelerated goal;
remove that hardcoded ENV or make it configurable (e.g., replace the fixed ENV
with a build ARG / optional ENV so the value can be left unset or set to 0 in
GPU builds), keep DISPLAY, MESA_GL_VERSION_OVERRIDE and VK_ICD_FILENAMES as-is
to support GL/Vulkan, and ensure any CI or runtime docs are updated to instruct
how to set LIBGL_ALWAYS_SOFTWARE when a software fallback is explicitly desired.
In `@environments/maniskill3/env.py`:
- Around line 152-153: Replace the direct raise of ValueError for the missing
base_url with the same structured error return used elsewhere: call
_build_error_result with an appropriate error message and status (matching how
invalid env_id is handled) instead of raising; update the base_url validation
branch (the check that currently does "if base_url is None") to return
_build_error_result("base_url (miner endpoint) is required", ...) so callers
receive a consistent error result format.
In `@kinitro/environments/maniskill3_env.py`:
- Around line 70-74: The docstring for camera_names is wrong — it claims the
default is ['base_camera', 'hand_camera'] while the code uses self.CAMERA_NAMES
(which is ['base_camera']). Update the docstring in the Maniskill3Env/__init__
(or surrounding class docstring) to either state the actual default
(['base_camera']) or state that it defaults to self.CAMERA_NAMES, and ensure the
symbol camera_names and self.CAMERA_NAMES are referenced so readers know the
source of the default.
- Around line 249-263: _extract_ee_velocity currently treats joint velocities as
Cartesian EE velocity by scaling qvel slices, which is incorrect; change it to
compute EE linear (and angular if available) velocity from the position-delta
used in _build_observation instead: read the current EE position from obs (e.g.,
obs["ee_pos"] or agent obs field used in _build_observation), subtract a stored
previous EE position (e.g., self._prev_ee_pos or obs["prev_ee_pos"]) and divide
by the environment timestep (e.g., self.control_timestep or self.dt) to produce
ee_lin_vel (and similarly for orientation/ee_ang_vel if previous orientation is
available), default to zeros if previous data or dt is missing, and remove the
qvel[:3]/qvel[3:6] 0.1 scaling; if you prefer higher fidelity later, replace
this with a Jacobian-based conversion when ManiSkill3 exposes the Jacobian API.
🧹 Nitpick comments (7)
environments/maniskill3/env.py (1)
43-43: Logger instantiated but never used.The
structloglogger is created but no logging statements exist in the file. Key events like evaluation start/completion, miner call failures, and timeouts would benefit from structured logging for observability and debugging.kinitro/cli/env/commands.py (1)
57-67: Consider updating docstring examples to include maniskill3.The docstring's "Environment families" section and examples only mention
metaworldandprocthor. Addingmaniskill3would keep documentation consistent with the updatedAVAILABLE_ENV_FAMILIESconstant.📝 Proposed docstring update
Environment families: - metaworld: MuJoCo-based manipulation tasks (~400MB image) - procthor: AI2-THOR procedural house tasks (~1.5GB image, x86_64 Linux only) + - maniskill3: SAPIEN/ManiSkill3 dexterous manipulation (~1GB image, requires Vulkan) Examples: # Build MetaWorld environment kinitro env build metaworld --tag kinitro/metaworld:v1 # Build ProcTHOR environment kinitro env build procthor --tag kinitro/procthor:v1 + # Build ManiSkill3 environment + kinitro env build maniskill3 --tag kinitro/maniskill3:v1 + # Build and push to registry kinitro env build metaworld --push --registry docker.io/myuserscripts/worktree-env.sh (1)
41-44: Port collision risk with common services.The calculated ports (base + 0-999 offset) could collide with other common services:
- PostgreSQL: 5432-6431 (overlaps with X11 6000+)
- API: 8000-8999 (common dev server range)
- Prometheus: 9090-10089 (various monitoring tools)
- Grafana: 3000-3999 (common dev tool range)
Consider using a higher base offset or validating port availability before generating the config.
scripts/start-services.sh (1)
71-80: Consider making--no-authconfigurable.The
--no-authflag is hardcoded, which disables API authentication. While acceptable for local development, consider making this configurable via an environment variable for scenarios where auth testing is needed:Suggested improvement
start_api() { echo "Starting API on port $API_PORT..." + local auth_flag="" + if [[ "${DISABLE_AUTH:-true}" == "true" ]]; then + auth_flag="--no-auth" + fi nohup uv run kinitro api \ --database-url "$DATABASE_URL" \ --port "$API_PORT" \ - --no-auth \ + $auth_flag \ --log-level INFO \ &> "$LOG_DIR/api.log" & echo "API started (PID: $!)" }kinitro/environments/maniskill3_env.py (3)
393-418: Consider extracting tensor-to-scalar conversion helper.The same tensor conversion pattern is repeated four times for
reward,terminated,truncated, andsuccess. A helper would reduce duplication:Optional refactor
def _to_scalar(self, value: Any, as_bool: bool = False) -> float | bool: """Convert ManiSkill3 batched tensor/array to Python scalar.""" if hasattr(value, "item"): result = value.item() elif hasattr(value, "cpu"): result = value.cpu().numpy().flat[0] else: result = value return bool(result) if as_bool else float(result)Then usage:
reward = self._to_scalar(reward) terminated = self._to_scalar(terminated, as_bool=True) truncated = self._to_scalar(truncated, as_bool=True) success = self._to_scalar(info.get("success", False), as_bool=True)
313-323: Image serialization may be inefficient.Converting camera images to nested Python lists via
img.tolist()(line 319) for 128×128×3 images creates ~49K element lists per camera. If performance becomes an issue, consider using base64-encoded bytes or keeping numpy arrays until serialization is needed.
238-246: Clarify quaternion format comment.The comment "ManiSkill3 uses [x, y, z, qw, qx, qy, qz]" describes the full
tcp_posearray (position + quaternion), which could be confusing. Consider clarifying:- # ManiSkill3 uses [x, y, z, qw, qx, qy, qz], convert to xyzw + # tcp_pose format: [pos_x, pos_y, pos_z, qw, qx, qy, qz] + # Convert quaternion from wxyz to xyzw order
| ENV DISPLAY=:0 | ||
| ENV LIBGL_ALWAYS_SOFTWARE=1 | ||
| ENV MESA_GL_VERSION_OVERRIDE=3.3 | ||
|
|
||
| # For headless Vulkan rendering | ||
| ENV VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/lvp_icd.x86_64.json |
There was a problem hiding this comment.
Software rendering enabled contradicts GPU acceleration goal.
LIBGL_ALWAYS_SOFTWARE=1 forces Mesa to use software rendering, which contradicts the PR's goal of "GPU-accelerated dexterous manipulation." This will significantly degrade rendering performance.
Consider making this configurable or removing it to allow GPU rendering when available:
Proposed fix
# Set rendering environment variables
# SAPIEN can use Vulkan (preferred) or EGL
ENV DISPLAY=:0
-ENV LIBGL_ALWAYS_SOFTWARE=1
+# Note: Set LIBGL_ALWAYS_SOFTWARE=1 at runtime for CPU-only fallback
ENV MESA_GL_VERSION_OVERRIDE=3.3🤖 Prompt for AI Agents
In `@environments/maniskill3/Dockerfile` around lines 60 - 65, The Dockerfile
currently forces software rendering via the ENV LIBGL_ALWAYS_SOFTWARE=1 which
conflicts with the GPU-accelerated goal; remove that hardcoded ENV or make it
configurable (e.g., replace the fixed ENV with a build ARG / optional ENV so the
value can be left unset or set to 0 in GPU builds), keep DISPLAY,
MESA_GL_VERSION_OVERRIDE and VK_ICD_FILENAMES as-is to support GL/Vulkan, and
ensure any CI or runtime docs are updated to instruct how to set
LIBGL_ALWAYS_SOFTWARE when a software fallback is explicitly desired.
| if base_url is None: | ||
| raise ValueError("base_url (miner endpoint) is required") |
There was a problem hiding this comment.
Inconsistent error handling: ValueError raised instead of returning error result.
Line 153 raises ValueError directly while all other validation errors (like invalid env_id on lines 142-150) return a structured error result via _build_error_result. This inconsistency means callers receive different error types depending on which validation fails.
🛠️ Proposed fix for consistent error handling
if base_url is None:
- raise ValueError("base_url (miner endpoint) is required")
+ return self._build_error_result(
+ env_id=env_id,
+ task_id=task_id,
+ seed=seed or task_id,
+ start_time=time.time(),
+ error="base_url (miner endpoint) is required",
+ )🤖 Prompt for AI Agents
In `@environments/maniskill3/env.py` around lines 152 - 153, Replace the direct
raise of ValueError for the missing base_url with the same structured error
return used elsewhere: call _build_error_result with an appropriate error
message and status (matching how invalid env_id is handled) instead of raising;
update the base_url validation branch (the check that currently does "if
base_url is None") to return _build_error_result("base_url (miner endpoint) is
required", ...) so callers receive a consistent error result format.
| Args: | ||
| task_name: Name of the task (e.g., 'pick-cube-v1') | ||
| use_camera: Whether to include camera observations | ||
| camera_names: Which cameras to use (default: ['base_camera', 'hand_camera']) | ||
| image_size: (width, height) for camera images |
There was a problem hiding this comment.
Docstring inconsistent with default value.
The docstring says camera_names defaults to ['base_camera', 'hand_camera'], but the actual default is self.CAMERA_NAMES which only contains ['base_camera'].
Fix docstring
- camera_names: Which cameras to use (default: ['base_camera', 'hand_camera'])
+ camera_names: Which cameras to use (default: ['base_camera'])🤖 Prompt for AI Agents
In `@kinitro/environments/maniskill3_env.py` around lines 70 - 74, The docstring
for camera_names is wrong — it claims the default is ['base_camera',
'hand_camera'] while the code uses self.CAMERA_NAMES (which is ['base_camera']).
Update the docstring in the Maniskill3Env/__init__ (or surrounding class
docstring) to either state the actual default (['base_camera']) or state that it
defaults to self.CAMERA_NAMES, and ensure the symbol camera_names and
self.CAMERA_NAMES are referenced so readers know the source of the default.
| def _extract_ee_velocity(self, obs: dict[str, Any]) -> tuple[np.ndarray, np.ndarray]: | ||
| """Extract end-effector velocity from observation.""" | ||
| ee_lin_vel = np.zeros(3, dtype=np.float32) | ||
| ee_ang_vel = np.zeros(3, dtype=np.float32) | ||
|
|
||
| agent_obs = obs.get("agent", {}) | ||
| if isinstance(agent_obs, dict) and "qvel" in agent_obs: | ||
| qvel = self._to_numpy(agent_obs["qvel"]) | ||
| # Approximate ee velocity from joint velocities | ||
| # This is a simplification - proper FK would be needed for accuracy | ||
| if qvel.size >= 6: | ||
| ee_lin_vel = qvel[:3] * 0.1 # Scale factor approximation | ||
| ee_ang_vel = qvel[3:6] * 0.1 | ||
|
|
||
| return ee_lin_vel, ee_ang_vel |
There was a problem hiding this comment.
Velocity approximation is physically incorrect.
Joint velocities (qvel[:3], qvel[3:6]) are in joint space, not Cartesian end-effector space. Scaling them by 0.1 doesn't produce meaningful EE velocity. The fallback position-delta computation in _build_observation (lines 305-307) is more accurate.
Consider either:
- Always using position-delta for velocity (simpler, consistent)
- Computing proper Jacobian-based velocity if ManiSkill3 exposes it
Option 1: Always use position delta
def _extract_ee_velocity(self, obs: dict[str, Any]) -> tuple[np.ndarray, np.ndarray]:
"""Extract end-effector velocity from observation."""
- ee_lin_vel = np.zeros(3, dtype=np.float32)
- ee_ang_vel = np.zeros(3, dtype=np.float32)
-
- agent_obs = obs.get("agent", {})
- if isinstance(agent_obs, dict) and "qvel" in agent_obs:
- qvel = self._to_numpy(agent_obs["qvel"])
- # Approximate ee velocity from joint velocities
- # This is a simplification - proper FK would be needed for accuracy
- if qvel.size >= 6:
- ee_lin_vel = qvel[:3] * 0.1 # Scale factor approximation
- ee_ang_vel = qvel[3:6] * 0.1
-
- return ee_lin_vel, ee_ang_vel
+ # Return zeros; velocity will be computed from position delta in _build_observation
+ return np.zeros(3, dtype=np.float32), np.zeros(3, dtype=np.float32)🤖 Prompt for AI Agents
In `@kinitro/environments/maniskill3_env.py` around lines 249 - 263,
_extract_ee_velocity currently treats joint velocities as Cartesian EE velocity
by scaling qvel slices, which is incorrect; change it to compute EE linear (and
angular if available) velocity from the position-delta used in
_build_observation instead: read the current EE position from obs (e.g.,
obs["ee_pos"] or agent obs field used in _build_observation), subtract a stored
previous EE position (e.g., self._prev_ee_pos or obs["prev_ee_pos"]) and divide
by the environment timestep (e.g., self.control_timestep or self.dt) to produce
ee_lin_vel (and similarly for orientation/ee_ang_vel if previous orientation is
available), default to zeros if previous data or dt is missing, and remove the
qvel[:3]/qvel[3:6] 0.1 scaling; if you prefer higher fidelity later, replace
this with a Jacobian-based conversion when ManiSkill3 exposes the Jacobian API.
|
shelving for now. prioritising #30 and already existing envs. |
Summary
Add ManiSkill3 as a new environment family for GPU-accelerated dexterous manipulation tasks using SAPIEN physics engine.
New Environments
maniskill3/pick-cube-v1- Pick up a cubemaniskill3/stack-cube-v1- Stack one cube on anothermaniskill3/peg-insertion-v1- Insert peg into holemaniskill3/push-cube-v1- Push cube to targetmaniskill3/pull-cube-v1- Pull cube to targetmaniskill3/lift-cube-v1- Lift cube to heightmaniskill3/turn-faucet-v1- Turn a faucet handlemaniskill3/open-cabinet-door-v1- Open a cabinet doorChanges
ManiSkill3Environmentclass implementingRoboticsEnvironmentwith canonical observation/action mappingenvironments/maniskill3/) with SAPIEN/Vulkan dependenciesmaniskill3to CLI available familiesImplementation Details
state_dictobservation formatNote on Dependencies
ManiSkill3 dependencies are container-only due to gymnasium version conflict:
mani-skillrequiresgymnasium==0.29.1kinitrorequiresgymnasium>=1.1The container has its own
requirements.txtwith compatible versions.Test Plan
Summary by CodeRabbit
New Features
Documentation