Skip to content

feat(rl_interface): extensible dict-based Observation/Action + E2E docs#30

Merged
rayokamoto merged 14 commits intomainfrom
feat/creating-environments
Feb 6, 2026
Merged

feat(rl_interface): extensible dict-based Observation/Action + E2E docs#30
rayokamoto merged 14 commits intomainfrom
feat/creating-environments

Conversation

@Shr1ftyy
Copy link
Collaborator

@Shr1ftyy Shr1ftyy commented Feb 5, 2026

Summary

  • Migrate from fixed-field CanonicalObservation/CanonicalAction to extensible dict-based Observation/Action classes
  • Add ProprioKeys and ActionKeys for conventional channel naming
  • Base64 image encoding/decoding for efficient serialization
  • Remove duplicate root-level rl_interface.py (single source of truth is now kinitro/rl_interface.py)
  • Improve E2E testing documentation in AGENTS.md

Changes

RL Interface (kinitro/rl_interface.py)

  • Observation: proprio dict with ProprioKeys (EE_POS, EE_QUAT, GRIPPER, BASE_POS, etc.)
  • Action: continuous/discrete dicts with ActionKeys (EE_TWIST, GRIPPER, BASE_VEL, etc.)
  • Helper methods: get_proprio(), get_continuous(), proprio_array(), continuous_array()
  • Image handling: get_image(), images property for base64 decode

Environment Updates

  • MetaWorld and ProcTHOR environments updated to use new interface
  • Actor classes updated for new Action format

Documentation

  • Updated environments/README.md with new interface examples
  • Improved AGENTS.md with:
    • Correct API endpoint paths (/v1/ prefix)
    • Basilica deployment testing section
    • Cleanup commands
    • Database schema reference
    • --mock-miner flag for services.sh

Test plan

  • Unit tests pass (116 tests)
  • Linting passes (ruff check .)
  • E2E test with docker executor and Basilica miner deployment

Summary by CodeRabbit

  • New Features

    • New standardized Observation/Action API, environment template with Actor, and optional mock-miner service with service management.
  • Documentation

    • Quick Reference, new End-to-End Testing guide, updated miner guide and condensed environment onboarding docs.
  • Bug Fixes

    • Stricter action validation and clearer error results when no action is returned.
  • Tests

    • Updated E2E tests and new API/mock-miner test and orchestration scripts.

Replace fixed-field CanonicalObservation/CanonicalAction with extensible
dict-based Observation/Action classes:

- Observation: proprio dict with ProprioKeys (EE_POS, EE_QUAT, GRIPPER, etc.)
- Action: continuous/discrete dicts with ActionKeys (EE_TWIST, GRIPPER, etc.)
- Base64 image encoding/decoding for efficient serialization
- coerce_action() helper for backward compatibility with array inputs

Remove duplicate root-level rl_interface.py; single source of truth is
now kinitro/rl_interface.py (copied to build context by env build command).
- Fix API endpoint paths (use /v1/ prefix, not /api/v1/)
- Add Basilica deployment testing section with deploy/delete workflow
- Add cleanup commands for containers and services
- Add database schema reference with query examples
- Document Basilica CLI installation
- Add --mock-miner flag to services.sh for E2E testing
@coderabbitai
Copy link

coderabbitai bot commented Feb 5, 2026

📝 Walkthrough

Walkthrough

Migrates the RL public API from CanonicalObservation/CanonicalAction to extensible Observation/Action schemas (adds ProprioKeys, ActionKeys, image encode/decode, coercion helpers), updates rl_interface, environments, miner templates, CLI/testing, scripts, docs, and tests to the new payload shapes.

Changes

Cohort / File(s) Summary
RL core
kinitro/rl_interface.py, rl_interface.py
Introduce Observation, Action, ProprioKeys, ActionKeys, image encode/decode, Step, coercion/normalization helpers; remove old re-export wrapper in rl_interface.py.
Environment surface & impl
kinitro/environments/..., environments/metaworld/env.py, environments/procthor/env.py, kinitro/environments/metaworld_env.py
Signatures switched to Observation/Action; observation builders reworked (rgb/proprio/depth/cam intrinsics/extrinsics/extra); action extraction/validation uses Action and key enums.
Miner templates & servers
kinitro/miner/template/policy.py, kinitro/miner/template/env.py, kinitro/miner/template/server.py
Templates updated to consume Observation and return Action (continuous/discrete split); use Action.model_validate/from_array and ActionKeys.
CLI, mock miner & testing
kinitro/cli/..., kinitro/cli/miner/mock.py, kinitro/cli/testing_commands.py, tests/integration/...
CLI, mock miner, and tests updated to use Observation/Action; random/deterministic actions now emitted as continuous lists under ActionKeys; action serialization via model_dump.
Executor & verifier
kinitro/executor/verification.py
Verifier now sends Observation payloads and parses Action responses via Action.model_validate/continuous_array; local/remote inference paths adapted.
Scripts & services
scripts/services.sh, scripts/test-e2e.sh, scripts/test-api.sh, scripts/test-mock-miner.sh
Adds --mock-miner and start_mock_miner(); new E2E and API/mock-miner test scripts; extended start/stop/status handling and orchestration flows.
Docs & guides
AGENTS.md, docs/miner-guide.md, environments/README.md, docs/e2e-testing.md
Docs/examples updated for new Observation/Action shapes, Quick Reference/workflow scripts, E2E testing guide, and templated environment instructions.
Environment templates
environments/_template/*
Adds Affinetes-compatible environment template (Dockerfile, Actor env.py, metadata, requirements) implementing evaluation loop and miner interaction.
Tooling config & tests
pyproject.toml, tests/...
Adds kinitro/miner/template to Ty exclusion; tests adapted to new Action/Observation APIs.

Sequence Diagram(s)

sequenceDiagram
  rect rgba(220,230,255,0.5)
  participant CLI as CLI / Orchestrator
  participant Miner as Miner (policy server)
  participant Env as Simulation Env
  participant RL as kinitro.rl_interface
  end

  CLI->>Env: reset() → Observation
  Env->>RL: build Observation (rgb, proprio via ProprioKeys)
  RL-->>CLI: Observation payload (to_payload / model_dump)
  CLI->>Miner: POST /act (ActRequest obs: Observation)
  Miner->>RL: Action.model_validate / coerce_action
  RL-->>Miner: validated Action
  Miner-->>CLI: ActResponse(action: Action)
  CLI->>Env: step(action: Action)
  Env-->>CLI: Observation, reward, done, info
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • rayokamoto
  • rishiad

Poem

🐰 I hopped through types and nibbled the old array,
I stitched proprio keys and cameras into play.
Actions now come nested, twists tucked in a list,
Observations carry images, poses, and gist.
Hop, validate, deliver—pipeline upgraded, I insist!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly and concisely summarizes the main changes: introducing extensible dict-based Observation/Action classes in the RL interface and adding E2E testing documentation.
Docstring Coverage ✅ Passed Docstring coverage is 81.82% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/creating-environments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@Shr1ftyy Shr1ftyy self-assigned this Feb 5, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
docs/miner-guide.md (1)

120-148: ⚠️ Potential issue | 🟡 Minor

Align the surrounding docs with the new Observation/Action schema.
The example now uses Observation and Action(continuous=...), but earlier sections and the API request/response examples still show legacy keys like ee_pos_m, twist_ee_norm, and top-level obs fields. Please update those sections to show obs: {proprio: {...}, rgb: {...}} and action: {continuous: {...}} so readers don’t mix formats.

kinitro/environments/metaworld_env.py (1)

390-407: ⚠️ Potential issue | 🔴 Critical

Fix CI type error: serialize rgb payload before building Observation.

CI reports invalid-argument-type because camera_views is dict[str, np.ndarray] while Observation.rgb is typed as serializable list/dict. Convert images to lists (or widen Observation.rgb typing) before construction.

🐛 Suggested fix (serialize camera views)
-        return Observation(
-            rgb=camera_views,  # Will be auto-encoded if numpy arrays
-            proprio={
-                ProprioKeys.EE_POS: ee_pos.tolist(),
-                ProprioKeys.EE_QUAT: ee_quat.tolist(),
-                ProprioKeys.EE_VEL_LIN: ee_lin_vel.tolist(),
-                ProprioKeys.EE_VEL_ANG: ee_ang_vel.tolist(),
-                ProprioKeys.GRIPPER: [gripper_norm],
-            },
-        )
+        rgb_payload = {name: img.tolist() for name, img in camera_views.items()}
+        return Observation(
+            rgb=rgb_payload,
+            proprio={
+                ProprioKeys.EE_POS: ee_pos.tolist(),
+                ProprioKeys.EE_QUAT: ee_quat.tolist(),
+                ProprioKeys.EE_VEL_LIN: ee_lin_vel.tolist(),
+                ProprioKeys.EE_VEL_ANG: ee_ang_vel.tolist(),
+                ProprioKeys.GRIPPER: [gripper_norm],
+            },
+        )
🤖 Fix all issues with AI agents
In `@AGENTS.md`:
- Around line 178-235: The /act example in AGENTS.md uses legacy observation
keys (ee_pos_m, ee_quat_xyzw, gripper_01) and must be updated to the new
Observation schema that nests sensor data under obs.proprio using the
ProprioKeys names; update the mock-miner example curl to send obs.proprio with
the current ProprioKeys (replace ee_pos_m, ee_quat_xyzw, ee_lin_vel_mps,
ee_ang_vel_rps, gripper_01) when calling the /act endpoint of the mock-miner
(mock-miner, /act, and the uv run kinitro mock-miner example) so the documented
payload matches the migrated schema.

In `@kinitro/environments/metaworld_env.py`:
- Around line 518-533: The step method currently assumes coerce_action returns
arrays of expected lengths and directly indexes twist_arr and gripper_arr; add
explicit validation for ActionKeys.EE_TWIST and ActionKeys.GRIPPER sizes before
indexing: coerce the arrays to numpy (e.g., np.asarray), check twist_arr is not
None and twist_arr.size >= 6 (otherwise raise ValueError with a clear message or
fill a default zero-6 array), and check gripper_arr is not None and
gripper_arr.size >= 1 (otherwise raise ValueError or default to 0.0); then
compute twist = np.clip(first-6 values, -1.0, 1.0) and gripper =
float(np.clip(gripper_arr[0], 0.0, 1.0)) using the validated arrays, updating
variables twist_arr, gripper_arr, twist, and gripper in the step function (where
coerce_action, ActionKeys.EE_TWIST, and ActionKeys.GRIPPER are referenced).

In `@kinitro/rl_interface.py`:
- Around line 312-327: The from_array method should validate that the flattened
input length matches the sum of schema sizes to avoid silent under/overflow:
compute expected = sum(schema.values()), convert array to arr =
np.asarray(array, dtype=np.float32).flatten(), then if arr.size != expected
raise a ValueError (include expected and actual sizes in the message) before
proceeding to slice into continuous; reference symbols: from_array, schema, arr,
continuous, idx, cls.

In `@scripts/services.sh`:
- Line 138: The pkill invocation ("pkill -f \"kinitro mock-miner\"") in
scripts/services.sh is too broad and will kill mock-miner processes across all
worktrees; narrow the match to the worktree-specific port by matching the
mock-miner command including the HTTP port variable (e.g. include "--http-port
${MOCK_MINER_HTTP_PORT}" or the equivalent port flag used when launching
mock-miner) so only the instance for the current worktree is targeted; update
the pkill pattern accordingly in the same scripts/services.sh pkill line.
🧹 Nitpick comments (5)
scripts/services.sh (1)

155-155: Status filter could show mock-miners from other worktrees.

The literal mock-miner in the grep pattern will match any mock-miner process. For consistency with the worktree isolation design, consider filtering by the mock-miner port instead.

♻️ Suggested improvement
-    ps aux | grep -E "kinitro.*(api|scheduler|executor|mock-miner)" | grep -v grep | grep -E "($WORKTREE_NAME|$API_PORT|$DATABASE_URL|mock-miner)" || echo "  None running"
+    ps aux | grep -E "kinitro.*(api|scheduler|executor|mock-miner)" | grep -v grep | grep -E "($WORKTREE_NAME|$API_PORT|$DATABASE_URL|${MOCK_MINER_PORT:-8001})" || echo "  None running"
environments/README.md (2)

101-108: Add language specifier to fenced code block.

The code block showing directory structure should have a language specifier for consistency. Use text or plaintext for directory trees.

📝 Suggested fix
-```
+```text
 environments/
 └── myrobot/
     ├── Dockerfile          # Container build spec
     ├── env.py              # Actor class for affinetes
     ├── requirements.txt    # Python dependencies
     └── metadata.json       # Display info
</details>

---

`721-734`: **Add language specifier to fenced code block.**

Same as above—use `text` or `plaintext` for directory structure blocks.


<details>
<summary>📝 Suggested fix</summary>

```diff
-```
+```text
 environments/robomanip/
 ├── Dockerfile
 ├── env.py
 ├── requirements.txt
 └── metadata.json

 kinitro/environments/robomanip/
 ├── __init__.py
 ├── environment.py
 ├── task_types.py
 ├── task_generator.py
 └── scene_generator.py
</details>

</blockquote></details>
<details>
<summary>kinitro/miner/template/env.py (1)</summary><blockquote>

`120-127`: **Consider using ProprioKeys constants for consistency.**

The code uses string literals (`"ee_pos"`, `"ee_quat"`, etc.) to read proprio data, but uses `ActionKeys` constants when constructing the Action. For consistency and to reduce typo risk, consider importing and using `ProprioKeys` constants.


<details>
<summary>♻️ Suggested improvement</summary>

```diff
 # Import from local rl_interface (self-contained for Basilica deployment)
-from rl_interface import Action, ActionKeys
+from rl_interface import Action, ActionKeys, ProprioKeys

 ...

         # Extract proprioceptive data from new format
         proprio = observation.get("proprio", {})
-        ee_pos = np.array(proprio.get("ee_pos", [0, 0, 0]), dtype=np.float32)
-        ee_quat = np.array(proprio.get("ee_quat", [0, 0, 0, 1]), dtype=np.float32)
-        ee_vel_lin = np.array(proprio.get("ee_vel_lin", [0, 0, 0]), dtype=np.float32)
-        ee_vel_ang = np.array(proprio.get("ee_vel_ang", [0, 0, 0]), dtype=np.float32)
-        gripper_state = proprio.get("gripper", [0.0])[0]
+        ee_pos = np.array(proprio.get(ProprioKeys.EE_POS, [0, 0, 0]), dtype=np.float32)
+        ee_quat = np.array(proprio.get(ProprioKeys.EE_QUAT, [0, 0, 0, 1]), dtype=np.float32)
+        ee_vel_lin = np.array(proprio.get(ProprioKeys.EE_VEL_LIN, [0, 0, 0]), dtype=np.float32)
+        ee_vel_ang = np.array(proprio.get(ProprioKeys.EE_VEL_ANG, [0, 0, 0]), dtype=np.float32)
+        gripper_state = proprio.get(ProprioKeys.GRIPPER, [0.0])[0]
kinitro/miner/template/policy.py (1)

49-75: Add return type annotation to act() method.

Per coding guidelines, public functions and methods should have return types. The method returns Action.

📝 Suggested fix
-    async def act(self, observation: Observation):
+    async def act(self, observation: Observation) -> Action:
         """
         Get action for current observation.

The coerce_action() function was providing false backward compatibility
by silently accepting arrays and converting them to Action objects. This
masked API changes and added complexity.

Changes:
- Remove coerce_action() from kinitro/rl_interface.py
- Update all step() methods to accept only Action objects
- Update Actor classes to construct Action objects explicitly
- Update CLI test command to construct Action directly
- Update documentation examples
@Shr1ftyy Shr1ftyy force-pushed the feat/creating-environments branch from c72a401 to 171cc4b Compare February 5, 2026 08:08
- Update AGENTS.md /act example to use `proprio` dict with ProprioKeys
  instead of legacy flat keys (ee_pos_m, ee_quat_xyzw, etc.)
- Fix services.sh mock-miner pkill to use MOCK_MINER_PORT for worktree
  isolation instead of matching all mock-miners globally
@Shr1ftyy Shr1ftyy force-pushed the feat/creating-environments branch from 171cc4b to 0a84b8b Compare February 5, 2026 08:09
Address PR feedback to make environments/README.md human-first instead of
AI-first by moving code boilerplate into a working template.

- Create environments/_template/ with Dockerfile, env.py, requirements.txt,
  and metadata.json containing inline TODOs
- Trim README from 1210 lines to 114 lines
- Keep existing environment docs, add concise "Adding New Environments" section
Remove defensive multi-format action parsing. Miners must now return
a valid Action schema or receive an error. This makes the API contract
clear and simplifies the code.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@environments/_template/env.py`:
- Around line 102-110: The error-return branch that builds the failure result
for invalid env_id uses "seed or task_id", which treats 0 as falsy and can
override a valid seed=0; update the call to _build_error_result to pass the seed
explicitly using an explicit None check (e.g., seed if seed is not None else
task_id) instead of "seed or task_id" so an intentional seed of 0 is preserved
when building the error result for env_id, task_id, seed, start_time, and error.

In `@environments/README.md`:
- Around line 101-104: The README uses the lowercase "affinetes" in the
description for Actor.evaluate(); update the text to consistently use the proper
noun "Affinetes" (e.g., change "called by affinetes" to "called by Affinetes")
and scan nearby references (lines mentioning Actor.evaluate(), TaskConfig,
Observation, Action) to ensure all occurrences match the capitalized form for
brand consistency.
🧹 Nitpick comments (1)
environments/_template/env.py (1)

235-242: Add explicit return type to cleanup().

♻️ Proposed refactor
-    async def cleanup(self):
+    async def cleanup(self) -> None:

As per coding guidelines: Add return types to public functions and methods.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@environments/_template/env.py`:
- Around line 226-233: The cleanup method lacks a return type and silently
swallows exceptions; update async def cleanup to include the return type
annotation "-> None" and replace the bare except/pass with structured logging of
the exception (use the module's structlog logger, e.g. logger.error or
logger.exception) including context like the env key or env.__class__.__name__
and the operation ("closing env"), then continue closing remaining envs and
finally clear self._env_cache; reference the async cleanup method,
self._env_cache and env.close() when making the change.
🧹 Nitpick comments (2)
environments/_template/env.py (2)

35-38: Add an explicit return type on __init__.

It’s a public method and should be annotated for consistency.

Suggested change
-    def __init__(self):
+    def __init__(self) -> None:
As per coding guidelines, add return types to public functions and methods.

199-206: Log exceptions and avoid returning full tracebacks to callers.

Log the detailed traceback using logger.exception(), and return only a concise error message in the result payload. This follows the guideline to provide actionable error messages and log details with structlog.

Suggested change
-        except Exception as e:
-            return self._build_error_result(
-                env_id=env_id,
-                task_id=task_id,
-                seed=seed,
-                start_time=start_time,
-                error=f"{type(e).__name__}: {e}\n{traceback.format_exc()}",
-            )
+        except Exception as e:
+            logger.exception(
+                "Evaluation failed",
+                env_id=env_id,
+                task_id=task_id,
+                seed=seed,
+            )
+            return self._build_error_result(
+                env_id=env_id,
+                task_id=task_id,
+                seed=seed,
+                start_time=start_time,
+                error=f"{type(e).__name__}: {e}",
+            )

Update observation and action field names to match the actual
rl_interface schema:
- Observation: ee_pos, ee_quat, ee_vel_lin, ee_vel_ang, gripper
- Action: continuous.ee_twist, continuous.gripper

Also update all curl examples to use the correct nested format.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
docs/miner-guide.md (4)

85-95: ⚠️ Potential issue | 🟡 Minor

Clarify gripper shape in proprio to match examples.

The text says gripper is in [0, 1] (scalar), but all examples use a list [1.0]. Pick one representation and make it consistent across prose and payloads.

📝 Suggested doc tweak (if list is intended)
- - `gripper`: Gripper state in [0, 1]
+ - `gripper`: Gripper state as a list, e.g. [1.0], in [0, 1]

119-148: ⚠️ Potential issue | 🟡 Minor

Example uses uuid without import.

reset() returns uuid.uuid4().hex but uuid isn’t imported, so the snippet won’t run as-is.

🔧 Proposed fix
-from kinitro.rl_interface import Action, ActionKeys, Observation
+import uuid
+from kinitro.rl_interface import Action, ActionKeys, Observation

475-501: ⚠️ Potential issue | 🟡 Minor

API spec mixes scalar vs list in text vs examples.

The spec example uses list [1.0] for gripper, but the earlier prose implies scalar. Please make the spec and prose unambiguous.


576-599: ⚠️ Potential issue | 🟡 Minor

Local/remote curl examples repeat the gripper shape mismatch.

Both examples use list gripper; please align the earlier description to avoid client-side parsing errors.

🤖 Fix all issues with AI agents
In `@docs/miner-guide.md`:
- Around line 106-113: The documentation currently mismatches the gripper shape
between the Action continuous dict and the proprio section; pick a single
canonical shape (preferably a scalar float in [0,1] to match the proprio
description) and update the Action docs accordingly: change the `gripper`
example and type note in the Action Space block (where `Action` continuous dict,
`ee_twist`, and `gripper` are described) to describe `gripper` as a single float
scalar in [0,1] and adjust the example from `[1.0]` to `1.0`; ensure any other
mentions of `gripper` elsewhere use the same scalar convention (or conversely
choose list semantics everywhere if you prefer lists).
- Line 186: The example payload uses "gripper": [1.0] while the earlier prose
says “gripper in [0,1]”; make them consistent by either changing the example
payload to use a scalar "gripper": 1.0 if the expected shape is a single value,
or update the prose to state that gripper is a one-element list (e.g., “gripper
is a list containing a value in [0,1]”). Locate the example payload string
containing '{"obs": {"proprio": {..., "gripper": [1.0]}, ...}}' and the prose
line that says “gripper in [0,1]” and edit one of them so both describe the same
shape and range.
🧹 Nitpick comments (1)
docs/miner-guide.md (1)

85-86: “Canonical observation” wording is misleading after API change.

The new API is dict-based Observation; calling it “canonical observation” can be read as the legacy fixed-field schema. Consider rephrasing to “Observation” or “standardized observation payload.”

- Remove _coerce_images, _coerce_proprio, _coerce_continuous validators
- Make encode_image/decode_image public for explicit image serialization
- Update metaworld and procthor environments to use encode_image explicitly
- Type hints now accurately reflect expected input types
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@kinitro/environments/procthor/environment.py`:
- Around line 561-571: The step method can IndexError when twist_arr or
gripper_arr are shorter than expected; update the handling in step (variables
twist_arr, gripper_arr, and keys ActionKeys.EE_TWIST / ActionKeys.GRIPPER) to
defensively pad or truncate incoming arrays: for EE_TWIST build a length-6 array
filled with zeros and copy up to the first min(len(twist_arr),6) values before
clipping, and for GRIPPER ensure you read the first element only after creating
a length-1 padded array (or default 0.0) so indexing never fails; then apply the
existing np.clip and float conversion as before.

Move verbose E2E testing section from AGENTS.md into:
- docs/e2e-testing.md: detailed testing guide
- scripts/test-api.sh: test API endpoints
- scripts/test-mock-miner.sh: test mock miner endpoints
- scripts/test-e2e.sh: full pipeline test

Also fix `kinitro mock-miner` -> `kinitro miner mock` in services.sh.

AGENTS.md now contains a quick reference pointing to the scripts.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
scripts/services.sh (1)

118-156: ⚠️ Potential issue | 🟠 Major

Stop/status still target the old mock-miner command name.

Line 122 starts kinitro miner mock, but Line 139 and Line 156 grep for mock-miner, so the process won’t be stopped or shown. Please align the patterns.

🔧 Align stop/status patterns with the actual command
-    pkill -f "kinitro mock-miner.*--port $miner_port" 2>/dev/null || true
+    pkill -f "kinitro miner mock.*--port $miner_port" 2>/dev/null || true
@@
-    ps aux | grep -E "kinitro.*(api|scheduler|executor|mock-miner)" | grep -v grep | grep -E "($WORKTREE_NAME|$API_PORT|$DATABASE_URL|mock-miner)" || echo "  None running"
+    ps aux | grep -E "kinitro.*(api|scheduler|executor|miner mock)" | grep -v grep | grep -E "($WORKTREE_NAME|$API_PORT|$DATABASE_URL|miner mock)" || echo "  None running"
🤖 Fix all issues with AI agents
In `@scripts/test-api.sh`:
- Around line 65-75: The script currently uses set -e which aborts on the first
failing test; change the test invocation pattern so every test runs and results
are aggregated: either remove/temporarily disable set -e around the block and
implement a small wrapper (e.g., create a new run_test function) that calls
test_endpoint "<name>" "<path>" "<expected>" and captures its exit code,
appending the name to PASSED or FAILED arrays and returning success without
exiting the script; alternatively call test_endpoint in a subshell or with "||
true" and record $? into counters. Update all invocations (the lines using
test_endpoint for "health", "miners list", "environments list", "latest scores",
"latest weights", and "task stats") to use the wrapper so the full suite
completes and the final summary can report aggregated PASSED/FAILED counts.

In `@scripts/test-e2e.sh`:
- Around line 137-153: The script currently does a single check for evaluations
using SCORES_RESPONSE/SCORES_BODY/SCORES_CODE and ignores the TIMEOUT option;
change this to poll until the TIMEOUT expires: record a start timestamp, loop
making the same curl request (using API_URL to build the URL and capturing
SCORES_RESPONSE/SCORES_BODY/SCORES_CODE the same way), break from the loop if
SCORES_CODE == "200", otherwise sleep a short interval (e.g., 5s) and re-check
until elapsed time >= TIMEOUT; after the loop preserve the existing success
output when 200 and the existing "No completed evaluations yet" message when
timed out, and consider introducing a SLEEP_INTERVAL variable for clarity.

In `@scripts/test-mock-miner.sh`:
- Around line 1-4: The startup hint in scripts/test-mock-miner.sh still
references the old CLI subcommand "kinitro mock-miner"; update all occurrences
of that string (including the other block later in the file) to the new CLI form
"kinitro miner mock" so the help text and examples match the current
services.sh/service CLI usage.
- Around line 65-88: The script currently uses set -e which aborts on the first
failing test so the aggregated PASSED/FAILED summary never runs; modify the test
invocation pattern so each call to test_endpoint is guarded (e.g., call
test_endpoint ... || true) or temporarily disable errexit around the test block
(set +e before the tests and set -e after) so all tests run and the existing
PASSED/FAILED counters (the variables updated by test_endpoint) are still
incremented and the final summary prints; locate references to set -e and the
calls to test_endpoint to apply this change.

- Fix pkill/ps patterns to match `miner mock` (not `mock-miner`)
  so stop/status commands work correctly for mock miner processes
- Update CLI hints in test scripts to show correct command
- Disable set -e during test execution so all tests run and
  the summary shows accurate pass/fail counts
- Implement polling loop in test-e2e.sh to honor --timeout flag
@Shr1ftyy Shr1ftyy force-pushed the feat/creating-environments branch from dcbc05a to 307da68 Compare February 5, 2026 14:37
All continuous channels use list format for uniformity, including
single-value channels like gripper.
Replace hard exits on missing .env in services.sh and test-e2e.sh with
informational notes and inline defaults, matching the pattern already
used by test-api.sh and test-mock-miner.sh. Update AGENTS.md and
e2e-testing.md to reflect that worktree-env.sh is optional.
@rayokamoto rayokamoto merged commit 601b2ef into main Feb 6, 2026
2 checks passed
@rayokamoto rayokamoto deleted the feat/creating-environments branch February 6, 2026 01:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants