-
Notifications
You must be signed in to change notification settings - Fork 404
fix(exit-detection): use explicit EXIT_SIGNAL instead of confidence threshold #132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…hreshold Replace confidence-based heuristic in update_exit_signals() with explicit EXIT_SIGNAL checking. JSON mode always has confidence >= 70 due to deterministic scoring, causing completion_indicators to fill after 5 loops and triggering premature exits even when Claude sets EXIT_SIGNAL: false. Changes: - lib/response_analyzer.sh: Check exit_signal == "true" instead of confidence >= 60 when updating completion_indicators array - ralph_loop.sh: Update safety circuit breaker comment to reflect that completion_indicators now only accumulates on EXIT_SIGNAL=true - tests/unit/test_exit_detection.bats: Add 4 TDD tests (32-35) validating the fix for update_exit_signals() behavior - CLAUDE.md: Document fix as v0.11.1, update test counts (420 → 424) Test count: 424 passing (100% pass rate)
WalkthroughThe PR replaces a confidence-threshold heuristic with explicit EXIT_SIGNAL checking for completion indicator accumulation. The core logic change shifts from checking Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
Switch exit detection to explicit
|
Code Review: Fix Exit DetectionThis is a well-executed fix for the premature exit bug in JSON mode. The root cause analysis is accurate and the solution is elegant. Strengths
Minor Observations
SuggestionsConsider adding an integration test that simulates a full multi-loop scenario where Claude works on a feature across several iterations with exit_signal=false, then signals completion with exit_signal=true. This would verify the complete flow end-to-end. VerdictApproved - The fix is correct, well-tested, and maintains backward compatibility with existing behavior while fixing the critical bug in JSON mode. |
|
Review complete. I've posted a comprehensive code review to PR #132: #132 (comment) The fix is well-executed - it correctly replaces a flawed confidence-based heuristic with explicit EXIT_SIGNAL checking, preventing premature exits in JSON mode. The 4 new TDD tests thoroughly validate the fix, and documentation updates are appropriate. |
Code Review - PR 132: Exit Detection FixOverall AssessmentAPPROVED - This is an excellent bug fix with comprehensive test coverage and clear documentation. The fix addresses a critical issue where JSON mode caused premature exits after exactly 5 loops. Code Quality (5/5 stars)Strengths:
lib/response_analyzer.sh Changes:
ralph_loop.sh Changes:
Test Coverage (5/5 stars)4 New Tests (Tests 32-35): Test 32: Verifies completion_indicators NOT updated when exit_signal=false with high confidence (70)
Test 33: Verifies completion_indicators IS updated when exit_signal=true
Test 34: Tests accumulation logic across multiple loops
Test 35: EXACT BUG SIMULATION
Test Quality:
Potential Issues and SuggestionsMinor: Missing null safety checkLocation: lib/response_analyzer.sh:567 The fix uses jq with a default fallback which provides safety. Current code works correctly - if jq fails or analysis_file is malformed, exit_signal could be empty string, but the == check handles this safely (empty string != "true"), so this is just a minor observation, not a blocker. Documentation ClarityThe inline comment in response_analyzer.sh is excellent. Consider also updating the function-level docstring for update_exit_signals() to mention the EXIT_SIGNAL requirement (if docstrings exist). Performance ConsiderationsNo performance impact - the change replaces one jq query with another of similar complexity. Security ConsiderationsNo security concerns. The fix doesn't introduce any new attack vectors or vulnerabilities. Breaking ChangesNone - This is a bug fix that aligns behavior with documented intent. The change makes the system MORE reliable, not less. Alignment with Repository StandardsPer CLAUDE.md Feature Development Quality Standards:
RecommendationsBefore Merge:
Optional Enhancements (Post-Merge):
Final VerdictAPPROVED - Ready to Merge This PR exemplifies excellent bug fix practices:
The fix addresses a critical usability issue where JSON mode users experienced premature exits after exactly 5 loops, even when work was ongoing. The solution correctly aligns implementation with documented behavior - respecting Claude's explicit EXIT_SIGNAL intent rather than relying on confidence heuristics. Test Count: 424 passing (100% pass rate) |

Summary
Problem
In JSON output mode,
update_exit_signals()used a confidence threshold (≥60) to populatecompletion_indicators. JSON mode always has confidence ≥70 due to deterministic scoring:resultfieldThis caused every successful JSON response to increment
completion_indicators, triggering the safety circuit breaker after exactly 5 loops—even when Claude explicitly setEXIT_SIGNAL: false.Solution
Replace confidence-based logic in
update_exit_signals()with explicit EXIT_SIGNAL checking:Test plan
completion_indicatorsNOT updated whenexit_signal=falsecompletion_indicatorsIS updated whenexit_signal=trueexit_signal=trueexit_signal=false(exact bug scenario)Summary by CodeRabbit
Release Notes v0.11.1
Bug Fixes
Tests
Chores
✏️ Tip: You can customize this high-level summary in your review settings.