🤖 fix: reuse cached skill diagnostics on transient discovery failures#2417
🤖 fix: reuse cached skill diagnostics on transient discovery failures#2417ammar-agent wants to merge 3 commits intomainfrom
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 90bbfce452
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
90bbfce to
c0a90e3
Compare
|
@codex review Addressed the transient discovery regression by keeping non-diagnostic skill discovery resilient to per-skill transient I/O failures while retaining diagnostics-path transient rethrow behavior. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 565cf01152
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
565cf01 to
4faaf08
Compare
Treat transient SKILL.md stat/read failures as transport errors instead of invalid skill diagnostics, then reuse the last good diagnostics snapshot for list/listDiagnostics when discovery fails transiently. - add AgentSkillTransientDiscoveryError classification in agentSkillsService - add ORPC diagnostics cache + transient fallback helper - route both agentSkills.list and listDiagnostics through shared diagnostics loader - add service and cache tests for transient and non-transient failure paths --- _Generated with `mux` • Model: `openai:gpt-5.3-codex` • Thinking: `xhigh` • Cost: `$0.00`_ <!-- mux-attribution: model=openai:gpt-5.3-codex thinking=xhigh costs=0.00 -->
4faaf08 to
cfaafd4
Compare
|
@codex review Applied the requested fix:
Local validation:
|
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
Summary
Handle transient skill discovery transport failures without surfacing false
SKILL.md is missing or unreadablediagnostics in SSH workspaces.Background
Skill discovery currently treats all
stat/readfailures as deterministic file problems. On flaky SSH links this produced misleading invalid-skill diagnostics for valid skills and causedagentSkills.list/agentSkills.listDiagnosticsto diverge from expected behavior.Implementation
AgentSkillTransientDiscoveryErrorand transient error pattern classification inagentSkillsService.src/node/orpc/agentSkillsDiagnosticsCache.tswith:router.tsso bothagentSkills.listandagentSkills.listDiagnosticsshare the fallback-backed diagnostics loader.Validation
make static-checkbun test src/node/services/agentSkills/agentSkillsService.test.ts src/node/orpc/agentSkillsDiagnosticsCache.test.tsRisks
Low to medium. The transient classifier is message-pattern based, so misclassification is possible if transport/runtime error strings change. The fallback is scoped per discovery context and only applies to explicit transient error type, minimizing risk of masking deterministic skill issues.
Generated with
mux• Model:openai:gpt-5.3-codex• Thinking:xhigh• Cost:$0.00