Handle tool-call streaming loops and strengthen model cache validation #23

Open
christopherkarani wants to merge 3 commits into main from codex/fix-missioncritical-framework-bugs

Conversation

@christopherkarani
Owner

Summary

  • Introduce a recursive streaming loop in ChatSession that handles tool calls, executor retries, and proper state updates, with improved cancellation and error handling
  • Guard PartialToolCall construction with validation that logs and skips invalid fragments across OpenAI streaming helpers
  • Harden ModelCache lookups to verify disk paths and prune stale entries, and enrich model download progress with speed/ETA tracking

Testing

  • Not run (not requested)

@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@claude

claude bot commented Feb 19, 2026

Code Review

Thanks for these improvements. The overall direction is good — the streaming loop is significantly cleaner than the old one-shot approach and the disk-validation in ModelCache addresses a real class of bugs. A few issues worth addressing before merge:


Bugs / Correctness

1. allCachedModels() is inconsistent with the validated accessors (ModelCache.swift)

isCached(_:), info(for:), and localPath(for:) now call validateCachedEntry and prune stale entries, but allCachedModels() still returns cache.values directly. A caller who iterates allCachedModels() can get entries that isCached() would report as absent:

let models = await cache.allCachedModels()   // may include stale entries
for m in models {
    await cache.isCached(m.identifier)       // returns false ← mismatch
}

allCachedModels() should filter through the same validation logic (or at least document the divergence).

2. Double progress callback per event (ModelManager.swift)

enrichedProgress calls callback(enriched) immediately (no speed/ETA), then yields to speedTask which calls progress(updated) again with speed info. The user's closure fires twice per tick. This was the same behaviour as the old code, but the refactor was an opportunity to fix it — callers expecting one call per event will see two.


Design / Code Quality

3. Redundant continuation variables (ModelManager.swift, lines ~696-704)

Three variables are declared for what is ultimately the same value:

var speedContinuation: AsyncStream<DownloadProgress>.Continuation?
var speedContinuationRef: AsyncStream<DownloadProgress>.Continuation?
// ...
speedContinuationRef = speedContinuation            // ← copy
let continuationSnapshot = speedContinuationRef     // ← another copy

speedContinuation is only ever used to assign speedContinuationRef. This can be collapsed into a single var speedContinuation captured by the closure. The intermediate continuationSnapshot inside the task body is also unnecessary since the outer speedContinuationRef is never mutated after assignment.

4. Redundant continuationSnapshot?.finish() inside speedTask (ModelManager.swift)

The for await loop inside speedTask exits only after the stream finishes (i.e., after the continuation is already finished). Calling continuationSnapshot?.finish() at that point is a no-op. The real cleanup is already handled by the defer.

5. PartialToolCall.init is a source-breaking public API change (GenerationChunk.swift)

Changing a non-throwing public init to throws is a breaking change for any downstream consumer who constructs PartialToolCall directly (e.g., custom provider implementations, unit tests). If this library follows semantic versioning this warrants a major bump or, at minimum, clear documentation in the changelog/migration guide.

The call sites in OpenAIProvider+Streaming.swift use try? to silently skip invalid fragments. Consider whether swallowing the error here is always correct — a misconfigured accumulator could silently drop tool calls with no indication to the caller. A logger.error (vs logger.warning) may be more appropriate depending on how severe an empty id / toolName is.


Minor

6. Tool-round counting semantics (ChatSession.swift)

The guard toolRoundCount < currentMaxToolCallRounds is evaluated before executing the current round. This means maxToolCallRounds = 0 throws immediately on the first tool call without executing it, and maxToolCallRounds = N allows exactly N rounds. This is correct but non-obvious — a doc comment on the property clarifying inclusive/exclusive semantics would prevent future confusion.
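
To make these semantics concrete, here is a minimal, self-contained sketch of such a guard; runToolLoop and ToolLoopLimitError are illustrative names, not the framework's actual API:

struct ToolLoopLimitError: Error {}

// Hypothetical stand-in for the round-counting guard described above.
func runToolLoop(maxToolCallRounds: Int, toolCallResponses: Int) throws -> Int {
    var toolRoundCount = 0
    var remaining = toolCallResponses
    while remaining > 0 {
        // Checked *before* the round runs: 0 throws on the first tool call,
        // N permits exactly N rounds before the (N+1)-th response throws.
        guard toolRoundCount < maxToolCallRounds else { throw ToolLoopLimitError() }
        toolRoundCount += 1          // execute one round of tool calls here
        remaining -= 1
    }
    return toolRoundCount
}

// try runToolLoop(maxToolCallRounds: 0, toolCallResponses: 1)  // throws immediately
// try runToolLoop(maxToolCallRounds: 2, toolCallResponses: 2)  // returns 2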

7. No new tests

The tool-call streaming loop is the most complex change here. The existing ChatSessionTests has the mock infrastructure (MockTextProvider) to cover:

  • Exactly N rounds of tool calls completing successfully
  • The maxToolCallRounds overflow path throwing the right error
  • toolExecutor == nil when tool calls are returned
  • Cancellation mid-loop not leaving partial state in messages

These are all high-value cases that would also catch regressions in the loopMessages / turnMessages state management.


Summary

Area and severity:

  • allCachedModels() stale entries: Medium
  • Double progress callback: Low–Medium
  • Redundant continuation variables: Low
  • PartialToolCall breaking change: Medium (API contract)
  • Missing tests for streaming loop: Medium

The core logic in all three areas is sound. Addressing the allCachedModels consistency issue and adding tests for the tool-call loop are the most important follow-ups before merging.

@christopherkarani
Owner Author

@claude review and fix all identified issues

@claude

claude bot commented Feb 19, 2026

Claude finished @christopherkarani's task in 9m 28s.


All Identified Issues Fixed

All seven issues from the previous review have been addressed in commit 21cf27d.

Tasks

  • Gather context and read changed files
  • Fix allCachedModels() consistency with validated accessors (ModelCache.swift)
  • Fix double progress callback + clean up redundant continuation variables (ModelManager.swift)
  • Restore non-throwing PartialToolCall.init, add throwing validated factory (GenerationChunk.swift)
  • Update OpenAI streaming call sites to use PartialToolCall.validated, upgrade log levels (OpenAIProvider+Streaming.swift)
  • Add doc comment clarifying maxToolCallRounds inclusive semantics (ChatSession.swift)
  • Add streaming loop tests + update MockTextProvider (ChatSessionTests.swift)
  • Commit and push changes

What was changed

ModelCache.swift — allCachedModels() consistency (bug fix)

allCachedModels() now snapshots cache.values into an Array first, then filters each entry through validateCachedEntry. This prunes stale entries the same way isCached, info(for:), and localPath(for:) do, so callers can no longer get an entry from allCachedModels() that isCached() would report as absent.
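
For illustration, a minimal sketch of that snapshot-then-validate pattern, assuming stand-in types (CachedModelInfo, ModelCacheSketch, and the validateCachedEntry signature here are approximations, not the library's real declarations):

import Foundation

struct CachedModelInfo {
    let identifier: String
    let localPath: URL
}

actor ModelCacheSketch {
    private var cache: [String: CachedModelInfo] = [:]

    // Returns true only if the entry's file still exists on disk;
    // otherwise prunes the stale entry and returns false.
    private func validateCachedEntry(_ entry: CachedModelInfo) -> Bool {
        guard FileManager.default.fileExists(atPath: entry.localPath.path) else {
            cache[entry.identifier] = nil
            return false
        }
        return true
    }

    // Snapshot first, then validate, so results always agree with
    // isCached/info(for:)/localPath(for:).
    func allCachedModels() -> [CachedModelInfo] {
        var valid: [CachedModelInfo] = []
        for entry in Array(cache.values) where validateCachedEntry(entry) {
            valid.append(entry)
        }
        return valid
    }
}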


ModelManager.swift — Double callback + redundant vars (bug fix + cleanup)

downloadWithEstimation had three variables all holding the same continuation (speedContinuation, speedContinuationRef, continuationSnapshot/speedContinuationSnapshot). These were collapsed into one speedContinuation plus a single capturedContinuation snapshot for the @Sendable closure. The immediate callback(enriched) call was removed from enrichedProgress; the speed-stream task now fires the user callback exactly once per event (whether or not speed data is available yet). The no-op continuationSnapshot?.finish() at the end of the for await body was also removed.
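
A rough, self-contained sketch of the single-continuation, single-callback shape described above; DownloadProgressSketch and downloadWithEstimationSketch are hypothetical stand-ins, not the real ModelManager API:

import Foundation

struct DownloadProgressSketch: Sendable {
    var bytesDownloaded: Int64
    var totalBytes: Int64
    var bytesPerSecond: Double?
    var estimatedTimeRemaining: TimeInterval?
}

func downloadWithEstimationSketch(
    rawEvents: [DownloadProgressSketch],
    progress: @escaping @Sendable (DownloadProgressSketch) -> Void
) async {
    let (stream, continuation) = AsyncStream.makeStream(of: DownloadProgressSketch.self)

    let speedTask = Task {
        var lastBytes: Int64 = 0
        var lastTime = Date()
        for await rawEvent in stream {
            var event = rawEvent
            let now = Date()
            let elapsed = now.timeIntervalSince(lastTime)
            if elapsed > 0 {
                let speed = Double(event.bytesDownloaded - lastBytes) / elapsed
                event.bytesPerSecond = speed
                if speed > 0 {
                    event.estimatedTimeRemaining = Double(event.totalBytes - event.bytesDownloaded) / speed
                }
            }
            lastBytes = event.bytesDownloaded
            lastTime = now
            progress(event)              // the user callback fires exactly once per tick
        }
    }

    for event in rawEvents {
        continuation.yield(event)        // the single captured continuation
    }
    continuation.finish()
    await speedTask.value
}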


GenerationChunk.swift — Backward-compatible PartialToolCall API (breaking-change fix)

The public init was restored as non-throwing with precondition guards, preserving the original ABI for any downstream consumers (custom provider implementations, unit tests). A new public static func validated(id:toolName:index:argumentsFragment:) throws -> PartialToolCall factory was added for streaming code that may encounter malformed server data — it throws AIError.invalidInput instead of trapping.
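
A simplified sketch of the described API surface, under the assumption that validation only checks for an empty id/toolName; the real PartialToolCall and AIError carry more fields and cases:

// Hypothetical, simplified mirror of the shape described above.
enum AIErrorSketch: Error {
    case invalidInput(String)
}

public struct PartialToolCallSketch {
    public let id: String
    public let toolName: String
    public let index: Int
    public let argumentsFragment: String

    /// Original non-throwing initializer, kept source-compatible.
    /// Programmer error (empty id/toolName) traps via precondition.
    public init(id: String, toolName: String, index: Int, argumentsFragment: String) {
        precondition(!id.isEmpty, "PartialToolCall.id must not be empty")
        precondition(!toolName.isEmpty, "PartialToolCall.toolName must not be empty")
        self.id = id
        self.toolName = toolName
        self.index = index
        self.argumentsFragment = argumentsFragment
    }

    /// Throwing factory for untrusted streaming data: malformed server
    /// fragments surface as an error instead of trapping.
    public static func validated(
        id: String, toolName: String, index: Int, argumentsFragment: String
    ) throws -> PartialToolCallSketch {
        guard !id.isEmpty, !toolName.isEmpty else {
            throw AIErrorSketch.invalidInput("tool call fragment missing id or toolName")
        }
        return PartialToolCallSketch(id: id, toolName: toolName, index: index, argumentsFragment: argumentsFragment)
    }
}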


OpenAIProvider+Streaming.swift — Use validated factory + upgrade log level

Both call sites now use PartialToolCall.validated(...). The Responses API path was also refactored from try? (which silently dropped errors) to an explicit do/catch. Both sites log at .error level instead of .warning, reflecting that an empty id or toolName in a streaming response indicates a real problem.
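
Roughly, a call site could look like the sketch below, reusing PartialToolCallSketch from the previous sketch; ToolCallDeltaSketch and the os.Logger usage are assumptions about the surrounding code, not the provider's actual types:

import Foundation
import os

// Illustrative delta shape only; the real streaming payload differs.
struct ToolCallDeltaSketch {
    let id: String?
    let name: String?
    let index: Int
    let arguments: String?
}

let streamingLogger = Logger(subsystem: "com.example.openai-provider", category: "streaming")

func accumulate(_ delta: ToolCallDeltaSketch, into partials: inout [PartialToolCallSketch]) {
    do {
        let partial = try PartialToolCallSketch.validated(
            id: delta.id ?? "",
            toolName: delta.name ?? "",
            index: delta.index,
            argumentsFragment: delta.arguments ?? ""
        )
        partials.append(partial)
    } catch {
        // Explicit do/catch instead of try?: the failure is visible and logged
        // at .error, since a missing id/toolName is a real problem.
        streamingLogger.error("Dropping invalid tool-call fragment at index \(delta.index): \(String(describing: error))")
    }
}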


ChatSession.swift — maxToolCallRounds doc comment

The doc comment now explicitly states the inclusive semantics:

  • = 0 → first tool-call response throws immediately (no rounds executed)
  • = N → exactly N rounds are permitted before throwing
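
A possible shape for that doc comment, sketched on a stand-in class; the default value shown is hypothetical:

public final class ChatSessionSketch {
    /// Maximum number of tool-call rounds permitted for a single request.
    ///
    /// The guard runs *before* each round executes, so the limit is inclusive
    /// of completed rounds and exclusive of the one that trips it:
    /// - `0`: the first tool-call response throws immediately; no rounds execute.
    /// - `N`: exactly `N` rounds may execute before the next tool-call response throws.
    public var maxToolCallRounds: Int = 3

    public init() {}
}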

ChatSessionTests.swift — Streaming loop test coverage

MockTextProvider was extended with setQueuedStreamChunkSets(_:) so tests can enqueue per-call [GenerationChunk] arrays for the streaming path (mirroring the existing setQueuedGenerationResults for the generate path). The stream(messages:model:config:) actor method now tracks call counts and received messages the same way generate does.

Four new @Test cases were added to ChatSessionTests:

  • streamExecutesToolCallsThenYieldsFinalAnswer: N rounds of tool calls completing successfully; verifies tokens, message structure, and call count
  • streamThrowsWhenToolLoopExceedsMaxRounds: maxToolCallRounds overflow throws AIError.invalidInput and rolls back messages
  • streamThrowsWhenToolExecutorNilAndToolCallsReturned: nil toolExecutor throws and rolls back messages
  • streamMaxToolCallRoundsZeroThrowsImmediately: maxToolCallRounds = 0 throws without executing any tool calls
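
As a self-contained illustration of the queueing idea (not the real MockTextProvider, GenerationChunk, or ChatSession APIs), a per-call chunk queue plus a Swift Testing case might look like:

import Testing

// Standalone sketch of per-call chunk queueing; the real mock is richer.
actor MockStreamQueueSketch {
    private var queuedChunkSets: [[String]] = []
    private(set) var streamCallCount = 0

    func setQueuedStreamChunkSets(_ sets: [[String]]) {
        queuedChunkSets = sets
    }

    /// Returns the next queued chunk set, mirroring how a mock provider
    /// would back one stream() call per queued array.
    func nextChunkSet() -> [String] {
        streamCallCount += 1
        return queuedChunkSets.isEmpty ? [] : queuedChunkSets.removeFirst()
    }
}

@Test func queuedChunkSetsAreConsumedPerCall() async {
    let mock = MockStreamQueueSketch()
    await mock.setQueuedStreamChunkSets([["tool-call round"], ["final answer"]])

    let first = await mock.nextChunkSet()
    let second = await mock.nextChunkSet()
    let calls = await mock.streamCallCount

    #expect(first == ["tool-call round"])   // round 1
    #expect(second == ["final answer"])     // final turn
    #expect(calls == 2)
}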

- ModelCache: filter allCachedModels() through validateCachedEntry so
  its results are consistent with isCached/info(for:)/localPath(for:)
- ModelManager: collapse three redundant speedContinuation variables
  into one, remove no-op finish() inside speedTask, and fire the user
  progress callback once per tick (from the speed stream task) instead
  of twice
- GenerationChunk: restore non-throwing public init with precondition
  guards to preserve backward compatibility; add public static
  PartialToolCall.validated(...) throwing factory for use in streaming
  code that receives potentially malformed server data
- OpenAIProvider+Streaming: use PartialToolCall.validated at both call
  sites and upgrade log level from warning to error for invalid fragments
- ChatSession: clarify maxToolCallRounds doc comment with
  inclusive/exclusive semantics (0 = no rounds, N = exactly N rounds)
- ChatSessionTests: extend MockTextProvider with setQueuedStreamChunkSets
  and add four streaming tool-call loop tests covering: N rounds
  completing successfully, maxToolCallRounds overflow, nil toolExecutor,
  and maxToolCallRounds = 0

Co-authored-by: Christopher Karani <christopherkarani@users.noreply.github.com>
@christopherkarani
Owner Author

@claude review and fix issues identified

@codex review

@claude

claude bot commented Feb 20, 2026

Claude finished @christopherkarani's task in 3m 9s.


Code Review & Fix Pass

Tasks

  • Gather context and read repository setup
  • Read all changed files
  • Review for remaining/new issues
  • Fix identified issues
  • Commit and push changes (4982f22)

Issues Found & Fixed

1. Build-breaking: ToolCallRetryPolicy is undefined (ChatSession.swift:667)

The stream() method's capturedState tuple annotation used the type ToolCallRetryPolicy, which does not exist anywhere in the codebase. The correct type is ToolExecutor.RetryPolicy. This was a compilation error that would prevent the project from building entirely.

// Before (broken — ToolCallRetryPolicy is undefined)
let capturedState: (..., toolCallRetryPolicy: ToolCallRetryPolicy, ...) = withLock { ... }

// After (correct)
let capturedState: (..., toolCallRetryPolicy: ToolExecutor.RetryPolicy, ...) = withLock { ... }

2. maxToolCallRounds not clamped in stream() (ChatSession.swift:672)

send(_:) clamps negative maxToolCallRounds values to 0 via max(0, maxToolCallRounds) before capturing. The stream() method was passing the raw value, producing confusing error messages like "exceeded maxToolCallRounds (-2)" for negative inputs.

// After: matches send(_:) behaviour
return (messages, toolExecutor, toolCallRetryPolicy, max(0, maxToolCallRounds))

3. totalSize(), count, isEmpty ignored stale entries (ModelCache.swift)

These three accessors read cache.values / cache.count / cache.isEmpty directly without validating on-disk existence, while allCachedModels(), isCached(), info(for:), and localPath(for:) all call validateCachedEntry. This meant callers could see inflated size/count numbers for models whose files had been removed from disk. All three now route through validateCachedEntry() for consistency.
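
Following the same stand-in pattern as the ModelCache sketch earlier in this thread, the derived accessors could route through validation like this (fileSize and the dictionary layout are assumptions):

import Foundation

struct CacheEntrySketch {
    let identifier: String
    let localPath: URL
    let fileSize: Int64
}

actor ValidatedCacheSketch {
    private var cache: [String: CacheEntrySketch] = [:]

    private func validateCachedEntry(_ entry: CacheEntrySketch) -> Bool {
        guard FileManager.default.fileExists(atPath: entry.localPath.path) else {
            cache[entry.identifier] = nil   // prune the stale entry
            return false
        }
        return true
    }

    private func validEntries() -> [CacheEntrySketch] {
        var valid: [CacheEntrySketch] = []
        for entry in Array(cache.values) where validateCachedEntry(entry) {
            valid.append(entry)
        }
        return valid
    }

    // All three derived accessors route through the same validation,
    // so they cannot disagree with isCached/info(for:)/localPath(for:).
    func totalSize() -> Int64 { validEntries().reduce(0) { $0 + $1.fileSize } }
    var count: Int { validEntries().count }
    var isEmpty: Bool { validEntries().isEmpty }
}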


Summary

Issue, file, severity, and status:

  • ToolCallRetryPolicy undefined (build error) in ChatSession.swift:667: Critical, fixed
  • maxToolCallRounds not clamped in stream() at ChatSession.swift:672: Low, fixed
  • totalSize()/count/isEmpty include stale entries in ModelCache.swift: Low–Medium, fixed

@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

…ate totalSize/count

- ChatSession.stream(): fix `ToolCallRetryPolicy` (undefined) to the correct
  `ToolExecutor.RetryPolicy` type in the capturedState tuple annotation,
  which was a build-breaking error introduced in the previous commit.

- ChatSession.stream(): clamp `maxToolCallRounds` to `max(0, ...)` before
  capturing into the loop, matching the behavior of `send(_:)` and
  producing a consistent error message for negative values.

- ModelCache.totalSize(): filter through validateCachedEntry() before
  summing sizes, so stale (deleted from disk) entries are excluded and
  pruned, consistent with isCached/info/localPath/allCachedModels.

- ModelCache.count / isEmpty: route through validateCachedEntry() so
  callers get an accurate count of entries that actually exist on disk,
  consistent with the rest of the validated accessor family.

Co-authored-by: Christopher Karani <christopherkarani@users.noreply.github.com>