
feat: add turn-level prompt caching + PostHog metrics + live API tests#10801

Open
sestinj wants to merge 7 commits into main from nate/prompt-caching

Conversation


@sestinj sestinj commented Feb 25, 2026

Summary

  • Turn-level caching: The systemAndTools caching strategy now calls addCacheControlToLastTwoUserMessages() to cache conversation turns, not just the system+tools prefix. This matches what the VS Code extension path already does.
  • PostHog cache metrics: recordStreamTelemetry now reports prompt_cache_metrics events with cache_read_tokens, cache_write_tokens, total_prompt_tokens, cache_hit_rate, and tool_count.
  • Live API tests: 13 tests across 2 files validate caching end-to-end against the real Anthropic API (guarded by ANTHROPIC_API_KEY env var, skipped in CI).
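The first bullet can be made concrete with a hedged sketch. This is not the PR's exact code, just an illustration of what a helper like `addCacheControlToLastTwoUserMessages()` plausibly does: add an ephemeral `cache_control` breakpoint to the last content block of the last two user messages, without mutating the caller's input (the message and block shapes here are assumptions).

```typescript
type ContentBlock = {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
};
type Message = { role: "user" | "assistant"; content: string | ContentBlock[] };

function addCacheControlToLastTwoUserMessages(messages: Message[]): Message[] {
  // Indices of the last two user messages.
  const userIdxs = messages
    .map((m, i) => (m.role === "user" ? i : -1))
    .filter((i) => i !== -1)
    .slice(-2);

  return messages.map((m, i) => {
    // String content is skipped in this sketch; one of the cubic review
    // comments in this PR suggests converting it to block form instead.
    if (!userIdxs.includes(i) || typeof m.content === "string") return m;
    const blocks = m.content;
    // Clone the last block rather than mutating it in place.
    const content = blocks.map((b, j) =>
      j === blocks.length - 1
        ? { ...b, cache_control: { type: "ephemeral" as const } }
        : b,
    );
    return { ...m, content };
  });
}
```

Marking the most recent turns keeps the growing conversation prefix cacheable across requests, which is why hit rates stay high as turns accumulate.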

Impact of turn-level caching (validated by temporarily reverting the fix)

| Scenario | Without fix | With fix | Improvement |
| --- | --- | --- | --- |
| Basic 3-turn conversation | 93% | 99.4% | +6% |
| Tool use follow-up | 81.9% | 99.5% | +18% |
| Parallel tool calls | 83.5% | 99.4% | +16% |
| Long conversation (8 turns) | 77.5% | 94.0% | +17% |
| Large tool result (~200 lines) | 62.3% | 99.4% | +37% |

Without the fix, cache hit rates degrade as conversations grow because only the static system+tools prefix is cached. With turn-level caching, the last two user messages are also cached, keeping rates at 94-99%.
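The hit rates above can be read as a ratio of cached to total prompt tokens. A minimal sketch, assuming the `cache_hit_rate` reported to PostHog is derived from the usage fields named in the summary (the exact formula in the PR is not shown here):

```typescript
// Assumed usage shape, mirroring the fields listed in the PostHog bullet.
interface CacheUsage {
  cache_read_tokens: number;
  cache_write_tokens: number;
  total_prompt_tokens: number;
}

// Hypothetical derivation: fraction of prompt tokens served from cache.
function cacheHitRate(u: CacheUsage): number {
  if (u.total_prompt_tokens === 0) return 0;
  return u.cache_read_tokens / u.total_prompt_tokens;
}
```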

Test plan

  • All 25 existing AnthropicCachingStrategies.test.ts unit tests pass (+ 2 new)
  • 3 live API tests in anthropic-caching.live.test.ts pass
  • 10 battle test scenarios in anthropic-caching-scenarios.live.test.ts pass
  • Validated fix impact by temporarily reverting and comparing hit rates
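The env-var guard described in the test plan can be sketched as follows; the helper name is hypothetical, but the idea matches the PR: live suites run only when a key is configured and are skipped otherwise (e.g. in CI).

```typescript
// Hypothetical helper: decide whether live Anthropic tests should run.
function shouldRunLiveTests(env: Record<string, string | undefined>): boolean {
  return Boolean(env.ANTHROPIC_API_KEY);
}

// In a *.live.test.ts file this condition could gate the whole suite, e.g.:
// describe.skipIf(!shouldRunLiveTests(process.env))("anthropic caching (live)", () => { /* ... */ });
```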

🤖 Generated with Claude Code


Continue Tasks: ❌ 7 failed


Summary by cubic

Adds turn-level prompt caching to the Anthropic systemAndTools strategy, records prompt cache metrics to PostHog, and adds live Anthropic API tests to keep cache hit rates high in multi-turn chats. Also makes the caching transform non-mutating, fixes minor test/telemetry issues, and ensures live API tests are excluded from CI.

  • New Features

    • Turn-level caching: caches the last two user messages (plus system + tools).
    • PostHog telemetry: records prompt_cache_metrics (cache_read/write_tokens, total_prompt_tokens, cache_hit_rate, tool_count).
    • Live API tests: end-to-end against Anthropic (guarded by ANTHROPIC_API_KEY, skipped in CI) covering tool use, parallel calls, long chats, large tool results, and cache invalidation.
  • Bug Fixes

    • Non-mutating transform: clone message content blocks before adding cache_control.
    • Telemetry: use void posthogService.capture to avoid unhandled async.
    • Tests: fix duplicate assistant messages in Scenario 3; add unit test asserting no input mutation; update adapter tests to expect cache_control on user content; add casts for Anthropic cache_read_tokens fields.
    • CI/Vitest: exclude *.live.test.ts and preserve default excludes via configDefaults.exclude.
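The telemetry fix in the bullets above (`void posthogService.capture`) is the standard pattern for intentionally fire-and-forget async calls. A minimal sketch with an in-memory capture, assuming the event and property names listed in the summary (the real service sends to PostHog):

```typescript
// In-memory stand-in for the PostHog service, for illustration only.
const captured: Array<{ event: string; props: Record<string, unknown> }> = [];

async function capture(event: string, props: Record<string, unknown>): Promise<void> {
  captured.push({ event, props });
}

// `void` marks the promise as deliberately unawaited, silencing
// no-floating-promises lint errors without blocking the stream path.
void capture("prompt_cache_metrics", {
  cache_read_tokens: 120,
  cache_write_tokens: 0,
  total_prompt_tokens: 130,
  cache_hit_rate: 120 / 130,
  tool_count: 3,
});
```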

Written for commit a754a33. Summary will update on new commits.

sestinj and others added 2 commits February 24, 2026 17:28
…he metrics + live API test

- Add addCacheControlToLastTwoUserMessages call in systemAndToolsStrategy
  so the CLI path caches conversation turns (matching VS Code extension behavior)
- Report prompt_cache_metrics to PostHog with cache_read/write tokens and hit rate
- Add live API integration test validating cache writes on turn 1 and 99%+ hit
  rates on subsequent turns (guarded by ANTHROPIC_API_KEY env var)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
7 distinct scenarios covering tool use round-trips, parallel tool calls,
long 8-turn conversations, large tool results (~200 lines), cache
invalidation on system message changes, identical request replays, and
multi-step agentic workflows with chained tool calls.

All scenarios validate cache hit rates >90% where expected and proper
cache misses when the system message changes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@sestinj sestinj requested a review from a team as a code owner February 25, 2026 01:42
@sestinj sestinj requested review from Patrick-Erichsen and removed request for a team February 25, 2026 01:42
@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Feb 25, 2026

@cubic-dev-ai cubic-dev-ai bot left a comment


5 issues found across 5 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="extensions/cli/src/stream/streamChatResponse.helpers.ts">

<violation number="1" location="extensions/cli/src/stream/streamChatResponse.helpers.ts:391">
P2: Missing `await` or `void` for an asynchronous function call.</violation>
</file>

<file name="packages/openai-adapters/src/apis/AnthropicCachingStrategies.test.ts">

<violation number="1" location="packages/openai-adapters/src/apis/AnthropicCachingStrategies.test.ts:213">
P1: The turn-level caching strategy allows the total number of `cache_control` breakpoints to exceed Anthropic's maximum limit of 4. If system and tools use 3-4 breakpoints, adding 2 more in user messages will exceed the limit. `addCacheControlToLastTwoUserMessages` must respect the `availableCacheMessages` counter.</violation>

<violation number="2" location="packages/openai-adapters/src/apis/AnthropicCachingStrategies.test.ts:242">
P2: `systemAndTools` now mutates the original `body.messages` array in place. Since `result` is only a shallow copy, `addCacheControlToLastTwoUserMessages(result.messages)` modifies the original input object. The `messages` array should be mapped or deep-cloned before modification.</violation>

<violation number="3" location="packages/openai-adapters/src/apis/AnthropicCachingStrategies.test.ts:292">
P2: Turn-level caching silently fails for string content messages. Instead of explicitly skipping them, the underlying implementation should convert string messages to an array format (with the `cache_control` block) so they can benefit from caching.</violation>
</file>

<file name="packages/openai-adapters/src/test/anthropic-caching-scenarios.live.test.ts">

<violation number="1" location="packages/openai-adapters/src/test/anthropic-caching-scenarios.live.test.ts:940">
P2: Duplicate assistant messages are pushed in the simulated conversation loop, resulting in two consecutive assistant messages per turn.</violation>
</file>
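Violation 1 above is about a breakpoint budget: Anthropic allows at most 4 `cache_control` breakpoints per request, so turn-level caching should only consume whatever budget remains after system and tools. A hedged sketch of that bookkeeping (names are hypothetical, not the PR's `availableCacheMessages` implementation):

```typescript
// Anthropic's documented per-request limit on cache_control breakpoints.
const MAX_CACHE_BREAKPOINTS = 4;

// How many breakpoints remain for user messages after the system prompt
// and tool definitions have claimed theirs.
function remainingCacheBudget(usedBySystemAndTools: number): number {
  return Math.max(0, MAX_CACHE_BREAKPOINTS - usedBySystemAndTools);
}
```

With this budget in hand, the turn-level step would cap itself at `min(2, remainingCacheBudget(...))` user messages instead of always marking two.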

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

sestinj and others added 3 commits February 24, 2026 17:55
The OpenAI PromptTokensDetails type doesn't include cache_read_tokens
(it's an Anthropic extension). Add 'as any' casts to fix TypeScript
build errors in the battle test file.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…essages

The systemAndTools caching strategy now adds cache_control to the last
two user messages via addCacheControlToLastTwoUserMessages(). Update
existing test expectations to include the cache_control field.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Clone message content blocks before mutating for cache_control
  (prevents mutation of original input body)
- Add `void` prefix to async posthogService.capture call
- Fix duplicate assistant messages in Scenario 3 test loop
- Add unit test verifying original body is not mutated

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@cubic-dev-ai cubic-dev-ai bot left a comment


1 issue found across 4 files (changes from recent commits).



<file name="packages/openai-adapters/src/test/anthropic-caching-scenarios.live.test.ts">

<violation number="1" location="packages/openai-adapters/src/test/anthropic-caching-scenarios.live.test.ts:956">
P2: Restoring the `exchanges[i + 1]` fallback prevents dead code and improves test resilience.</violation>
</file>


Live test files (*.live.test.ts) make real Anthropic API calls and are
intended for manual validation only. Exclude them from the default
vitest configuration to prevent flaky CI failures.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@cubic-dev-ai cubic-dev-ai bot left a comment


1 issue found across 1 file (changes from recent commits).



<file name="packages/openai-adapters/vitest.config.ts">

<violation number="1" location="packages/openai-adapters/vitest.config.ts:9">
P2: Overriding Vitest's `exclude` drops standard default ignores (like `**/dist/**`). Import `configDefaults` from `vitest/config` and spread `configDefaults.exclude` instead.</violation>
</file>


Preserve Vitest's default excludes (dist, node_modules, etc.) when
adding the live test exclusion pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
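The vitest change described in this commit can be sketched as below; the exclude pattern comes from the PR discussion, while the surrounding config shape is a minimal assumption about `packages/openai-adapters/vitest.config.ts`:

```typescript
import { configDefaults, defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    // Spread configDefaults.exclude so standard ignores (node_modules,
    // dist, etc.) are preserved, then add the live-test pattern on top.
    exclude: [...configDefaults.exclude, "**/*.live.test.ts"],
  },
});
```

Spreading `configDefaults.exclude` is the fix cubic requested: setting `exclude` outright replaces vitest's defaults instead of extending them.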
