
feat(tui): add tokens per second to response footer #12721

Open

JohnC0de wants to merge 3 commits into anomalyco:dev from JohnC0de:feat/tokens-per-second-display

Conversation

@JohnC0de commented Feb 8, 2026

Fixes #5374
Closes #6096

Adds a tok/s (TPS) counter to assistant message footers. Shows up right after duration, like: 18.3s · 131 tok/s

Why

I've been switching between providers a lot lately and wanted a quick way to see which models are actually fast vs which just feel fast. Kimi K2.5 clocks ~130 tok/s. Having the number right there makes the difference obvious without needing external tooling.

Screenshot

[Image: Kimi K2.5 Free hitting 198 tok/s on a real response]

Prior art

#5497 by @edlsh tackled this back in December. It's been sitting for 2+ months now with merge conflicts and CI failures, and a few people in the comments are asking for it to land. Rather than try to rebase that PR, I reimplemented it cleanly on current dev with a different structure: TPS logic lives in core/tokens/ instead of tui/util/ so the SDK and other consumers can use it later without pulling in TUI code.

How it works

processor.ts records a firstToken timestamp when the first output-delta arrives during streaming. TPS is then calculated as generatedTokens / ((completed - firstToken) / 1000), where generatedTokens includes both output and reasoning tokens. Responses shorter than 250ms, tool calls, and errored responses are filtered out.
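In code, it boils down to roughly the sketch below. This is illustrative only: the real implementation lives in core/tokens/tps.ts, and the type shapes and field names (other than firstToken) are assumptions for the sake of the example.

```typescript
// Minimal sketch of the TPS calculation described above; not the PR's actual types.

interface MessageTime {
  firstToken?: number // ms timestamp of the first output-delta during streaming
  completed?: number  // ms timestamp when the response finished
}

interface TokenUsage {
  output: number
  reasoning: number
}

const MIN_DURATION_MS = 250 // responses faster than this are filtered out as noise

function getMessageTPS(time: MessageTime, tokens: TokenUsage): number | undefined {
  // No TPS for responses that never streamed or never completed (e.g. errored)
  if (time.firstToken === undefined || time.completed === undefined) return undefined
  const elapsedMs = time.completed - time.firstToken
  if (elapsedMs < MIN_DURATION_MS) return undefined
  const generated = tokens.output + tokens.reasoning // both kinds count toward TPS
  return generated / (elapsedMs / 1000)
}
```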

What I left out

Average/aggregate TPS across a session. Both issues mention it but it felt like scope creep for a first pass. The per-message timestamps are all persisted, so adding a session-level summary later is straightforward.

Testing

34 unit tests cover calculation, edge cases, and filtering. All CI checks pass: typecheck, unit, e2e (linux), pr-standards.

@github-actions bot commented Feb 8, 2026

The following comment was made by an LLM; it may be inaccurate:

Potential Duplicate Found:

PR #5497 - "feat: display tokens per second for assistant messages"

Why it's related: This PR appears to be addressing the exact same feature - displaying tokens per second for assistant messages. It likely covers similar functionality for tracking and displaying TPS metrics in the UI.

@JohnC0de force-pushed the feat/tokens-per-second-display branch from 787aee0 to c54f23a on February 8, 2026 17:23

Adds TPS calculation and display to message footers. Tracks firstToken
timestamp during streaming and calculates throughput for completed text
responses. Filters out tool calls and fast responses to avoid noise.

Key features:
- Shows TPS next to duration: "3.4s · 45 tok/s"
- Includes both output and reasoning tokens
- 250ms minimum threshold to filter noise
- Comprehensive test coverage (34 tests)

Tested with Kimi K2.5 showing ~131 tok/s.

Fixes anomalyco#5374, Closes anomalyco#6096

@JohnC0de force-pushed the feat/tokens-per-second-display branch from c54f23a to 571c49b on February 8, 2026 17:30
@JohnC0de changed the title from "feat(tui): display tokens/second metric for assistant responses" to "feat: show tokens per second" on Feb 8, 2026
@JohnC0de changed the title from "feat: show tokens per second" to "feat(tui): add tokens per second to response footer" on Feb 8, 2026
@JohnC0de (author) commented Feb 8, 2026

@adamdotdevin @rekram1-node — the bot flagged this as a duplicate of #5497, so wanted to give some context.

I reviewed #5497 before starting. It has merge conflicts against dev and failing CI, and @edlsh hasn't been active on it since December. A few people in the comments there are asking for it to land. Rather than try to rebase someone else's branch, I reimplemented it cleanly on current dev with a different structure: TPS calculation lives in core/tokens/ instead of tui/util/ so the SDK and non-TUI consumers can reuse it.

Quick review guide if it helps:

  • tps.ts (83 lines) — the whole calculation. Pure functions, no side effects
  • processor.ts — only change is recording time.firstToken when the first output-delta arrives
  • session/index.tsx — swaps the old inline calculation for getMessageTPS() + formatTPS()
  • The rest (message-v2.ts, types.gen.ts, openapi.json) is schema + SDK regen for the new firstToken field
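
If it helps, the footer wiring is roughly the shape below. Only getMessageTPS() and formatTPS() are the PR's actual names; everything else here is illustrative.

```typescript
// Illustrative footer composition; the surrounding shapes are assumed for the sketch.

function formatTPS(tps: number): string {
  return `${Math.round(tps)} tok/s`
}

function footerText(durationSec: number, tps: number | undefined): string {
  const parts = [`${durationSec.toFixed(1)}s`]
  if (tps !== undefined) parts.push(formatTPS(tps)) // omitted when filtered out
  return parts.join(" · ") // e.g. "18.3s · 131 tok/s"
}
```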

Happy to adjust anything.

@KohliNaman commented Feb 8, 2026

I was just looking for this, checked it out on macOS, works flawlessly. Thanks a ton!

@Daltonganger

Any update on this?

@Daltonganger commented Feb 16, 2026

@rekram1-node I investigated the 3 failing checks on this PR.

Root cause:

  • e2e (windows) fails because Bun.which("rg") can resolve to an invalid POSIX-style path on Windows.
  • e2e (linux) fails because Bun.which("rg") can return a path that exists but is not spawnable (ENOENT at runtime).
  • test (linux) is a gate job and fails because upstream e2e jobs fail.

Proposed minimal fix (single-file change): packages/opencode/src/file/ripgrep.ts

  • Ignore POSIX rg paths on Windows.
  • Probe rg --version before trusting a resolved binary path; fallback to bundled/downloaded rg when unusable.

I can paste the exact patch here if useful.
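
Roughly, the idea looks like this (a hypothetical sketch under the assumptions above, not the exact patch):

```typescript
// Hypothetical sketch of the proposed probe; helper names are made up here.

function isUsableRg(path: string): boolean {
  // Ignore POSIX-style paths on Windows (e.g. "/usr/bin/rg"): Bun.which
  // can resolve them, but Windows cannot spawn them.
  if (process.platform === "win32" && path.startsWith("/")) return false
  try {
    // Probe `rg --version` before trusting the resolved path; a path can
    // exist on disk yet still fail to spawn (ENOENT at runtime).
    return Bun.spawnSync([path, "--version"], { stdout: "ignore", stderr: "ignore" }).success
  } catch {
    return false
  }
}

function resolveRg(bundledRgPath: string): string {
  const found = Bun.which("rg")
  // Fall back to the bundled/downloaded rg when the resolved binary is unusable.
  return found && isUsableRg(found) ? found : bundledRgPath
}
```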

@Daltonganger

@rekram1-node I opened a follow-up PR that includes all changes from this PR plus a minimal ripgrep path fix for the failing checks:
#13892

