
📊 Context & Token Management #317

@vasconceloscezar

Description

Summary

Provide visibility and control over LLM context window usage. Users currently get stuck when context fills up with no warning or recovery options.

Problems Identified

1. No Context Usage Visibility

Current: No indication of how much context window is used.
Impact: Users hit limits unexpectedly, conversations break.

Solution:

  • Context percentage indicator in chat header
  • Visual progress bar (green → yellow → red)
  • Token count display (used / max)
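
The indicator above could be computed as in this sketch (the function and field names are illustrative, not part of the issue; real token counts would come from a tokenizer such as tiktoken):

```python
def context_meter(used_tokens: int, max_tokens: int) -> dict:
    """Return the values a chat-header context indicator would display."""
    pct = min(100.0, used_tokens / max_tokens * 100)
    # Color bands matching the green -> yellow -> red progress bar.
    if pct < 80:
        color = "green"
    elif pct < 95:
        color = "yellow"
    else:
        color = "red"
    return {
        "percent": round(pct, 1),
        "color": color,
        "label": f"{used_tokens:,} / {max_tokens:,} tokens",
    }
```

For example, `context_meter(96_000, 128_000)` would report 75% usage in the green band.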

2. No Warning When Approaching Limit

Current: Context fills silently until failure.
Impact: Abrupt failures, lost conversation flow.

Solution:

  • Warning at 80% usage: "Context getting full"
  • Strong warning at 95%: "Consider compacting"
  • Suggested actions with each warning

3. No Auto-Compact Option

Current: No automatic handling of full context.
Impact: Users must manually manage or start over.

Solution:

  • Auto-compact toggle in settings
  • Configurable threshold (e.g., compact at 90%)
  • Smart compaction that preserves key context
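
The decision logic might look like this, assuming a settings object with the toggle and threshold (field names and the 90% default are illustrative):

```python
from dataclasses import dataclass

@dataclass
class AutoCompactSettings:
    enabled: bool = False        # the settings toggle
    threshold_pct: float = 90.0  # configurable: compact at this usage

def should_auto_compact(used_tokens: int, max_tokens: int,
                        settings: AutoCompactSettings) -> bool:
    """True when auto-compact is on and usage has crossed the threshold."""
    if not settings.enabled:
        return False
    return used_tokens / max_tokens * 100 >= settings.threshold_pct
```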

4. No Manual Compact Trigger

Current: Users can't manually compact a conversation.
Impact: No way to free up context mid-conversation.

Solution:

  • "Compact Conversation" button
  • Preview of what will be summarized
  • Option to exclude specific messages from compaction
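
The preview could partition the transcript into messages that would be summarized and messages that are kept, honoring user exclusions. The message shape and the "keep recent N" heuristic below are assumptions, not specified in the issue:

```python
def compact_preview(messages: list[dict], keep_recent: int = 4,
                    excluded_ids: frozenset = frozenset()) -> dict:
    """Split messages into those to summarize and those to keep verbatim."""
    keep, summarize = [], []
    recent_cutoff = max(0, len(messages) - keep_recent)
    for i, msg in enumerate(messages):
        if (msg["role"] == "system"            # always preserve system prompts
                or msg["id"] in excluded_ids   # user-excluded messages
                or i >= recent_cutoff):        # recent context stays verbatim
            keep.append(msg)
        else:
            summarize.append(msg)
    return {"summarize": summarize, "keep": keep}
```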

5. Getting Stuck With No Recovery

Current: When context is full, user is stuck.
Impact: Users must abandon the conversation, losing its context.

Solution:

  • Clear error message explaining the situation
  • One-click "Compact and Continue" action
  • Option to export conversation before compacting
  • "Start Fresh with Summary" option

Acceptance Criteria

  • Context usage percentage visible in chat
  • Warning appears when context approaches limit
  • Auto-compact option available in settings
  • Manual compact button works and shows preview
  • Clear recovery path when context is full

Technical Notes

Token Counting:

  • Use tiktoken or similar for accurate counts
  • Cache token counts per message
  • Consider streaming token updates

Compaction Strategy:

  • Summarize older messages
  • Preserve system prompts and recent context
  • Keep code blocks and important decisions
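
One way to realize the three points above (entirely a sketch: the summarizer would be an LLM call, stubbed here as a placeholder line, and the message shape is assumed):

````python
import re

def compact(messages: list[dict], keep_recent: int = 4) -> list[dict]:
    """Replace older non-system messages with a summary; keep the rest."""
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    system = [m for m in older if m["role"] == "system"]       # always preserved
    to_summarize = [m for m in older if m["role"] != "system"]
    # Carry fenced code blocks from the summarized span forward verbatim.
    code_blocks: list[str] = []
    for m in to_summarize:
        code_blocks += re.findall(r"```.*?```", m["content"], re.DOTALL)
    summary = {
        "role": "assistant",
        "content": "\n".join(["[Summary of earlier conversation]"] + code_blocks),
    }
    return system + ([summary] if to_summarize else []) + recent
````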

UI Components:

  • Context meter component
  • Compact dialog with preview
  • Settings toggle for auto-compact

Team Feedback Sources

  • Feedback 6: Context %, warnings, auto-compact, getting stuck

Priority

🟡 P1 - High (conversation-breaking issue)

    Labels

    enhancement (New feature or request)
