Summary
Improve session discovery and search by adding a persistent index layer, leveraging Claude Code's pre-computed session index, and extracting richer metadata from JSONL files.
This spec is intentionally implementation-agnostic — it describes what should change, not how to wire it into the current codebase.
1. Full-Text Search Index
Problem
The current fullText search tier performs line-by-line regex on live JSONL files. This is slow for large session histories (1000+ sessions) and doesn't support ranked results.
Requirements
- Index session content into a persistent store with full-text search capability (e.g. SQLite FTS5, or similar)
- Indexed fields: first_prompt, summary, user message text
- Search should return ranked results (relevance scoring)
- Index must stay in sync as sessions are created/updated — either via the existing file watcher or periodic rescan
- Incremental updates: only re-index sessions whose files have changed (mtime-based skip)
- Index should be rebuildable from scratch if corrupted or deleted
- Store index in ~/.freshell/ alongside existing config
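A minimal sketch of the indexing requirements above, assuming SQLite FTS5 through Python's built-in sqlite3 module (FTS5 has to be compiled into the local SQLite build, which is common but not guaranteed). The table names session_fts and indexed_files and both helper names are hypothetical and not taken from the current codebase.

```python
import sqlite3


def open_index(db_path=":memory:"):
    """Open (or create) the index. Deleting the file and calling this again rebuilds it."""
    conn = sqlite3.connect(db_path)
    conn.executescript(
        """
        -- Bookkeeping table: last indexed mtime per session file (hypothetical name).
        CREATE TABLE IF NOT EXISTS indexed_files (
            session_id TEXT PRIMARY KEY,
            mtime REAL NOT NULL
        );
        -- Full-text table over the three indexed fields from the spec.
        CREATE VIRTUAL TABLE IF NOT EXISTS session_fts USING fts5(
            session_id UNINDEXED,
            first_prompt,
            summary,
            user_text
        );
        """
    )
    return conn


def index_session(conn, session_id, mtime, first_prompt, summary, user_text):
    """Index one session. Returns False when the stored mtime matches (skip)."""
    row = conn.execute(
        "SELECT mtime FROM indexed_files WHERE session_id = ?", (session_id,)
    ).fetchone()
    if row is not None and row[0] == mtime:
        return False  # file unchanged since last index pass
    with conn:  # one transaction: replace the old FTS row and record the new mtime
        conn.execute("DELETE FROM session_fts WHERE session_id = ?", (session_id,))
        conn.execute(
            "INSERT INTO session_fts (session_id, first_prompt, summary, user_text) "
            "VALUES (?, ?, ?, ?)",
            (session_id, first_prompt, summary, user_text),
        )
        conn.execute(
            "INSERT OR REPLACE INTO indexed_files (session_id, mtime) VALUES (?, ?)",
            (session_id, mtime),
        )
    return True
```

Either the file watcher or a periodic rescan could call index_session with the file's current mtime; unchanged files cost one primary-key lookup.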
Behavior
- Title-tier search continues to work as today (fast client-side filter)
- userMessages and fullText tiers hit the index instead of scanning files
- Search results include match snippets (the line/context that matched)
- Partial results should be clearly indicated if indexing is still in progress
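As an illustration of the search behavior above, the following sketch runs a ranked query with match snippets, again assuming SQLite FTS5 via Python's sqlite3 module. The session_fts table, the build_demo_index and search helpers, and the shape of the returned dict are assumptions made for this example only.

```python
import sqlite3


def build_demo_index():
    """Create a tiny in-memory FTS5 table so the example is self-contained."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE session_fts USING fts5(session_id UNINDEXED, user_text)"
    )
    conn.executemany(
        "INSERT INTO session_fts (session_id, user_text) VALUES (?, ?)",
        [
            ("s1", "please fix the login bug"),
            ("s2", "refactor the build script"),
        ],
    )
    return conn


def search(conn, query, limit=20, indexing_in_progress=False):
    """Return ranked matches with snippets, plus a flag for partial results.

    `query` is passed to FTS5 as-is; real user input would need escaping,
    since characters such as quotes have meaning in FTS5 query syntax.
    """
    rows = conn.execute(
        """
        SELECT session_id,
               snippet(session_fts, 1, '[', ']', '...', 8) AS snip,
               bm25(session_fts) AS score
        FROM session_fts
        WHERE session_fts MATCH ?
        ORDER BY rank  -- FTS5's default rank is bm25; best match first
        LIMIT ?
        """,
        (query, limit),
    ).fetchall()
    return {
        "partial": indexing_in_progress,
        "results": [
            {"session_id": r[0], "snippet": r[1], "score": r[2]} for r in rows
        ],
    }
```

Calling search(build_demo_index(), "login") returns only session s1, with the matched term wrapped in square brackets inside the snippet.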
2. sessions-index.json Fast Path
Problem
On startup (and rescan), every JSONL file is opened and partially parsed to extract metadata. Claude Code maintains a sessions-index.json file that contains pre-computed session metadata, but we don't use it.
Requirements
- On startup/rescan, check for ~/.claude/projects/*/sessions-index.json
- If present and newer than our cached state, read session metadata from it instead of parsing individual JSONL files
- Fall back to JSONL parsing for sessions not present in the index (e.g. other providers, or if the index is stale/missing)
- Document the expected schema of sessions-index.json based on what Claude Code actually writes
Behavior
- Startup time should improve significantly for users with many sessions
- No user-visible behavior change — same session list, same metadata
- If sessions-index.json is malformed or missing, degrade gracefully to current behavior
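The fallback behavior could be sketched roughly as follows. Since the schema of sessions-index.json is still to be documented, this example treats the file as opaque JSON and only shows discovery and graceful degradation; find_index_files, read_index, load_project, and the parse_jsonl_files callback are hypothetical names.

```python
import json
from pathlib import Path


def find_index_files(claude_dir):
    """List every projects/*/sessions-index.json under the given directory."""
    return sorted(Path(claude_dir).glob("projects/*/sessions-index.json"))


def read_index(path):
    """Return the parsed JSON, or None if the file is missing or malformed."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except (OSError, ValueError):  # ValueError covers JSON and decoding errors
        return None


def load_project(project_dir, parse_jsonl_files):
    """Prefer the pre-computed index; fall back to the existing JSONL parse.

    `parse_jsonl_files` stands in for the current per-file parser and is
    only called when the index cannot be used.
    """
    data = read_index(Path(project_dir) / "sessions-index.json")
    if data is None:
        return "jsonl", parse_jsonl_files(project_dir)
    return "index", data
```

The freshness check against cached state is left out here; it would sit in front of read_index and compare the file's mtime with whatever the cache recorded.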
3. Richer Metadata Extraction
Problem
We currently extract title, summary, message count, cwd, and timestamps from JSONL files. There's more useful metadata available that would improve browsing and filtering.
Requirements
Token usage:
- Parse usage objects from assistant messages (input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens)
- Aggregate totals per session
- Display in session cards and detail views
- Useful for cost awareness and identifying expensive sessions
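A possible shape for the per-session aggregation, assuming each JSONL line is an object with type "assistant" and the usage object nested under message.usage. That nesting is an assumption and should be checked against real session files; aggregate_usage is a hypothetical helper.

```python
import json

# The four usage fields named in the requirements above.
USAGE_FIELDS = (
    "input_tokens",
    "output_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
)


def aggregate_usage(jsonl_lines):
    """Sum usage counters across assistant entries; bad lines are skipped."""
    totals = dict.fromkeys(USAGE_FIELDS, 0)
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except ValueError:
            continue  # tolerate truncated or corrupt lines
        if not isinstance(entry, dict) or entry.get("type") != "assistant":
            continue
        message = entry.get("message")  # assumed location of the usage object
        usage = message.get("usage") if isinstance(message, dict) else None
        if not isinstance(usage, dict):
            continue
        for field_name in USAGE_FIELDS:
            value = usage.get(field_name)
            if isinstance(value, int) and not isinstance(value, bool):
                totals[field_name] += value
    return totals
```

Sessions with no usage data simply produce all-zero totals, which keeps the field optional for display purposes.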
Git branch:
- Extract git branch from session metadata (typically in early system/config entries)
- Display as a badge on session cards
- Enable filtering/grouping by branch
Model info:
- Extract which model(s) were used during the session
- Display formatted model name (strip claude- prefix and date suffix, e.g. claude-opus-4-5-20251101 → opus-4-5)
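The formatting rule can be sketched as below, assuming the date suffix is always a trailing hyphen followed by eight digits. format_model_name is a hypothetical helper name.

```python
import re


def format_model_name(model_id):
    """Strip a leading 'claude-' and a trailing '-YYYYMMDD' date suffix."""
    name = model_id
    if name.startswith("claude-"):
        name = name[len("claude-"):]
    # Assumes the date suffix is exactly eight digits at the end of the ID.
    return re.sub(r"-\d{8}$", "", name)
```

Model IDs that do not match either pattern pass through unchanged, so models from other providers still display.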
Duration:
- Compute session duration from first to last entry timestamps
- Display as human-readable duration
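A small sketch of the duration computation, assuming entry timestamps are ISO 8601 strings (optionally ending in "Z"). The output format ("1h 5m", "42s") is only one possible choice, and both helper names are hypothetical.

```python
from datetime import datetime


def parse_timestamp(ts):
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC."""
    if ts.endswith("Z"):
        ts = ts[:-1] + "+00:00"  # older fromisoformat versions reject 'Z'
    return datetime.fromisoformat(ts)


def format_duration(first_ts, last_ts):
    """Human-readable span between the first and last entry timestamps."""
    delta = parse_timestamp(last_ts) - parse_timestamp(first_ts)
    seconds = max(0, int(delta.total_seconds()))  # clamp out-of-order entries
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}h {minutes}m"
    if minutes:
        return f"{minutes}m {secs}s"
    return f"{secs}s"
```

Note that this measures wall-clock span, so a session left idle overnight and resumed would show a long duration.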
Tool usage summary:
- Count tool invocations by type (Read, Write, Edit, Bash, Grep, etc.)
- Useful for understanding what a session did at a glance (e.g. "heavy editing session" vs "mostly research")
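Counting tool invocations might look like the following, assuming tool calls appear as content blocks with type "tool_use" and a name field inside message.content. That matches the Anthropic Messages API format, but whether session JSONL files store it the same way is an assumption to verify. count_tool_uses is a hypothetical helper.

```python
import json
from collections import Counter


def count_tool_uses(jsonl_lines):
    """Count tool_use content blocks by tool name across all entries."""
    counts = Counter()
    for line in jsonl_lines:
        try:
            entry = json.loads(line)
        except ValueError:
            continue  # skip blank or corrupt lines
        if not isinstance(entry, dict):
            continue
        message = entry.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue  # plain-string content carries no tool blocks
        for block in content:
            if isinstance(block, dict) and block.get("type") == "tool_use":
                name = block.get("name")
                if isinstance(name, str):
                    counts[name] += 1
    return counts
```

Because the result is a Counter, counts.most_common(3) gives a quick basis for an at-a-glance summary on the session card.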
Behavior
- All new metadata fields are optional — sessions missing them display gracefully
- Metadata is extracted during the existing parse pass (or from the FTS index if implemented)
- New fields available in search results and session cards
- No new API endpoints required — extend existing session metadata shape
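To make "extend existing session metadata shape" concrete, here is one hedged sketch of what the extended record could look like. Every field name below is hypothetical; the point is only that each new field has a default, so sessions missing the data still construct and display.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SessionMetadata:
    """Illustrative shape only; the real field names are up to the codebase."""

    session_id: str
    title: Optional[str] = None
    # New, optional fields from this spec:
    token_totals: Optional[Dict[str, int]] = None
    git_branch: Optional[str] = None
    models: List[str] = field(default_factory=list)
    duration_seconds: Optional[int] = None
    tool_counts: Dict[str, int] = field(default_factory=dict)


def branch_badge(meta):
    """Text for the branch badge, or an empty string when unknown."""
    return meta.git_branch or ""
```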
Non-Goals (for this spec)
- Session transcript viewer (full conversation rendering in a pane) — separate feature, builds on top of this indexing work
- Conversation tree/branching model — separate feature
- Conversation minimap — separate feature
- UI redesign of HistoryView — separate; this spec focuses on the data layer
Inspiration
claude-session-viewer uses SQLite + FTS5 with mtime-based skip optimization and a sessions-index.json fast path. Their metadata includes token usage, git branch, model name, and subagent counts. Worth referencing for schema decisions.
Open Questions
- Should the FTS index live in SQLite (proven, claude-session-viewer uses it) or something lighter (e.g. MiniSearch in-memory with serialization)?
- How much of user message content should be indexed? Full text vs first N characters?
- Should token costs be estimated in dollars (requires model pricing lookup) or just shown as raw token counts?