Summary
The chat currently waits for the full LLM response before displaying anything. Streaming would render tokens in real time as they are generated (as ChatGPT does), significantly improving perceived responsiveness.
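A minimal server-side sketch of one way to do this, assuming an Express backend and the official OpenAI Node SDK, relaying tokens to the browser as Server-Sent Events. The `/api/chat/stream` route, the model name, and the request shape are assumptions for illustration, not our actual stack:

```ts
import express from "express";
import OpenAI from "openai";

const app = express();
app.use(express.json());
const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Hypothetical endpoint: relays model tokens to the browser as Server-Sent Events.
app.post("/api/chat/stream", async (req, res) => {
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("Connection", "keep-alive");

  const stream = await client.chat.completions.create({
    model: "gpt-4o-mini",        // assumption: any chat model would work here
    messages: req.body.messages, // assumption: client sends OpenAI-style messages
    stream: true,                // ask the API for incremental deltas
  });

  // Forward each token delta as soon as it arrives instead of buffering.
  for await (const chunk of stream) {
    const token = chunk.choices[0]?.delta?.content ?? "";
    if (token) res.write(`data: ${JSON.stringify(token)}\n\n`);
  }
  res.write("data: [DONE]\n\n");
  res.end();
});

app.listen(3000);
```

The key point is that each delta is written to the response the moment it arrives, so the browser can start painting text while the model is still generating.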
Current vs Proposed
| Current | Proposed |
| --- | --- |
| Wait 5-10 seconds | Response starts appearing in ~1 second |
| Full text at once | Words appear as generated |
| Feels slow | Feels instant |
Acceptance Criteria
- First tokens appear within roughly 1 second of sending a message.
- Words render incrementally as the model generates them, rather than all at once.

I am ready to implement this feature; a rough client-side sketch is below.
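A browser-side consumer for the stream, assuming the hypothetical endpoint sketched above. It uses `fetch` rather than `EventSource` (which cannot send a POST body) and appends each token to the message element as it arrives:

```ts
// Hypothetical client: reads the SSE stream and renders tokens incrementally.
async function streamChat(messages: unknown[], output: HTMLElement): Promise<void> {
  const res = await fetch("/api/chat/stream", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ messages }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // SSE events are separated by a blank line.
    const events = buffer.split("\n\n");
    buffer = events.pop() ?? ""; // keep any incomplete event for the next read
    for (const event of events) {
      const data = event.replace(/^data: /, "");
      if (data === "[DONE]") return;
      output.textContent += JSON.parse(data); // render the token immediately
    }
  }
}
```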
/cc @visakh