dirkpetersen

🎤 feat: Improve speech-to-text with configurable silence timeout and text accumulation

Summary

  • Add configurable silence timeout (1-15s, default 8s) to prevent premature recording stops during thinking pauses
  • Implement text accumulation across speech recognition sessions to preserve previously spoken text
  • Add SilenceTimeoutSelector component with slider control in advanced speech settings
  • Enhance browser STT to accumulate text instead of replacing on each recognition cycle
  • Modify external STT to use configurable timeout instead of hardcoded 3-second limit
  • Add double-click functionality to microphone button for manual text clearing
  • Include clearAccumulatedText() methods for both browser and external STT implementations
  • Add localization strings for silence timeout and speech text cleared messages
  • Preserve accumulated text until successful message submission or manual clear
  • Add test cases and documentation

This resolves the issue where speech-to-text would delete previous text after pauses, allowing users to think while speaking without losing their words.
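To make the accumulation behaviour concrete, here is a minimal sketch, not the PR's actual code. Only the 1-15 s range and the 8 s default come from the summary above; `clampSilenceTimeout`, `TranscriptAccumulator`, and their methods are hypothetical names chosen for illustration.

```typescript
// Hypothetical constants: only the 1-15 s range and 8 s default come from the PR summary.
const MIN_SILENCE_TIMEOUT_MS = 1000;
const MAX_SILENCE_TIMEOUT_MS = 15000;
const DEFAULT_SILENCE_TIMEOUT_MS = 8000;

// Keep a user-supplied timeout inside the supported range.
// Missing or non-finite values fall back to the default.
function clampSilenceTimeout(ms: number | undefined): number {
  if (ms === undefined || !isFinite(ms)) {
    return DEFAULT_SILENCE_TIMEOUT_MS;
  }
  return Math.min(MAX_SILENCE_TIMEOUT_MS, Math.max(MIN_SILENCE_TIMEOUT_MS, ms));
}

// Hypothetical accumulator: appends each final transcript instead of replacing the text.
class TranscriptAccumulator {
  private accumulated = "";

  // Called when a recognition session yields a final transcript.
  append(finalText: string): string {
    const trimmed = finalText.trim();
    if (trimmed.length > 0) {
      this.accumulated = this.accumulated ? `${this.accumulated} ${trimmed}` : trimmed;
    }
    return this.accumulated;
  }

  // Text to display while the current session is still producing interim results.
  preview(interimText: string): string {
    const trimmed = interimText.trim();
    if (trimmed.length === 0) {
      return this.accumulated;
    }
    return this.accumulated ? `${this.accumulated} ${trimmed}` : trimmed;
  }

  // Manual clear, e.g. after a successful submission or a double-click.
  clear(): void {
    this.accumulated = "";
  }

  get text(): string {
    return this.accumulated;
  }
}
```

In this sketch the text survives any number of recognition cycles and is only reset by an explicit `clear()`, which mirrors the "preserve until submission or manual clear" rule described in the summary.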

🤖 Generated with Claude Code

Change Type

Please delete any irrelevant options.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update
  • Translation update

Dirk Petersen and others added 5 commits August 16, 2025 07:56
…text accumulation

- Add configurable silence timeout (1-15s, default 8s) to prevent premature recording stops during thinking pauses
- Implement text accumulation across speech recognition sessions to preserve previously spoken text
- Add SilenceTimeoutSelector component with slider control in advanced speech settings
- Enhance browser STT to accumulate text instead of replacing on each recognition cycle
- Modify external STT to use configurable timeout instead of hardcoded 3-second limit
- Add double-click functionality to microphone button for manual text clearing
- Include clearAccumulatedText() methods for both browser and external STT implementations
- Add localization strings for silence timeout and speech text cleared messages
- Preserve accumulated text until successful message submission or manual clear

This resolves the issue where speech-to-text would delete previous text after pauses, allowing users to think while speaking without losing their words.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Add tests for new SilenceTimeoutSelector component with slider functionality
- Add tests for updated useSpeechToTextBrowser hook with text accumulation logic
- Add tests for updated useSpeechToTextExternal hook with configurable timeout
- Add tests for enhanced AudioRecorder component with double-click clear functionality
- Add tests for new silenceTimeoutMs store setting and existing speech settings
- Add integration tests for Speech settings UI with advanced mode interactions
- Ensure test coverage for all new features: configurable silence timeout, text accumulation, and manual clearing

Tests cover component rendering, user interactions, state management, accessibility, and integration scenarios to ensure robust functionality of the speech-to-text improvements.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Update README.md with new speech features: configurable silence detection, text accumulation, and manual clearing
- Add CHANGELOG.md entry for the speech-to-text improvements in unreleased section
- Create SPEECH_FEATURES.md with comprehensive documentation covering:
  - Feature descriptions and usage instructions
  - Technical implementation details for both browser and external STT
  - Configuration options and accessibility features
  - Testing coverage and migration notes
  - Backwards compatibility information

Documentation provides clear guidance for users and developers on the enhanced speech-to-text functionality that addresses text deletion during thinking pauses.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Major bug fixes:
- Fix text accumulation logic that was replacing instead of appending text
- Fix accumulated text being cleared when toggling recording sessions
- Prevent concurrent recording sessions

Performance improvements:
- Reduce silence detection polling from 60 Hz to 10 Hz (83% fewer polls, reducing CPU use)
- Improve resource management with stream reuse
- Add throttling and debouncing for better efficiency
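The polling change can be illustrated with a small sketch. It assumes the detector is fed a volume level on each poll and reports when the silence has lasted at least the configured timeout; `SilenceDetector`, `POLL_INTERVAL_MS`, and the threshold value are hypothetical and not taken from the PR's code.

```typescript
// 10 Hz polling means one check every 100 ms (versus roughly every 16.7 ms at 60 Hz).
const POLL_INTERVAL_MS = 100;

// Hypothetical detector: timestamps are passed in so the logic stays independent of timers.
class SilenceDetector {
  private timeoutMs: number;
  private threshold: number;
  private lastSoundAt: number;

  constructor(timeoutMs: number, threshold: number, startMs: number) {
    this.timeoutMs = timeoutMs;
    this.threshold = threshold;
    this.lastSoundAt = startMs;
  }

  // Returns true once the level has stayed at or below the threshold
  // for at least `timeoutMs` since the last detected sound.
  update(level: number, nowMs: number): boolean {
    if (level > this.threshold) {
      this.lastSoundAt = nowMs;
      return false;
    }
    return nowMs - this.lastSoundAt >= this.timeoutMs;
  }
}
```

In a browser, such a detector would typically be driven by `setInterval(..., POLL_INTERVAL_MS)`, with the level read from an audio analyser on each tick. That wiring is omitted here because it depends on the recorder implementation.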

Enhanced error handling:
- Add comprehensive permission error handling with specific messages
- Implement network error recovery with retry logic
- Add offline state detection and handling
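As a rough illustration of "specific messages" for permission and offline failures, the sketch below maps an error name to a user-facing message. The error names are the standard `DOMException` names that `getUserMedia()` can reject with. The helper name, the message strings, and the `retryable` flag are assumptions made for this example, not the PR's localization strings.

```typescript
// Hypothetical result shape: the messages and the `retryable` flag are illustrative only.
interface MicErrorInfo {
  message: string;
  retryable: boolean;
}

// Map a getUserMedia() error name plus the current connectivity state to a message.
function describeMicrophoneError(errorName: string, online: boolean): MicErrorInfo {
  if (!online) {
    return { message: "You appear to be offline. Reconnect and try again.", retryable: true };
  }
  switch (errorName) {
    case "NotAllowedError":
      return { message: "Microphone access was denied. Allow it in your browser settings.", retryable: false };
    case "NotFoundError":
      return { message: "No microphone was found. Connect one and try again.", retryable: false };
    case "NotReadableError":
      return { message: "The microphone is in use by another application.", retryable: true };
    default:
      return { message: "Speech recognition failed. Please try again.", retryable: true };
  }
}
```

In a browser, `errorName` would come from `error.name` in the rejection of `navigator.mediaDevices.getUserMedia({ audio: true })`, and `online` from `navigator.onLine`.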

User experience improvements:
- Add mobile double-tap support with proper debouncing
- Preserve accumulated text across recording sessions
- Provide clear, actionable error messages
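Double-tap detection with debouncing can be sketched as follows. The 300 ms window and the `DoubleTapDetector` name are assumptions. The commit message does not state the interval the PR uses.

```typescript
// Assumed window: two taps within this many milliseconds count as a double tap.
const DOUBLE_TAP_WINDOW_MS = 300;

// Hypothetical detector: timestamps are passed in so it works for touch and click events alike.
class DoubleTapDetector {
  private windowMs: number;
  private lastTapAt: number | null = null;

  constructor(windowMs: number = DOUBLE_TAP_WINDOW_MS) {
    this.windowMs = windowMs;
  }

  // Returns true when this tap completes a double tap.
  registerTap(nowMs: number): boolean {
    if (this.lastTapAt !== null && nowMs - this.lastTapAt <= this.windowMs) {
      // Reset so that a third quick tap starts a new sequence instead of firing again.
      this.lastTapAt = null;
      return true;
    }
    this.lastTapAt = nowMs;
    return false;
  }
}
```

A microphone button handler could call `registerTap(Date.now())` on each tap and clear the accumulated text only when it returns `true`, leaving single taps free to start or stop recording.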

Testing and code quality:
- Fix test file extensions for JSX support
- Add comprehensive edge case test coverage
- Add detailed code documentation

This resolves the issue where speech-to-text would delete previous text after pauses, significantly improving the user experience.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Remove CLAUDE.md from version control (keep local copy)
- Add CLAUDE.md to .gitignore to prevent future tracking
- This file contains project-specific Claude Code configuration
Contributor

@github-advanced-security github-advanced-security bot left a comment

ESLint found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

@dustinhealy
Collaborator

Hi Dirk,

Thank you for your contribution! Could you please resolve the outstanding ESLint issues before I review?

@dirkpetersen
Author

Thanks, Dustin. I hope I'll have time to work on this over the weekend. Being able to think while speaking without losing my words is quite important to me.
