Optimize Session Startup by Parallelizing Component Initialization #904

Open
golbin opened this issue Dec 22, 2024 · 2 comments
@golbin
Contributor

golbin commented Dec 22, 2024

Description

Is this reporting a bug or feature request?
This is a feature request to optimize the session startup process.

Issue description

The current session startup process initializes components (TTS, LLM, STT, and Daily Join) sequentially, which results in higher latency before the session becomes operational. By refactoring the process to initialize these components in parallel, the startup time can be significantly reduced.

Expected behavior

The initialization of TTS, LLM, STT, and Daily Join should occur in parallel, resulting in a noticeable reduction in session startup time.

aconchillo self-assigned this Dec 22, 2024
@aconchillo
Contributor

aconchillo commented Dec 22, 2024

Thank you for reporting this, @golbin. The current behavior exists to make sure all the components are successfully initialized, in order, so that we don't push frames prematurely. But yes, we could probably initialize everything in parallel and simply wait for the last service to finish initializing before we start pushing frames.
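
A minimal sketch of that idea, assuming each component exposes an awaitable initialization step. The `init_component` function, the component names, and the delays are hypothetical stand-ins, not the project's actual API. `asyncio.gather` starts every initialization at once and returns only when all of them have finished, so frames are not pushed until the slowest service is ready.

```python
import asyncio
import time


async def init_component(name: str, delay: float) -> str:
    """Hypothetical stand-in for a real connect/join call.

    The sleep simulates network latency.
    """
    await asyncio.sleep(delay)
    return name


async def start_session() -> list[str]:
    # Hypothetical components and delays, for illustration only.
    components = {"tts": 0.05, "llm": 0.03, "stt": 0.04, "daily_join": 0.06}

    # Start every initialization at once and wait for all of them.
    # gather() re-raises the first exception, so a failed component
    # stops the session from starting.
    ready = await asyncio.gather(
        *(init_component(name, delay) for name, delay in components.items())
    )

    # Only at this point would it be safe to start pushing frames.
    return list(ready)


def main() -> None:
    started = time.perf_counter()
    ready = asyncio.run(start_session())
    elapsed = time.perf_counter() - started
    print(f"ready: {ready} after {elapsed:.3f}s")


if __name__ == "__main__":
    main()
```

With this shape, the total wait is roughly the slowest single initialization rather than the sum of all of them.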

@golbin
Contributor Author

golbin commented Dec 23, 2024

Thank you as always, @aconchillo 🙂

Yes, handling parallel processing might be a bit challenging, but it could significantly reduce the startup time.

If we can reuse a bot process, the startup could become extremely fast. Additionally, if we can maintain connections to TTS and STT even after the session ends, we could potentially onboard new users with almost no waiting time.
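
As a rough illustration of keeping connections warm, here is a sketch of a cache that lives in a long-running bot process and hands the same connection to each new session. `WarmConnections` and `fake_connect` are hypothetical names, and the sketch leaves out reconnecting after a dropped or failed connection, which a real version would need.

```python
import asyncio


class WarmConnections:
    """Hypothetical helper that keeps service connections open between sessions."""

    def __init__(self, connect):
        self._connect = connect  # async callable: service name -> connection
        self._tasks = {}

    async def get(self, name: str):
        # Create the connection task on first use. Later callers, including
        # concurrent ones, await the same task instead of reconnecting.
        if name not in self._tasks:
            self._tasks[name] = asyncio.ensure_future(self._connect(name))
        return await self._tasks[name]


async def demo():
    calls = []

    async def fake_connect(name: str):
        # Stand-in for opening a TTS or STT connection.
        calls.append(name)
        await asyncio.sleep(0.01)
        return {"service": name}

    pool = WarmConnections(fake_connect)
    first = await pool.get("tts")   # first session pays the connect cost
    second = await pool.get("tts")  # next session reuses the open connection
    return calls, first is second


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The cache must be created and used inside one event loop, which fits the idea of a single reused bot process.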

For a simpler approach, we could start by parallelizing only the Daily transport and joining. This alone could substantially reduce the waiting time.
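
A sketch of this simpler variant, assuming the join can be started as a background task while the rest of the setup stays sequential. `join_room` and `connect_services` are hypothetical stand-ins and do not call the real Daily or TTS APIs.

```python
import asyncio


async def join_room(events: list) -> None:
    """Hypothetical stand-in for the transport join."""
    events.append("join started")
    await asyncio.sleep(0.05)
    events.append("joined")


async def connect_services(events: list) -> None:
    """Hypothetical stand-in for the remaining sequential setup."""
    await asyncio.sleep(0.02)
    events.append("services connected")


async def start_session() -> list:
    events = []
    # Kick off the join first so it runs while the services connect.
    join_task = asyncio.create_task(join_room(events))
    await connect_services(events)
    # Wait for the join to finish before pushing any frames.
    await join_task
    events.append("pipeline started")
    return events


if __name__ == "__main__":
    print(asyncio.run(start_session()))
```

Only the join overlaps with the other steps here, so the existing initialization order of the services is left untouched.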

| Time | Event | Time difference (s) |
| --- | --- | --- |
| 2024-12-21 10:36:27.580 | Modules: Session started | 0.001 |
| 2024-12-21 10:36:27.580 | VAD: Set VAD parameters | 0.000 |
| 2024-12-21 10:36:27.649 | Audio: Silero VAD loaded | 0.069 |
| 2024-12-21 10:36:27.684 | Processors: Link frame processors | 0.035 |
| 2024-12-21 10:36:27.909 | Services: Connect to Cartesia | 0.225 |
| 2024-12-21 10:36:28.178 | Daily: Join session | 0.269 |
| 2024-12-21 10:36:28.616 | Daily: Joined session | 0.438 |
| 2024-12-21 10:36:29.042 | Modules: Sync session state | 0.426 |
| 2024-12-21 10:36:29.043 | Modules: Update session state | 0.001 |
| 2024-12-21 10:36:30.132 | Daily: Participant joined | 1.089 |
| 2024-12-21 10:36:30.133 | Modules: Participant joined session | 0.001 |
| 2024-12-21 10:36:30.334 | Pipeline: Start session with greeting | 0.201 |
| 2024-12-21 10:36:30.335 | Cartesia: Generate TTS for greeting | 0.001 |
| 2024-12-21 10:36:30.508 | Bot: Started speaking | 0.173 |
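
To see where the time goes, the timestamps above can be grouped into phases. The short script below copies five rows from the log and computes the elapsed time between them. The phase boundaries are my own choice for illustration and are not part of the original measurement.

```python
from datetime import datetime

# Five rows copied from the log above: (timestamp, event).
LOG = [
    ("2024-12-21 10:36:27.580", "Modules: Session started"),
    ("2024-12-21 10:36:27.909", "Services: Connect to Cartesia"),
    ("2024-12-21 10:36:28.616", "Daily: Joined session"),
    ("2024-12-21 10:36:30.132", "Daily: Participant joined"),
    ("2024-12-21 10:36:30.508", "Bot: Started speaking"),
]


def parse(timestamp: str) -> datetime:
    return datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S.%f")


def elapsed_ms(start: str, end: str) -> int:
    """Elapsed time between two log timestamps, in whole milliseconds."""
    return round((parse(end) - parse(start)).total_seconds() * 1000)


times = {event: timestamp for timestamp, event in LOG}

setup_ms = elapsed_ms(times["Modules: Session started"], times["Daily: Joined session"])
wait_ms = elapsed_ms(times["Daily: Joined session"], times["Daily: Participant joined"])
greet_ms = elapsed_ms(times["Daily: Participant joined"], times["Bot: Started speaking"])
total_ms = elapsed_ms(times["Modules: Session started"], times["Bot: Started speaking"])

print(setup_ms, wait_ms, greet_ms, total_ms)  # prints: 1036 1516 376 2928
```

In this run, about 1.04 s of the 2.93 s total passes between session start and the bot having joined, which is the part that parallel initialization could shorten.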
