Skip to content

Conversation

@AnkushMalaker
Copy link
Collaborator

@AnkushMalaker AnkushMalaker commented Jan 18, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Added Email Summarizer plugin with SMTP configuration for automatic email delivery of conversation summaries.
    • Implemented dynamic plugin discovery and configuration system for easier plugin management.
    • Added external source tracking for improved file deduplication.
  • Bug Fixes

    • Improved error handling for speaker recognition service unavailability.
    • Enhanced diarization settings validation with filtering of unknown parameters.
  • Documentation

    • Added comprehensive plugin development and configuration guides.
    • Added Email Summarizer plugin documentation with setup instructions.
    • Added security guidelines for secret management in configuration.
  • Configuration

    • Updated environment templates with plugin configuration variables.
    • Enhanced MongoDB and plugin configuration setup.

✏️ Tip: You can customize this high-level summary in your review settings.

- Introduced the Email Summarizer Plugin that automatically sends email summaries upon conversation completion.
- Implemented SMTP Email Service for sending emails, supporting HTML and plain text formats with TLS/SSL encryption.
- Added configuration options for SMTP settings in the .env.template and plugins.yml.template.
- Created comprehensive documentation for plugin development and usage, including a new plugin generation script.
- Enhanced testing coverage for the Email Summarizer Plugin and SMTP Email Service to ensure reliability and functionality.
- Removed the static PLUGINS dictionary and replaced it with a dynamic discovery mechanism for plugins.
- Implemented a new setup process for plugins, allowing for configuration via individual setup scripts.
- Added the Email Summarizer plugin with a dedicated setup script for SMTP configuration.
- Enhanced the main setup flow to support community plugins and their configuration.
- Cleaned up unused functions related to plugin configuration and streamlined the overall plugin setup process.
- Updated the .env.template to include new configuration options for the Home Assistant and Email Summarizer plugins, including server URLs, tokens, and additional settings.
- Refactored Docker Compose files to correctly mount plugin configuration paths.
- Introduced comprehensive documentation for plugin configuration architecture, detailing the separation of concerns for orchestration, settings, and secrets.
- Added individual configuration files for the Home Assistant and Email Summarizer plugins, ensuring proper management of non-secret settings and environment variable references.
- Improved the plugin loading process to merge configurations from multiple sources, enhancing flexibility and maintainability.
- Updated the plugin setup script to run interactively, enabling plugins to prompt for user input during configuration.
- Removed output capturing to facilitate real-time interaction and improved error messaging to include exit codes for better debugging.
- Introduced `setup_utils.py` containing functions for reading environment variables, prompting user input, and masking sensitive values.
- Refactored existing code in `wizard.py` and `init.py` to utilize these shared utilities, improving code reuse and maintainability.
- Updated documentation to include usage examples for the new utilities in plugin setup scripts, enhancing developer experience and clarity.
- Introduced a three-file separation for plugin configuration to improve security:
  - `backends/advanced/.env` for secrets (gitignored)
  - `config/plugins.yml` for orchestration with environment variable references
  - `plugins/{plugin_id}/config.yml` for non-secret defaults
- Updated documentation to emphasize the importance of using `${ENV_VAR}` syntax for sensitive data and provided examples of correct usage.
- Enhanced the Email Summarizer plugin setup process to automatically update `config/plugins.yml` with environment variable references, ensuring secrets are not hardcoded.
- Added new fields to the User model for notification email management and improved error logging in user-related functions.
- Refactored audio chunk utilities to use a consistent method for fetching conversation metadata.
- Added a new parameter `transcript_version_id` to the `open_conversation_job` function to support streaming transcript versioning.
- Enhanced error handling in `check_enrolled_speakers_job` and `recognise_speakers_job` to allow conversations to proceed even when the speaker service is unavailable, improving resilience.
- Updated `send_to_adv.py` to support dynamic WebSocket and HTTP protocols based on environment settings, enhancing configuration flexibility.
- Introduced a background task in `send_to_adv.py` to handle incoming messages from the backend, ensuring connection stability and logging interim results.
…nd error handling

- Updated the controller to filter out invalid settings instead of raising an error for each unknown key, allowing for more flexible input.
- Added a check to reject requests with no valid settings provided, enhancing robustness.
- Adjusted logging to reflect the filtered settings being saved.
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 18, 2026

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

This PR refactors Chronicle's plugin system and job tracking infrastructure. It introduces a three-file separation for secrets management, replaces audio_uuid tracking with external_source_id/external_source_type for external file deduplication, transitions job management from session-based to client-based identification, and adds a new Email Summarizer plugin with SMTP support and interactive setup wizard. Plugin discovery and configuration loading are now dynamic via shared utilities.

Changes

Cohort / File(s) Summary
Documentation & Configuration Architecture
CLAUDE.md, config/README.md, backends/advanced/Docs/plugin-configuration.md, backends/advanced/docs/plugin-development-guide.md, config/plugins.yml.template, backends/advanced/.env.template
Added comprehensive plugin configuration and security documentation describing three-file architecture (orchestration config, plugin-specific settings, .env secrets), step-by-step plugin creation guide, environment variable expansion, and best practices for secret handling. Updated template files with plugin configuration blocks, environment variable references, and clarifying comments on secret management.
Plugin Infrastructure & Discovery
backends/advanced/services/plugin_service.py, backends/advanced/scripts/create_plugin.py, wizard.py, setup_utils.py
Introduced dynamic plugin discovery via discover_plugins(), configuration loading via load_plugin_config() with env expansion, and new script create_plugin.py for boilerplate generation. Refactored wizard to discover and delegate plugin setup via discover_available_plugins() and run_plugin_setup(). Added shared utilities module setup_utils.py with interactive prompts, env value reading, password masking, and placeholder detection.
Email Summarizer Plugin
backends/advanced/src/advanced_omi_backend/plugins/email_summarizer/*.py, backends/advanced/src/advanced_omi_backend/plugins/email_summarizer/config.yml, backends/advanced/tests/test_email_*.py
New complete plugin with EmailSummarizerPlugin class triggering on conversation.complete events, SMTPEmailService for async email delivery via SMTP, HTML/text email formatting via templates.py, and setup.py wizard for interactive credential collection. Includes comprehensive README and unit test coverage.
Home Assistant Plugin Configuration
backends/advanced/src/advanced_omi_backend/plugins/homeassistant/config.yml
Added plugin-specific configuration file with environment variable references for API endpoint, token, wake word, and timeout settings.
Test Event Plugin Configuration
backends/advanced/src/advanced_omi_backend/plugins/test_event/config.yml
Added configuration for development plugin that logs all plugin events to SQLite.
Client-Based Job Management
backends/advanced/src/advanced_omi_backend/controllers/queue_controller.py, backends/advanced/src/advanced_omi_backend/routers/modules/queue_routes.py, backends/advanced/src/advanced_omi_backend/utils/conversation_utils.py
Replaced session-based job tracking with client-based identification by renaming all_jobs_complete_for_session()all_jobs_complete_for_client(), updating job metadata to use client_id, and refactoring dashboard endpoints to group by conversation_id instead of session_id. Updated start_post_conversation_jobs() signature to remove audio_uuid parameter and add optional client_id.
Audio Tracking Refactor
backends/advanced/src/advanced_omi_backend/models/conversation.py, backends/advanced/src/advanced_omi_backend/controllers/audio_controller.py, backends/advanced/src/advanced_omi_backend/utils/gdrive_audio_utils.py, backends/advanced/src/advanced_omi_backend/utils/audio_chunk_utils.py
Replaced audio_uuid with external_source_id and external_source_type for gdrive file deduplication. Updated create_conversation() signature and MongoDB index. Modified audio controller to track external sources and pass through new metadata. Changed audio chunk utilities from Conversation.get() to Conversation.find_one().
Worker Job Changes
backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py, backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py, backends/advanced/src/advanced_omi_backend/workers/speaker_jobs.py
Removed audio_uuid from job metadata throughout pipelines. Updated transcribe_full_audio_job() signature to remove audio_uuid parameter and increase timeouts to 3 hours. Enhanced speaker job error handling to gracefully skip when speaker service unavailable instead of failing. Added transcript_version_id to downstream jobs.
Conversation Data Model
backends/advanced/src/advanced_omi_backend/models/user.py, backends/advanced/src/advanced_omi_backend/models/conversation.py, backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py
Added notification_email field to User model across create/read/update classes. Removed audio_uuid and audio_path fields from conversation responses. Updated reprocess flow to exclude audio_uuid from job metadata.
WebSocket & Local Client
backends/advanced/src/advanced_omi_backend/controllers/websocket_controller.py, extras/local-omi-bt/send_to_adv.py
Removed audio_uuid generation in batch processing. Refactored local client to use configurable HTTPS/backend host, added SSL context handling, introduced async receive_handler() for message processing, and updated Wyoming protocol headers with audio format details.
Frontend Type & State Updates
backends/advanced/webui/src/pages/Archive.tsx, backends/advanced/webui/src/pages/Conversations.tsx, backends/advanced/webui/src/pages/ConversationsTimeline.tsx, backends/advanced/webui/src/pages/Queue.tsx
Made conversation_id required (non-optional) on Conversation interface, removed audio_path field. Updated Queue component to replace session-based grouping with conversation-based grouping, renaming state/refs from sessionId to conversationId. Updated all key props and filters to use conversation_id exclusively.
Docker & System Configuration
backends/advanced/docker-compose.yml, backends/advanced/docker-compose-test.yml, backends/advanced/src/advanced_omi_backend/main.py, backends/advanced/src/advanced_omi_backend/controllers/system_controller.py
Added MONGODB_URI environment variable to multiple services. Updated test volume mounts to /app/config/plugins.yml. Modified diarization settings filter to skip unknown keys instead of rejecting request. Minor import formatting.
Authentication & Initialization
backends/advanced/src/advanced_omi_backend/auth.py, backends/advanced/init.py
Added debug logging for existing admin detection. Refactored helper methods to delegate to setup_utils (password prompting, env value reading, masking, existing value detection). Extended setup_obsidian to explicitly disable memory when not enabled.
Testing & Robot Framework
tests/endpoints/audio_upload_tests.robot, tests/endpoints/conversation_tests.robot, tests/integration/plugin_event_tests.robot, tests/resources/audio_keywords.robot, tests/resources/conversation_keywords.robot, tests/resources/plugin_keywords.robot, backends/advanced/tests/test_email_*.py, backends/advanced/src/scripts/cleanup_state.py
Removed audio_uuid assertions from audio upload and conversation tests. Added end_reason verification for conversation.complete events and new WebSocket disconnect event test. Updated keyword parameters from ${audio_uuid} to ${conversation_id}. Added new plugin keywords for event verification. Improved error handling in cleanup_state with per-queue tracking. Added comprehensive Email Summarizer plugin unit tests.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant ConversationJob as Conversation Job Worker
    participant PluginRouter as Plugin Router
    participant EmailPlugin as Email Summarizer Plugin
    participant SMTP as SMTP Email Service
    participant MongoDB as MongoDB

    Client->>ConversationJob: trigger conversation.complete
    ConversationJob->>PluginRouter: dispatch_event(conversation.complete)
    PluginRouter->>EmailPlugin: on_conversation_complete(context)
    
    EmailPlugin->>MongoDB: _get_user_email(user_id)
    MongoDB-->>EmailPlugin: notification_email
    
    EmailPlugin->>EmailPlugin: _generate_summary(transcript)
    EmailPlugin->>EmailPlugin: _format_subject(created_at)
    
    EmailPlugin->>SMTP: send_email(to, subject, body_text, body_html)
    SMTP->>SMTP: _send_smtp(msg, to_email)
    SMTP-->>EmailPlugin: success/failure
    
    EmailPlugin-->>PluginRouter: PluginResult
    PluginRouter-->>ConversationJob: result
Loading
sequenceDiagram
    participant PluginService
    participant FileSystem as Plugin Directory
    participant PluginModule as Plugin Module
    participant OrchestrConfig as Orchestration Config
    participant EnvFile as .env File
    participant PluginInstance as Plugin Instance

    PluginService->>FileSystem: discover_plugins()
    FileSystem-->>PluginService: plugin list
    
    loop For each plugin
        PluginService->>PluginModule: import plugin class
        PluginModule-->>PluginService: BasePlugin subclass
        
        PluginService->>OrchestrConfig: read enabled, events, condition
        OrchestrConfig-->>PluginService: orchestration config
        
        PluginService->>PluginService: load_plugin_config(plugin_id, orch_config)
        PluginService->>EnvFile: expand ${ENV_VAR} references
        EnvFile-->>PluginService: expanded config
        
        PluginService->>PluginInstance: instantiate(merged_config)
        PluginInstance-->>PluginService: initialized plugin
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'Pre release candidate' is vague and generic, using non-descriptive terminology that does not clearly convey the primary changes in this substantial changeset. Replace with a more specific title reflecting the main change, such as 'Add plugin architecture and email summarizer plugin' or 'Implement dynamic plugin discovery and Email Summarizer plugin'.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 95.50% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@AnkushMalaker
Copy link
Collaborator Author

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 18, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 18

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
backends/advanced/src/advanced_omi_backend/controllers/websocket_controller.py (1)

418-424: Store streaming audio format for accurate finalize/flush.

_finalize_streaming_session reads client_state.stream_audio_format, but it’s never set, so it falls back to defaults. If clients use non-default formats, finalize/flush can mis-handle buffered audio.

🔧 Suggested fix
@@ async def _initialize_streaming_session(...):
-    # Store audio format in Redis session (not in ClientState)
+    # Store audio format in Redis session and keep a copy for finalize/flush
+    client_state.stream_audio_format = audio_format
     from advanced_omi_backend.services.audio_stream.producer import get_audio_stream_producer
@@ async def _finalize_streaming_session(...):
-        audio_format = getattr(client_state, 'stream_audio_format', {})
+        audio_format = getattr(client_state, 'stream_audio_format', {})
@@
-        if hasattr(client_state, 'stream_session_id'):
-            delattr(client_state, 'stream_session_id')
+        for attr in ('stream_session_id', 'stream_audio_format'):
+            if hasattr(client_state, attr):
+                delattr(client_state, attr)

Also applies to: 476-483, 510-513

backends/advanced/webui/src/pages/Queue.tsx (1)

209-212: Stale debug logging references audio_uuid.

The debug logging still references job.meta?.audio_uuid which has been removed from the backend. This won't cause errors (it will just log undefined), but the log statement is misleading.

🔧 Proposed fix to remove stale reference
       openConvJobs.forEach(job => {
         console.log(`  Job ID: ${job.job_id}`);
         console.log(`  Status: ${job.status}`);
-        console.log(`  meta.audio_uuid: ${job.meta?.audio_uuid}`);
         console.log(`  meta.conversation_id: ${job.meta?.conversation_id}`);
       });
backends/advanced/src/advanced_omi_backend/controllers/queue_controller.py (1)

411-507: Fix return value when speaker recognition is disabled.

If speaker_enabled is false, speaker_job stays None, but the return block dereferences speaker_job.id, which will raise AttributeError and abort the pipeline. Return None (or omit the key) and update typing accordingly.

🐛 Proposed fix
-    return {
-        'speaker_recognition': speaker_job.id,
-        'memory': memory_job.id,
-        'title_summary': title_summary_job.id,
-        'event_dispatch': event_dispatch_job.id
-    }
+    return {
+        'speaker_recognition': speaker_job.id if speaker_job else None,
+        'memory': memory_job.id,
+        'title_summary': title_summary_job.id,
+        'event_dispatch': event_dispatch_job.id
+    }
🤖 Fix all issues with AI agents
In `@backends/advanced/.env.template`:
- Around line 104-106: The FROM_NAME value in the dotenv template can be split
by spaces during parsing; update the dotenv template so the FROM_NAME assignment
uses quotes (e.g., FROM_NAME="Chronicle AI") to ensure the entire value is read
as one string; locate the FROM_NAME entry in the dotenv template (the FROM_NAME
variable shown alongside FROM_EMAIL) and surround its value with double quotes.

In `@backends/advanced/Docs/plugin-configuration.md`:
- Around line 72-81: The HA_TOKEN example on the Docs/plugin-configuration.md
contains a JWT-like string that triggers secret scanners; replace the
token-looking example value (the HA_TOKEN entry) with a clearly non-secret
placeholder like "<YOUR_HOME_ASSISTANT_TOKEN>" or "REPLACE_WITH_TOKEN" so the
example is obviously not a real credential and will not be flagged by gitleaks.
- Around line 83-95: The fenced code block under the "Configuration Loading
Process" section in plugin-configuration.md is unlabeled and triggers
markdownlint MD040; add a language identifier (e.g., ```text) to the opening
fence so the block becomes a labeled fenced code block and close it with the
matching backticks, ensuring the block around the four-step list is updated
accordingly.

In `@backends/advanced/docs/plugin-development-guide.md`:
- Around line 41-47: The fenced code block showing the plugin directory tree
(the block that begins with ``` and lines "plugins/my_awesome_plugin/ ...
README.md") is missing a language tag causing MD040; update that fenced block to
use a language tag (use "text") so it starts with ```text and ends with ``` to
satisfy markdownlint and keep the directory tree unchanged.

In `@backends/advanced/scripts/create_plugin.py`:
- Around line 46-53: The script currently skips deletion when --force is set and
simply writes template files into plugin_dir (see plugin_dir.exists(),
plugin_dir.mkdir(...), plugin_name, force), leaving stale files; update the
logic so that when force is true and plugin_dir.exists() you recursively remove
the existing directory (e.g., use shutil.rmtree on plugin_dir) before creating
the fresh plugin_dir, optionally prompting the user for confirmation or honoring
a --yes/--force flag to skip confirmation, then proceed to
plugin_dir.mkdir(parents=True, exist_ok=True) and write the template files.

In `@backends/advanced/src/advanced_omi_backend/auth.py`:
- Around line 227-229: The three debug messages using logger.error that
reference existing_admin/id/user_id should be removed or downgraded to
logger.debug to avoid polluting error logs; update the calls around logger (in
the block that logs existing_admin.id and existing_admin.user_id) to either
delete those logger.error lines entirely or replace them with logger.debug(...)
and consolidate redundant output (omit the duplicate user_id log since
User.user_id returns str(self.id)). Ensure you only change the logging level or
remove the statements and do not alter the underlying existing_admin or User
model behavior.

In `@backends/advanced/src/advanced_omi_backend/controllers/audio_controller.py`:
- Around line 32-36: In generate_client_id replace the error-level PII log with
a debug-level, non-PII message: remove logging of user.email and instead log
only non-sensitive info such as the masked or truncated user id and its type
(e.g., logger.debug with user_id_suffix and type(user.id)) or remove logging
entirely; update the call in generate_client_id so it no longer emits email at
error level and uses logger.debug without PII.

In `@backends/advanced/src/advanced_omi_backend/models/conversation.py`:
- Around line 381-384: The factory is injecting "memories" and "memory_count"
into the dict for the Conversation model, but Conversation has no "memories"
attribute and memory_count is a `@computed_field` (derived from
active_memory.memory_count), so these keys are ignored by Pydantic/Beanie;
either remove the lines setting "memories": [] and "memory_count": 0 from the
factory that constructs Conversation, or if they were intentionally left for
backwards compatibility, add a clear comment in that factory explaining why
these unused keys are preserved (or instead add real fields to the Conversation
model if they must be persisted).

In
`@backends/advanced/src/advanced_omi_backend/plugins/email_summarizer/email_service.py`:
- Around line 113-130: The SMTP connection in _send_smtp is created without a
timeout and can hang; update both places that instantiate the SMTP client (the
smtplib.SMTP(...) calls inside the if self.use_tls and the else branch that
assigns smtp_server) to pass a timeout (e.g., timeout=self.timeout or
timeout=10) so the socket won't block indefinitely—mirror how
_test_smtp_connection uses timeout=10; ensure any SMTP_SSL instantiation
(if/when used) also receives the same timeout.

In
`@backends/advanced/src/advanced_omi_backend/plugins/email_summarizer/README.md`:
- Around line 56-83: The README's Step 3 currently shows SMTP and email options
under config/plugins.yml which conflicts with the three-file config model
described in backends/advanced/Docs/plugin-configuration.md; update the README
and its example to place plugin-specific settings into
plugins/email_summarizer/config.yml (keeping config/plugins.yml only for
orchestration/enabled flag and events/condition), or alternatively update
plugin-configuration.md to document that plugin-specific overrides may live in
config/plugins.yml—ensure the example in README uses the same canonical location
(plugins/email_summarizer/config.yml) and mention where to set only the enabled
flag/events (config/plugins.yml) so the two docs are consistent.
- Around line 110-126: The fenced code blocks in README.md (sections "Subject
Line", "Email Body", and the file-tree under plugins/email_summarizer/) are
missing language identifiers; update each triple-backtick fence around the
example subject, email body, and directory listing to use a language tag (e.g.,
```text) so markdown lint MD040 is satisfied, ensuring you modify the code
fences in the Subject Line and Email Body examples and the
plugins/email_summarizer/ file-tree block (also apply the same change to the
similar blocks around lines referenced in the PR comment).

In
`@backends/advanced/src/advanced_omi_backend/plugins/email_summarizer/templates.py`:
- Around line 55-56: The transcript and summary are inserted into HTML without
escaping, creating XSS risk; update the code in templates.py to HTML-escape
user-generated strings (both transcript and summary) using a safe escaper (e.g.,
html.escape(..., quote=True)) before converting newlines to <br> and before
embedding into the template; ensure you call escape on transcript first, then do
transcript.replace('\n','<br>') to produce transcript_html and likewise escape
summary, and use those escaped variables in the template rendering.

In `@backends/advanced/src/advanced_omi_backend/utils/gdrive_audio_utils.py`:
- Around line 92-96: The dedupe check in gdrive_audio_utils.py only filters by
Conversation.external_source_id and external_source_type, causing cross-user
skips; modify the function that contains this check to accept a user_id param,
include Conversation.user_id == user_id in the find_one filter alongside the
existing conditions, and update the caller in audio_routes.py (where the
function is invoked around line 37) to pass current_user.id (or the appropriate
identifier). Ensure the function signature and all call sites are updated
consistently so deduplication is scoped per user.

In `@backends/advanced/src/scripts/cleanup_state.py`:
- Around line 611-615: The log message currently always states "Cleared total of
{total_jobs} Redis jobs" which can be misleading when some queues failed; update
the logging logic around stats.redis_jobs_count/total_jobs and failed_queues so
that if failed_queues is non-empty you either (a) change the info message to
explicitly say it reflects only successfully cleaned queues (e.g., "Cleared
total of {total_jobs} Redis jobs from successfully cleaned queues") or (b)
suppress the info and replace it with a single warning that includes both
total_jobs and the list failed_queues; ensure the logger.info/logger.warning
calls reference the same total_jobs/stat value and that failed_queues is
included in the warning message to make the partial failure explicit.

In `@config/README.md`:
- Around line 110-122: The fenced code block that documents config files
(containing lines like "config/plugins.yml", "backends/advanced/.env", and
"plugins/{id}/config.yml") is missing a language tag and triggers markdownlint
MD040; update the opening fence from ``` to include a language (e.g., ```text)
so the block becomes ```text and keep the closing ``` as-is to satisfy the
linter.

In `@extras/local-omi-bt/send_to_adv.py`:
- Line 47: The httpx.AsyncClient call in send_to_adv.py currently disables SSL
verification (verify=False); change this to be configurable (e.g., read a
VERIFY_SSL boolean env var or reuse existing SSL context logic used for
WebSocket) and pass the resulting boolean or ssl context into httpx.AsyncClient,
and add a short comment explaining why disabling verification may be acceptable
for self-signed certs in dev but must be enabled in prod; update the AsyncClient
instantiation in the async with block to use that config (instead of
verify=False) and ensure tests or startup logs reflect the configured value.
- Around line 14-24: The code reads BACKEND_HOST and USE_HTTPS before calling
load_dotenv(), so .env values are not applied; move the load_dotenv(env_path)
call to run before reading BACKEND_HOST and USE_HTTPS (i.e., place it above
where BACKEND_HOST and USE_HTTPS are defined) so websocket_uri and backend_url
are built using environment variables loaded from the .env file.

In `@setup_utils.py`:
- Around line 92-98: In mask_value, the length check uses the original value
while the masking uses value_clean (stripped), causing inconsistency for values
with surrounding whitespace; change the code to compute value_clean =
value.strip() first and then use value_clean in the conditional (and subsequent
masking/substring operations) so the length check and masking operate on the
same trimmed string (refer to mask_value and its return calculation).
🧹 Nitpick comments (9)
backends/advanced/src/advanced_omi_backend/controllers/system_controller.py (1)

287-317: Surface ignored keys to avoid silent partial updates.

Skipping unknown keys is fine for round-trip payloads, but silent ignores can hide typos and create partial updates. Consider returning or logging the ignored keys so clients can self-correct.

♻️ Proposed tweak
-        filtered_settings = {}
+        filtered_settings = {}
+        ignored_keys = []
         for key, value in settings.items():
             if key not in valid_keys:
-                continue  # Skip unknown keys instead of rejecting
+                ignored_keys.append(key)
+                continue  # Skip unknown keys instead of rejecting
@@
         if save_diarization_settings(current_settings):
             logger.info(f"Updated and saved diarization settings: {filtered_settings}")
 
             return {
                 "message": "Diarization settings saved successfully",
                 "settings": current_settings,
+                "ignored_keys": ignored_keys,
                 "status": "success"
             }
extras/local-omi-bt/send_to_adv.py (3)

69-70: Bare except clause catches all exceptions.

This catches BaseException including SystemExit and KeyboardInterrupt. Catch the specific exception type instead.

♻️ Proposed fix
             try:
                 error_data = response.json()
                 error_msg = error_data.get('detail', error_msg)
-            except:
+            except (json.JSONDecodeError, KeyError):
                 pass

141-141: Token in URL query parameter may appear in server logs.

While passing JWT tokens via query parameters is common for WebSocket authentication (since WebSocket doesn't support custom headers during handshake), be aware that the token may be logged by proxies or the server's access logs.

Consider documenting this trade-off or ensuring server-side logging sanitizes query parameters.


206-221: Consider waiting for server acknowledgment after audio-stop.

After sending audio-stop, the connection closes immediately via the finally block. The server may not have finished processing the final audio chunks. Consider adding a brief wait or listening for a completion acknowledgment before cleanup.

♻️ Suggested approach
             await websocket.send(json.dumps(audio_stop) + '\n')
             logger.info(f"Sent audio-stop event. Total chunks: {chunk_count}")

+            # Give server time to process final chunks before closing
+            await asyncio.sleep(1.0)
+
         finally:
             # Clean up receive task
             receive_task.cancel()

Alternatively, if the backend sends a completion message, wait for that instead of a fixed sleep.

backends/advanced/src/advanced_omi_backend/workers/speaker_jobs.py (1)

281-297: Keep success payload keys consistent on skip path.

The skip-success response omits segment_count, which is present in the normal success payload. If any downstream expects it, this can trigger KeyErrors. Consider returning a default segment_count: 0 (and other expected keys) to preserve the response schema.

🛠️ Suggested patch
                 return {
                     "success": True,
                     "conversation_id": conversation_id,
                     "version_id": version_id,
                     "speaker_recognition_enabled": True,
                     "speaker_service_unavailable": True,
                     "identified_speakers": [],
+                    "segment_count": 0,
                     "skip_reason": f"Speaker service unavailable: {error_type}",
                     "processing_time_seconds": time.time() - start_time
                 }
backends/advanced/scripts/create_plugin.py (1)

31-34: Optional: Remove extraneous f prefixes from strings without placeholders.

Several print statements use f-strings but contain no placeholders. While harmless, removing the f prefix improves clarity.

♻️ Example fix
-        print(f"❌ Error: Plugin name must be alphanumeric with underscores")
+        print("❌ Error: Plugin name must be alphanumeric with underscores")
         print(f"   Got: {plugin_name}")
-        print(f"   Example: my_awesome_plugin")
+        print("   Example: my_awesome_plugin")

Apply similar changes to lines 369-394 where strings have no placeholders.

Also applies to: 369-394

backends/advanced/src/advanced_omi_backend/plugins/email_summarizer/email_service.py (1)

120-122: Consider using SMTP_SSL for port 465.

The comment mentions port 465 SSL, but smtplib.SMTP is used instead of smtplib.SMTP_SSL. If users configure port 465 with smtp_use_tls=False, the connection may fail since port 465 typically requires implicit SSL from the start.

♻️ Proposed fix to handle implicit SSL
         else:
-            # Direct connection (for port 465 SSL or no encryption)
-            smtp_server = smtplib.SMTP(self.host, self.port)
+            # Direct connection - use SMTP_SSL for port 465, plain SMTP otherwise
+            if self.port == 465:
+                smtp_server = smtplib.SMTP_SSL(self.host, self.port, timeout=30)
+            else:
+                smtp_server = smtplib.SMTP(self.host, self.port, timeout=30)
backends/advanced/src/advanced_omi_backend/plugins/email_summarizer/plugin.py (1)

47-47: Make SUPPORTED_ACCESS_LEVELS immutable and a ClassVar.

A mutable list at class scope can be changed across instances; using ClassVar + tuple prevents accidental mutation and aligns with typing expectations.

♻️ Proposed change
-from typing import Any, Dict, List, Optional
+from typing import Any, Dict, List, Optional, ClassVar, Tuple
@@
-    SUPPORTED_ACCESS_LEVELS: List[str] = ['conversation']
+    SUPPORTED_ACCESS_LEVELS: ClassVar[Tuple[str, ...]] = ("conversation",)
wizard.py (1)

288-309: Run plugin setup from the plugin directory to avoid relative-path breakage.

If a plugin’s setup.py relies on relative paths (templates/configs), running it from repo root can fail. Consider setting cwd to the plugin directory and pass an absolute requirements path. Please verify existing plugin setup scripts to confirm assumptions.

♻️ Proposed change
-        result = subprocess.run(
-            ['uv', 'run', '--with-requirements', 'setup-requirements.txt', 'python', str(setup_path)],
-            cwd=str(Path.cwd())
-        )
+        repo_root = Path.cwd()
+        requirements = repo_root / "setup-requirements.txt"
+        result = subprocess.run(
+            ['uv', 'run', '--with-requirements', str(requirements), 'python', str(setup_path)],
+            cwd=str(plugin_info['dir'])
+        )

…rity

- Updated `mask_value` function to handle whitespace more effectively.
- Enhanced `create_plugin` to remove existing directories when using the `--force` option.
- Changed logging level from error to debug for existing admin user checks.
- Improved client ID generation logging for clarity.
- Removed unused fields from conversation creation.
- Added HTML escaping in email templates to prevent XSS attacks.
- Updated audio file download function to include user ID for better tracking.
- Adjusted WebSocket connection settings to respect SSL verification based on environment variables.
- Removed `auto_generate_client` and `folder` parameters from audio upload functions to streamline the API.
- Updated related function calls and documentation to reflect these changes, enhancing clarity and reducing complexity.
…nd security

- Removed outdated migration instructions from `plugin-configuration.md` to streamline documentation.
- Enhanced `README.md` to clearly outline the three-file separation for plugin configuration, emphasizing the roles of `.env`, `config.yml`, and `plugins.yml`.
- Updated `setup.py` to reflect changes in orchestration settings, ensuring only relevant configurations are included in `config/plugins.yml`.
- Improved security messaging to highlight the importance of not committing secrets to version control.
@AnkushMalaker AnkushMalaker marked this pull request as ready for review January 18, 2026 20:39
…t variable syntax for improved flexibility and security. This change standardizes the way API keys are referenced across different models and services. (#273)

Co-authored-by: roshan.john <[email protected]>
@AnkushMalaker AnkushMalaker requested a review from 0xrushi January 18, 2026 20:41
- Replaced total job count with separate counters for successful and failed jobs during Redis queue cleanup.
- Enhanced logging to provide detailed feedback on the number of jobs cleared and any failures encountered.
- Improved error handling to ensure job counts are accurately reflected even when exceptions occur.
…ed container status check

- Replaced 'docker logs' commands with 'docker compose -f docker-compose-test.yml logs' for consistency across workflows.
- Added a check for running containers before saving logs to enhance debugging capabilities.
@0xrushi
Copy link
Contributor

0xrushi commented Jan 18, 2026

@coderabbitai Check for imports placed mid-file and enforce that all imports must be declared at the top of the file

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 18, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

I'll check all Python files in this PR for imports placed mid-file.


🏁 Script executed:

#!/bin/bash
# Get all Python files changed in this PR
git diff --name-only origin/feature/annotations...HEAD | grep '\.py$' > /tmp/changed_python_files.txt

# Create a Python script to check import placement
cat > /tmp/check_imports.py << 'SCRIPT'
import ast
import sys
from pathlib import Path

def check_imports(filepath):
    """Check if imports are at the top of the file."""
    try:
        with open(filepath, 'r', encoding='utf-8') as f:
            content = f.read()
        
        if not content.strip():
            return []
        
        tree = ast.parse(content, filename=str(filepath))
        
        violations = []
        first_non_import_line = None
        
        for node in tree.body:
            line_no = node.lineno
            
            # Skip module docstrings at the beginning
            if isinstance(node, ast.Expr) and isinstance(node.value, (ast.Str, ast.Constant)):
                if first_non_import_line is None:
                    continue
            
            # Check for imports
            if isinstance(node, (ast.Import, ast.ImportFrom)):
                if first_non_import_line is not None:
                    # This import comes after non-import code
                    import_names = []
                    if isinstance(node, ast.Import):
                        import_names = [alias.name for alias in node.names]
                    else:
                        module = node.module or ''
                        import_names = [f"{module}.{alias.name}" for alias in node.names]
                    
                    violations.append({
                        'line': line_no,
                        'imports': import_names,
                        'first_code_line': first_non_import_line
                    })
            else:
                # This is non-import code
                if first_non_import_line is None:
                    first_non_import_line = line_no
        
        return violations
    except SyntaxError as e:
        return [{'error': f'Syntax error: {e}'}]
    except Exception as e:
        return [{'error': f'Error parsing file: {e}'}]

if __name__ == '__main__':
    files = []
    with open('/tmp/changed_python_files.txt', 'r') as f:
        files = [line.strip() for line in f if line.strip()]
    
    found_violations = False
    
    for filepath in files:
        path = Path(filepath)
        if not path.exists():
            continue
            
        violations = check_imports(filepath)
        
        if violations:
            found_violations = True
            print(f"\n📁 {filepath}")
            for v in violations:
                if 'error' in v:
                    print(f"  ❌ {v['error']}")
                else:
                    print(f"  ⚠️  Line {v['line']}: Import(s) {', '.join(v['imports'])} found after code at line {v['first_code_line']}")
    
    if not found_violations:
        print("✅ No mid-file imports found. All imports are at the top of their files.")
    
    sys.exit(0)
SCRIPT

# Run the check
python /tmp/check_imports.py

Length of output: 285


Let me try a different approach by checking Python files directly in the repository:


🏁 Script executed:

#!/bin/bash
# Find all Python files in the backends/advanced directory (where most changes are)
fd -e py . backends/advanced/src backends/advanced/scripts backends/advanced/tests extras setup_utils.py wizard.py backends/advanced/init.py 2>/dev/null > /tmp/python_files.txt || true

# Create Python script to check import placement
cat > /tmp/check_imports.py << 'SCRIPT'
import ast
import sys
from pathlib import Path

def check_imports(filepath):
    """Check if imports are at the top of the file."""
    try:
        with open(filepath, 'r', encoding='utf-8') as f:
            content = f.read()
        
        if not content.strip():
            return []
        
        tree = ast.parse(content, filename=str(filepath))
        
        violations = []
        first_non_import_line = None
        
        for node in tree.body:
            line_no = node.lineno
            
            # Skip module docstrings and __future__ imports at the beginning
            if isinstance(node, ast.Expr) and isinstance(node.value, (ast.Str, ast.Constant)):
                if first_non_import_line is None:
                    continue
            
            # Check for imports
            if isinstance(node, (ast.Import, ast.ImportFrom)):
                if first_non_import_line is not None:
                    # This import comes after non-import code
                    import_names = []
                    if isinstance(node, ast.Import):
                        import_names = [alias.name for alias in node.names]
                    else:
                        module = node.module or ''
                        import_names = [f"from {module} import {', '.join(alias.name for alias in node.names)}"]
                    
                    violations.append({
                        'line': line_no,
                        'imports': import_names,
                        'first_code_line': first_non_import_line
                    })
            else:
                # This is non-import code (excluding module docstrings)
                if first_non_import_line is None:
                    first_non_import_line = line_no
        
        return violations
    except SyntaxError as e:
        return [{'error': f'Syntax error: {e}'}]
    except Exception as e:
        return [{'error': f'Error parsing file: {e}'}]

if __name__ == '__main__':
    files = []
    try:
        with open('/tmp/python_files.txt', 'r') as f:
            files = [line.strip() for line in f if line.strip()]
    except FileNotFoundError:
        print("No Python files found to check.")
        sys.exit(0)
    
    found_violations = False
    
    for filepath in files:
        path = Path(filepath)
        if not path.exists():
            continue
            
        violations = check_imports(filepath)
        
        if violations:
            found_violations = True
            print(f"\n📁 {filepath}")
            for v in violations:
                if 'error' in v:
                    print(f"  ❌ {v['error']}")
                else:
                    print(f"  ⚠️  Line {v['line']}: Import found after code (first code at line {v['first_code_line']})")
                    for imp in v['imports']:
                        print(f"      {imp}")
    
    if not found_violations:
        print("✅ No mid-file imports found. All imports are properly placed at the top of their files.")
    else:
        print("\n" + "="*80)
        print("Summary: Found mid-file imports that should be moved to the top of the file.")
        print("="*80)
    
    sys.exit(0)
SCRIPT

# Run the check
python /tmp/check_imports.py

Length of output: 1857


Found 7 files with mid-file imports that violate the import placement standard. Let me examine a few key examples to understand the context:


🏁 Script executed:

#!/bin/bash
# Check context around the mid-file imports in key files

echo "=== email_summarizer/setup.py (lines 20-30) ==="
sed -n '20,30p' backends/advanced/src/advanced_omi_backend/plugins/email_summarizer/setup.py

echo -e "\n=== services/memory/__init__.py (lines 19-27) ==="
sed -n '19,27p' backends/advanced/src/advanced_omi_backend/services/memory/__init__.py

echo -e "\n=== llm_providers.py (lines 31-40) ==="
sed -n '31,40p' backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py

echo -e "\n=== connect-omi.py (lines 15-30) ==="
sed -n '15,30p' extras/local-omi-bt/connect-omi.py

echo -e "\n=== service.py (lines 195-203) ==="
sed -n '195,203p' extras/speaker-recognition/src/simple_speaker_recognition/api/service.py

Length of output: 1448


@0xrushi I found 7 files with mid-file imports. Here's the breakdown:

Issues Found

🔴 Should be fixed (imports after logger setup):

  1. backends/advanced/src/advanced_omi_backend/services/memory/__init__.py (Line 25)

    • Import placed after logger initialization
    • Can be moved to the top
  2. backends/advanced/src/advanced_omi_backend/services/memory/providers/llm_providers.py (Line 37)

    • Import placed after logger initialization
    • Can be moved to the top
  3. backends/advanced/tests/test_obsidian_service.py (Line 9)

    • Import after pytest skip decorator
    • Should be at the top

🟡 Possibly intentional (path manipulation):

  1. backends/advanced/src/advanced_omi_backend/plugins/email_summarizer/setup.py (Line 26)

    • Import after sys.path.insert(0, str(project_root))
    • This pattern is used to add the repo root to Python path before importing setup_utils
    • Common for standalone scripts, but could be refactored
  2. extras/local-omi-bt/connect-omi.py (Line 28)

    • Import after sys.path.append(os.path.dirname(__file__))
    • Similar path manipulation before import

🟡 May have circular import reasons:

  1. extras/speaker-recognition/src/simple_speaker_recognition/__init__.py (Lines 24-25)

    • Imports after version/metadata setup
    • Could indicate circular import avoidance
  2. extras/speaker-recognition/src/simple_speaker_recognition/api/service.py (Line 199)

    • Routers imported after FastAPI app creation
    • Likely to avoid circular imports or ensure app initialization order

Recommendation

The files in the backends/advanced/src/advanced_omi_backend/services/memory/ directory and the test file should definitely be refactored to move imports to the top. The others may need investigation to determine if the mid-file placement is intentional for technical reasons (circular imports, path setup for scripts).

Would you like me to create an issue to track fixing these import placement violations?

…amp adjustments

- Added `audio_offset_seconds` to track cumulative audio duration for accurate timestamp adjustments across transcription sessions.
- Updated `store_final_result` method to adjust word and segment timestamps based on cumulative audio offset.
- Improved logging to reflect changes in audio offset after storing results.
- Modified Makefile and documentation to clarify test execution options, including new tags for slow and SDK tests, enhancing test organization and execution clarity.
…n tests

- Set `COMPOSE_PROJECT_NAME` for test containers to ensure consistent naming.
- Consolidated error messages in the `websocket_transcription_e2e_test.robot` file for clarity, improving readability and debugging.
- Added timeout handling for WebSocket closure in `AudioStreamClient` to prevent hanging and ensure clean disconnection.
- Updated integration tests to log the total chunks sent when closing audio streams, improving clarity on resource management during test teardown.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants