fix: deterministic Trino multi-connection routing with mandatory selection #146

Merged: cjimti merged 2 commits into main from fix/trino-multi-connection-routing on Feb 22, 2026

@cjimti cjimti commented Feb 22, 2026

Summary

Fixes non-deterministic Trino connection routing when multiple instances are configured (e.g., cassandra, elasticsearch, warehouse) and adds three follow-up improvements: connection descriptions, mandatory connection selection, and correct list_connections expansion.

Commit 1: Deterministic routing via multiserver.Manager

Previously, the platform created N separate single-client Toolkit instances that each registered the same tool names via RegisterTools(server). Due to the MCP SDK's last-write-wins semantics and Go map iteration order, one random toolkit "owned" all Trino tools per process start — causing queries intended for one catalog to hit a different one entirely.

Replaced N separate toolkits with a single toolkit backed by multiserver.Manager from mcp-trino. The manager handles connection routing internally based on the connection parameter in each tool call request.
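
The failure mode described above can be reproduced in a few lines. This is a minimal sketch, not the platform's code: the `toolkit` type and `registerAll` function are hypothetical, and a plain map stands in for the MCP server's tool table.

```go
package main

import "fmt"

// toolkit is a hypothetical stand-in for a single-client Trino toolkit.
type toolkit struct{ backend string }

// registerAll mimics last-write-wins registration: every toolkit registers
// the same tool name, so each write replaces the previous owner.
func registerAll(toolkits map[string]toolkit) map[string]string {
	owners := make(map[string]string)
	// Go does not specify map iteration order, so the final owner can
	// differ from one run to the next.
	for _, tk := range toolkits {
		owners["trino_query"] = tk.backend
	}
	return owners
}

func main() {
	owners := registerAll(map[string]toolkit{
		"warehouse":     {backend: "warehouse"},
		"elasticsearch": {backend: "elasticsearch"},
		"cassandra":     {backend: "cassandra"},
	})
	// Exactly one backend ends up owning the tool; which one is not fixed.
	fmt.Println(len(owners), owners["trino_query"])
}
```

Three toolkits go in, one owner comes out, and nothing in the code pins down which one.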

Commit 2: Connection descriptions, mandatory selection, list_connections fix

Three design gaps remained after the routing fix:

  1. No connection descriptions — LLMs had no way to know what each connection is for. Added a description field to Trino instance config so operators can document each connection's purpose (e.g., "Elasticsearch for transactional sales data").

  2. Connection silently defaults — When multiple connections exist and none is specified, the query silently routed to the default. Now a ConnectionRequiredMiddleware rejects the call with an error listing all available connections and their descriptions, so the LLM can choose explicitly.

  3. list_connections only showed 1 entry — After the multi-connection change, the platform creates 1 Trino toolkit (not N), but the handler iterated toolkitRegistry.All() creating 1 entry per toolkit. Now toolkits implementing the ConnectionLister interface are expanded into individual entries with descriptions and default markers.
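
For item 2, the rejection message might be assembled roughly as follows. This is a sketch under stated assumptions: `connectionRequiredError` and the message wording are hypothetical, and the only behavior taken from this PR is that every connection is listed with its description in alphabetical order.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
)

// connectionRequiredError builds the error returned when a tool call omits
// "connection". descriptions maps each connection name to its optional
// description. Hypothetical helper; the wording is an assumption.
func connectionRequiredError(descriptions map[string]string) error {
	names := make([]string, 0, len(descriptions))
	for name := range descriptions {
		names = append(names, name)
	}
	sort.Strings(names) // alphabetical, so the message is stable across runs

	var b strings.Builder
	b.WriteString("connection is required; available connections:")
	for _, name := range names {
		b.WriteString("\n  - " + name)
		if d := descriptions[name]; d != "" {
			b.WriteString(": " + d)
		}
	}
	return errors.New(b.String())
}

func main() {
	fmt.Println(connectionRequiredError(map[string]string{
		"warehouse":     "Production data warehouse for batch analytics and reporting",
		"elasticsearch": "Elasticsearch for transactional sales data",
		"cassandra":     "",
	}))
}
```

Sorting the names keeps the message identical from call to call, which matters when an LLM is expected to read it and retry.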

Changes

Registry & loader (pkg/registry/)

  • toolkit.go: Added AggregateToolkitFactory type for factories that receive all instance configs and produce a single toolkit
  • registry.go: Added aggregateFactories map, RegisterAggregateFactory(), GetAggregateFactory()
  • loader.go: Check for aggregate factories before per-instance creation. Extracted loadAggregate(), mergeInstanceConfigs(), loadKindFromMap(), mergeMapInstances()
  • factories.go: Added TrinoAggregateFactory using multiserver.Manager
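
The loader's precedence rule (aggregate factory first, per-instance factory otherwise) could look roughly like this. All types and signatures below are illustrative assumptions, and toolkits are reduced to strings to keep the sketch short.

```go
package main

import "fmt"

// instanceConfig is a hypothetical raw config for one instance.
type instanceConfig map[string]any

// perInstanceFactory builds one toolkit per configured instance.
type perInstanceFactory func(name string, cfg instanceConfig) (string, error)

// aggregateFactory receives every instance config and builds one toolkit.
type aggregateFactory func(defaultName string, instances map[string]instanceConfig) (string, error)

type registry struct {
	factories          map[string]perInstanceFactory
	aggregateFactories map[string]aggregateFactory
}

// loadKind prefers an aggregate factory when one is registered for the kind
// and only falls back to per-instance creation when none exists.
func (r *registry) loadKind(kind, defaultName string, instances map[string]instanceConfig) ([]string, error) {
	if agg, ok := r.aggregateFactories[kind]; ok {
		tk, err := agg(defaultName, instances)
		if err != nil {
			return nil, fmt.Errorf("aggregate factory for %q: %w", kind, err)
		}
		return []string{tk}, nil
	}
	factory, ok := r.factories[kind]
	if !ok {
		return nil, fmt.Errorf("no factory registered for kind %q", kind)
	}
	toolkits := make([]string, 0, len(instances))
	for name, cfg := range instances {
		tk, err := factory(name, cfg)
		if err != nil {
			return nil, fmt.Errorf("factory for %s/%s: %w", kind, name, err)
		}
		toolkits = append(toolkits, tk)
	}
	return toolkits, nil
}

func main() {
	r := &registry{
		factories: map[string]perInstanceFactory{},
		aggregateFactories: map[string]aggregateFactory{
			"trino": func(defaultName string, instances map[string]instanceConfig) (string, error) {
				return fmt.Sprintf("trino(multi, %d connections, default=%s)", len(instances), defaultName), nil
			},
		},
	}
	toolkits, err := r.loadKind("trino", "warehouse", map[string]instanceConfig{
		"warehouse": {"host": "warehouse.example.com"},
		"cassandra": {"host": "cass.example.com"},
	})
	fmt.Println(toolkits, err)
}
```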

Shared types (pkg/toolkit/)

  • connection.go: New dependency-free package with the ConnectionDetail type and the ConnectionLister interface (avoids an import cycle between registry and toolkits/trino)

Trino toolkit (pkg/toolkits/trino/)

  • config.go: Added MultiConfig, ParseMultiConfig(), parse description field
  • toolkit.go: Added NewMulti() with multiserver.Manager, ListConnections() implementing ConnectionLister, buildConnectionRequired() helper, connectionDescriptions map
  • connection_required.go: New ConnectionRequiredMiddleware — rejects tool calls with empty connection when multiple connections exist; error message lists all connections with descriptions sorted alphabetically. Uses reflection (FieldByName("Connection")) to extract the connection from any Trino tool input type
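
The reflection step can be illustrated with the standard `reflect` package. This is a sketch only: `connectionOf` and `queryInput` are hypothetical names, and the real middleware may treat pointers or non-string fields differently.

```go
package main

import (
	"fmt"
	"reflect"
)

// queryInput is a hypothetical tool input type with a Connection field.
type queryInput struct {
	SQL        string
	Connection string
}

// connectionOf returns the value of a string field named "Connection" on a
// struct or pointer to struct. It returns "" for nil, non-struct values, or
// structs without such a field.
func connectionOf(input any) string {
	v := reflect.ValueOf(input)
	for v.Kind() == reflect.Ptr {
		if v.IsNil() {
			return ""
		}
		v = v.Elem()
	}
	if v.Kind() != reflect.Struct {
		return ""
	}
	f := v.FieldByName("Connection")
	if !f.IsValid() || f.Kind() != reflect.String {
		return ""
	}
	return f.String()
}

func main() {
	fmt.Printf("%q\n", connectionOf(queryInput{SQL: "SELECT 1", Connection: "warehouse"}))
	fmt.Printf("%q\n", connectionOf(&queryInput{SQL: "SELECT 1"}))
}
```

Looking the field up by name lets one middleware cover every input type that carries a `Connection` field, at the cost of compile-time checking.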

Audit accuracy (pkg/middleware/)

  • mcp.go: Added extractConnectionArg() to parse the connection field from tool call JSON arguments. Audit logs now reflect the actual target connection, not the toolkit default
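
A minimal sketch of pulling the connection out of raw tool-call arguments with `encoding/json` follows. The function name `connectionFromArgs` and its fall-back-to-empty behavior are assumptions; the actual extractConnectionArg() may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// connectionFromArgs returns the "connection" value from raw JSON tool
// arguments. It returns "" when the input is empty, is not valid JSON, or
// has no string "connection" field, so the caller can fall back to the
// toolkit default for audit purposes.
func connectionFromArgs(raw []byte) string {
	if len(raw) == 0 {
		return ""
	}
	var args struct {
		Connection string `json:"connection"`
	}
	if err := json.Unmarshal(raw, &args); err != nil {
		return ""
	}
	return args.Connection
}

func main() {
	fmt.Printf("%q\n", connectionFromArgs([]byte(`{"sql":"SELECT 1","connection":"warehouse"}`)))
	fmt.Printf("%q\n", connectionFromArgs([]byte(`{"sql":"SELECT 1"}`)))
}
```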

Platform tools (pkg/platform/)

  • connections_tool.go: Added Description/IsDefault fields to connectionEntry. Handler checks for ConnectionLister interface to expand multi-connection toolkits into individual entries
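
The expansion in the handler amounts to a type assertion per toolkit. The sketch below borrows the names ConnectionDetail, ConnectionLister and connectionEntry from this PR, but their fields, the method signature, and the `toolkit` interface are assumptions made for illustration.

```go
package main

import "fmt"

// ConnectionDetail and ConnectionLister use the names from this PR; the
// fields and method signature are assumed.
type ConnectionDetail struct {
	Name        string
	Description string
	IsDefault   bool
}

type ConnectionLister interface {
	ListConnections() []ConnectionDetail
}

// toolkit is a hypothetical minimal toolkit interface.
type toolkit interface {
	Kind() string
	Name() string
}

type connectionEntry struct {
	Kind, Name, Description string
	IsDefault               bool
}

// listEntries expands toolkits that implement ConnectionLister into one
// entry per connection and emits a single entry for everything else.
func listEntries(toolkits []toolkit) []connectionEntry {
	var entries []connectionEntry
	for _, tk := range toolkits {
		if lister, ok := tk.(ConnectionLister); ok {
			for _, c := range lister.ListConnections() {
				entries = append(entries, connectionEntry{
					Kind: tk.Kind(), Name: c.Name,
					Description: c.Description, IsDefault: c.IsDefault,
				})
			}
			continue
		}
		entries = append(entries, connectionEntry{Kind: tk.Kind(), Name: tk.Name()})
	}
	return entries
}

// trinoMulti is a hypothetical multi-connection toolkit.
type trinoMulti struct{ conns []ConnectionDetail }

func (trinoMulti) Kind() string                          { return "trino" }
func (trinoMulti) Name() string                          { return "trino" }
func (t trinoMulti) ListConnections() []ConnectionDetail { return t.conns }

// plainToolkit is a hypothetical toolkit that does not list connections.
type plainToolkit struct{ kind, name string }

func (p plainToolkit) Kind() string { return p.kind }
func (p plainToolkit) Name() string { return p.name }

func main() {
	entries := listEntries([]toolkit{
		trinoMulti{conns: []ConnectionDetail{
			{Name: "cassandra"},
			{Name: "warehouse", Description: "Production data warehouse", IsDefault: true},
		}},
		plainToolkit{kind: "other", name: "primary"},
	})
	for _, e := range entries {
		fmt.Printf("%+v\n", e)
	}
}
```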

Config (configs/platform.yaml)

  • Added example description fields to Trino instances

How it works

Before (broken):
  Config: 3 Trino instances → 3 Toolkit objects → 3x RegisterTools (same names)
  Result: Last writer wins (random per restart), all calls hit one random backend

After (fixed):
  Config: 3 Trino instances → 1 Toolkit with multiserver.Manager → 1x RegisterTools
  Result: Each tool call reads "connection" param → Manager.Client(name) → correct backend
  Missing "connection" with >1 instance → error with available connections and descriptions
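
The "after" flow can be sketched as a lookup keyed by connection name. The `manager` type below is a hypothetical stand-in written for this example; it is not the multiserver.Manager API from mcp-trino, whose method set is not shown in this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// client is a hypothetical handle to one Trino backend.
type client struct{ host string }

// manager is an illustrative stand-in for a multi-connection manager.
type manager struct {
	clients     map[string]*client
	defaultName string
}

var errConnectionRequired = errors.New("connection is required when multiple connections are configured")

// clientFor resolves a tool call's "connection" argument to a client.
// An empty name is only accepted when a single connection exists.
func (m *manager) clientFor(name string) (*client, error) {
	if name == "" {
		if len(m.clients) > 1 {
			return nil, errConnectionRequired
		}
		name = m.defaultName
	}
	c, ok := m.clients[name]
	if !ok {
		return nil, fmt.Errorf("unknown connection %q", name)
	}
	return c, nil
}

func main() {
	m := &manager{
		defaultName: "warehouse",
		clients: map[string]*client{
			"warehouse": {host: "warehouse.example.com"},
			"cassandra": {host: "cass.example.com"},
		},
	}
	c, err := m.clientFor("cassandra")
	fmt.Println(c.host, err)
	_, err = m.clientFor("")
	fmt.Println(err)
}
```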

Config format

toolkits:
  trino:
    enabled: true
    default: warehouse
    instances:
      warehouse:
        host: warehouse.example.com
        catalog: hive
        description: "Production data warehouse for batch analytics and reporting"
      elasticsearch:
        host: es.example.com
        catalog: elasticsearch
        description: "Elasticsearch for transactional sales data"
      cassandra:
        host: cass.example.com
        catalog: cassandra

Test plan

  • make verify passes (fmt, test, lint, security, coverage, dead-code, mutation, release-check)
  • All new functions >80% coverage — NewMulti 100%, ListConnections 100%, ConnectionRequiredMiddleware.Before 100%, extractConnectionFromInput 100%, buildConnectionRequired 100%, ParseConfig (description) 100%, handleListConnections 91.7%
  • Aggregate factory loader tests verify per-instance factory is NOT called when aggregate is registered
  • Connection override middleware tests verify pc.Connection uses request arg, not toolkit default
  • ConnectionRequiredMiddleware tests: passes with connection set, rejects without connection when >1 exist, skips list_connections tool, error includes all connection names and descriptions
  • ListConnections tests: single mode returns 1 entry, multi mode returns all with descriptions, satisfies ConnectionLister interface
  • handleListConnections tests: ConnectionLister toolkit expanded into multiple entries with description/is_default; non-lister toolkit falls through to legacy path
  • Manual verification with multi-instance config:
    • list_connections → all 3 Trino connections appear with descriptions
    • trino_query without connection → error lists available connections with descriptions
    • trino_query with connection: "warehouse" → routes correctly
    • Single-instance config → connection param remains optional

…ager

Replace N separate single-client Trino toolkits with a single toolkit
backed by multiserver.Manager, eliminating non-deterministic connection
routing caused by last-write-wins tool registration and Go map iteration.

- Add AggregateToolkitFactory type for toolkit kinds that handle
  multi-connection routing internally
- Add NewMulti constructor using mcp-trino's multiserver.Manager
- Switch RegisterBuiltinFactories to use TrinoAggregateFactory for trino
- Extract connection arg from tool call request for accurate audit logging
- Refactor MCPToolCallMiddleware and LoadFromMap to stay within complexity limits

codecov bot commented Feb 22, 2026

Codecov Report

❌ Patch coverage is 96.40288% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.65%. Comparing base (c62d67f) to head (7e5fbee).
⚠️ Report is 1 commits behind head on main.

Files with missing lines          Patch %   Lines
pkg/middleware/mcp.go             84.00%    2 Missing and 2 partials ⚠️
pkg/toolkits/trino/toolkit.go     96.52%    2 Missing and 2 partials ⚠️
pkg/registry/factories.go         77.77%    1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #146      +/-   ##
==========================================
+ Coverage   90.47%   90.65%   +0.18%     
==========================================
  Files         112      113       +1     
  Lines       10800    11032     +232     
==========================================
+ Hits         9771    10001     +230     
- Misses        742      743       +1     
- Partials      287      288       +1     

…ons multi-connection support

When multiple Trino connections are configured, tool calls that omit
the connection parameter now return an error listing all available
connections with their descriptions, so the LLM can choose explicitly
instead of silently routing to the default.

The list_connections tool now expands multi-connection toolkits into
individual entries with descriptions and default markers via the new
ConnectionLister interface.

- Add Description field to Trino Config, parsed from platform YAML
- Add pkg/toolkit shared types package (ConnectionDetail, ConnectionLister)
- Implement ConnectionLister on Trino toolkit for single and multi mode
- Add ConnectionRequiredMiddleware rejecting empty connection when >1 exist
- Update list_connections handler to use ConnectionLister for expansion
- Add example description fields to configs/platform.yaml

@cjimti cjimti changed the title from "fix: deterministic Trino multi-connection routing" to "fix: deterministic Trino multi-connection routing with mandatory selection" on Feb 22, 2026
@cjimti cjimti merged commit 4dded54 into main Feb 22, 2026
8 checks passed
@cjimti cjimti deleted the fix/trino-multi-connection-routing branch February 22, 2026 03:05