Semantic Layer Roadmap #35

@cjimti

Tracking issue for semantic layer and MCP Apps enhancements.

Audience

Primary: Management and Executives

  • Furthest from raw data, highest need for self-service
  • Don't know (and shouldn't need to know) SQL, tables, columns, lineage
  • Need accurate, insightful, visually compelling answers to business questions

Secondary: Data Analysts and Technicians

  • Benefit from the same tools with ability to drill into technical details
  • Technical metadata builds credibility ("this isn't magic")

Design Principles

  1. Insights over mechanics - Lead with the answer or visualization; keep technical details in the background
  2. Platform owns presentation - Toolkits (mcp-trino, mcp-datahub, mcp-s3) are building blocks; mcp-data-platform crafts the user experience
  3. MCP Apps live here - Apps need cross-injection, semantic enrichment, and a unified design language
  4. Enrichment over tools - Prefer middleware that enhances existing tool responses over adding new tools
  5. Credibility through transparency - Subtle technical details (query time, row count) build trust without overwhelming
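
Principle 4 can be sketched as a wrapper around an existing tool handler. This is a minimal illustration in Python, not the platform's actual code: `with_enrichment`, `describe_table`, `lookup_context`, and the `semantic_context` key are hypothetical names, and the catalog is an in-memory stand-in for a DataHub lookup.

```python
from typing import Any, Callable, Dict

# Hypothetical shapes: a tool handler takes arguments and returns a response dict.
ToolResponse = Dict[str, Any]
ToolHandler = Callable[[Dict[str, Any]], ToolResponse]
ContextLookup = Callable[[str], Dict[str, Any]]


def with_enrichment(handler: ToolHandler, lookup: ContextLookup) -> ToolHandler:
    """Wrap an existing handler so its response carries semantic context."""

    def enriched(args: Dict[str, Any]) -> ToolResponse:
        response = dict(handler(args))  # copy; the tool's own output is not mutated
        table = args.get("table")
        if table:
            context = lookup(table)
            if context:
                response["semantic_context"] = context
        return response

    return enriched


# Stand-in for a toolkit tool (assumed; e.g. something a Trino toolkit might expose).
def describe_table(args: Dict[str, Any]) -> ToolResponse:
    return {"table": args["table"], "columns": ["order_id", "amount"]}


# Stand-in for a catalog lookup; the entry below is invented for illustration.
CATALOG = {"sales.orders": {"description": "One row per customer order", "owner": "finance"}}


def lookup_context(table: str) -> Dict[str, Any]:
    return CATALOG.get(table, {})


describe_table_enriched = with_enrichment(describe_table, lookup_context)
result = describe_table_enriched({"table": "sales.orders"})
print(result["semantic_context"]["description"])
```

The tool list stays the same size: callers still invoke the same tool, and the extra context arrives inside the response it already returns.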

Architecture

graph TB
    subgraph Platform["mcp-data-platform"]
        subgraph Apps["MCP Apps Layer"]
            QR[Query Results]
            LV[Lineage Viz]
            CB[Catalog Browser]
        end
        
        subgraph Enrichment["Semantic Enrichment"]
            CI[Cross-Injection]
            BC[Business Context]
            DQ[Data Quality]
        end
        
        subgraph Toolkits["Composed Toolkits"]
            Trino[mcp-trino]
            DataHub[mcp-datahub]
            S3[mcp-s3]
        end
    end
    
    User([Executive / Analyst]) --> Apps
    Apps --> Enrichment
    Enrichment --> Toolkits
    Toolkits --> Data[(Data Sources)]
    
    style Apps fill:#3b82f6,color:#fff
    style Enrichment fill:#8b5cf6,color:#fff
    style Toolkits fill:#6b7280,color:#fff
Toolkits expose data. Platform presents insights.

Current State

  • ✅ Semantic enrichment middleware (adds DataHub context to Trino responses)
  • ✅ Cross-injection pattern (Trino ↔ DataHub bidirectional context)
  • ✅ Query Results MCP App
    • Auto-chart for numeric data (insight first)
    • Interactive table with sort/filter/search
    • Smart titles ("Revenue by Product" not "column1 by column2")
    • Subtle footer stats (credibility without noise)
    • CSV export
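
The "insight first" behavior of the Query Results app (auto-chart plus smart titles) could look roughly like the sketch below. It is an assumption about one reasonable heuristic, not the app's real logic: `pick_chart`, `smart_title`, and the returned chart spec are hypothetical.

```python
from numbers import Number
from typing import Any, List, Optional, Sequence


def humanize(column: str) -> str:
    """Turn a column name like 'product_name' into 'Product Name'."""
    return column.replace("_", " ").strip().title()


def smart_title(dimension: str, measure: str) -> str:
    return f"{humanize(measure)} by {humanize(dimension)}"


def is_numeric(values: Sequence[Any]) -> bool:
    """True when every non-null value is a number (bools excluded)."""
    present = [v for v in values if v is not None]
    return bool(present) and all(
        isinstance(v, Number) and not isinstance(v, bool) for v in present
    )


def pick_chart(columns: List[str], rows: List[Sequence[Any]]) -> Optional[dict]:
    """Return a chart spec when the result has a label column and a numeric column."""
    if not rows:
        return None
    by_column = list(zip(*rows))
    numeric = [c for c, vals in zip(columns, by_column) if is_numeric(vals)]
    labels = [c for c in columns if c not in numeric]
    if not numeric or not labels:
        return None  # nothing sensible to chart; fall back to the table view
    return {
        "type": "bar",
        "x": labels[0],
        "y": numeric[0],
        "title": smart_title(labels[0], numeric[0]),
    }


chart = pick_chart(["product", "revenue"], [("Widget", 120.0), ("Gadget", 80.5)])
print(chart["title"])
```

Returning `None` keeps the table as the default view, so a result that cannot be charted still renders something useful.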

Potential Enhancements (prioritize as needed)

MCP Apps

  • Catalog Browser - Searchable data catalog for discovery ("what data do we have about customers?")
  • Lineage Visualization - Interactive graph showing data flow
  • Quality Dashboard - Data freshness, quality scores at a glance
  • Schema Explorer - Visual schema with business descriptions

Middleware Enrichment

  • Include data quality scores in responses
  • Add deprecation warnings prominently
  • Surface freshness info (last updated)
  • Inline glossary term definitions
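
The four enrichment ideas above could share one small payload attached to a tool response. The sketch below is illustrative only: the `Enrichment` fields, the 0.0 to 1.0 quality scale, and the rendered wording are assumptions, not an existing schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class Enrichment:
    quality_score: Optional[float] = None  # assumed scale: 0.0 to 1.0
    deprecated: bool = False
    deprecation_note: str = ""
    last_updated: Optional[datetime] = None
    glossary: Dict[str, str] = field(default_factory=dict)


def render_notes(e: Enrichment, now: datetime) -> List[str]:
    """Build human-readable lines, with any deprecation warning first."""
    notes: List[str] = []
    if e.deprecated:
        notes.append(f"WARNING: deprecated. {e.deprecation_note}".strip())
    if e.quality_score is not None:
        notes.append(f"Data quality: {e.quality_score:.0%}")
    if e.last_updated is not None:
        age_hours = int((now - e.last_updated).total_seconds() // 3600)
        notes.append(f"Last updated: {age_hours}h ago")
    for term, definition in sorted(e.glossary.items()):
        notes.append(f"{term}: {definition}")
    return notes


# Invented example values, for illustration only.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
example = Enrichment(
    quality_score=0.92,
    deprecated=True,
    deprecation_note="Use sales.orders_v2 instead.",
    last_updated=datetime(2024, 1, 2, 6, 0, tzinfo=timezone.utc),
    glossary={"ARR": "Annual recurring revenue"},
)
notes = render_notes(example, now)
for line in notes:
    print(line)
```

Putting the deprecation line first is one way to make the warning prominent without adding a separate tool or response type.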

Query Results App Improvements

  • Multiple chart support (compare metrics)
  • Sparklines for trend data
  • Conditional formatting (highlight outliers)
  • Saved views / bookmarks
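
Two of these items, sparklines and outlier highlighting, reduce to small pure functions. The sketch below is one possible approach under stated assumptions: Unicode block characters for the sparkline and a z-score cutoff of 2.0 for outliers are illustrative choices, and `sparkline` and `outlier_flags` are hypothetical names.

```python
from statistics import mean, pstdev
from typing import List, Sequence

BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values: Sequence[float]) -> str:
    """Map each value to one of eight block heights, scaled to the series range."""
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return BLOCKS[0] * len(values)  # flat series
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[round((v - lo) / span * top)] for v in values)


def outlier_flags(values: Sequence[float], threshold: float = 2.0) -> List[bool]:
    """Flag values more than `threshold` standard deviations from the mean."""
    if len(values) < 2:
        return [False] * len(values)
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [False] * len(values)
    return [abs(v - mu) / sigma > threshold for v in values]


daily_orders = [10, 11, 9, 10, 12, 10, 50]  # invented sample data
print(sparkline(daily_orders))
print(outlier_flags(daily_orders))
```

A z-score is sensitive to the outliers it is trying to detect, so a median-based rule may suit small or skewed result sets better; the cutoff would likely need tuning per metric.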

Dependencies

Non-Goals

  • Adding many new tools (tool sprawl)
  • Foregrounding technical metadata for end users (it stays in the background, per Design Principles 1 and 5)
  • Duplicating DataHub/Trino functionality

Labels: enhancement
