A comprehensive healthcare data platform combining enterprise data management with analytical capabilities, designed to support population health initiatives for medium to large healthcare organizations.
This project implements a hybrid data architecture that merges the reliability of Inmon's Enterprise Data Warehouse (EDW) approach with the analytical power of Kimball's dimensional modeling. The foundation supports both operational healthcare data management and sophisticated population health analytics.
The platform enables healthcare organizations to:
- Track and analyze patient health outcomes across populations
- Monitor and close care gaps systematically
- Calculate and report electronic Clinical Quality Measures (eCQMs)
- Support value-based care initiatives
- Enable risk stratification and patient cohort analysis
- Maintain comprehensive patient and provider histories
- Generate regulatory and quality improvement reports
Our architecture follows a two-phase approach that separates operational data storage from analytical processing:
The EDW serves as the system of record, implemented in Third Normal Form (3NF) for data integrity and operational efficiency. Key components include:
- Core Entities
  - Patient demographics and history
  - Provider and organization hierarchies
  - Address and location management
- Clinical Data
  - Encounters and visits
  - Diagnoses (ICD-10, SNOMED)
  - Procedures (CPT, HCPCS)
  - Observations and lab results (LOINC)
  - Medications and prescriptions (RxNorm, NDC)
- Supporting Data
  - Insurance and coverage information
  - Care gap tracking
  - Social Determinants of Health (SDOH)
  - Patient attribution and program enrollment
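To make the 3NF layout more concrete, here is a minimal sketch of how a few of the entities above might be normalized. The table and column names are hypothetical and are not taken from `phm-edw-ddl.sql`; treat it as an illustration of the modeling style rather than the project's actual DDL.

```sql
-- Illustrative 3NF sketch; all table and column names are assumptions.
CREATE SCHEMA IF NOT EXISTS phm_edw;

-- One row per patient; demographics live here and nowhere else.
CREATE TABLE IF NOT EXISTS phm_edw.patient (
    patient_id    BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    mrn           VARCHAR(50)  NOT NULL UNIQUE,   -- medical record number
    first_name    VARCHAR(100) NOT NULL,
    last_name     VARCHAR(100) NOT NULL,
    date_of_birth DATE         NOT NULL,
    created_at    TIMESTAMPTZ  NOT NULL DEFAULT now()
);

-- One row per encounter, linked to the patient by foreign key.
CREATE TABLE IF NOT EXISTS phm_edw.encounter (
    encounter_id   BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    patient_id     BIGINT      NOT NULL REFERENCES phm_edw.patient (patient_id),
    encounter_type VARCHAR(50) NOT NULL,
    start_ts       TIMESTAMPTZ NOT NULL,
    end_ts         TIMESTAMPTZ,
    CHECK (end_ts IS NULL OR end_ts >= start_ts)
);

-- Diagnoses are a separate relation, so an encounter can carry many codes
-- without repeating encounter attributes.
CREATE TABLE IF NOT EXISTS phm_edw.encounter_diagnosis (
    encounter_id   BIGINT      NOT NULL REFERENCES phm_edw.encounter (encounter_id),
    code_system    VARCHAR(20) NOT NULL CHECK (code_system IN ('ICD-10', 'SNOMED')),
    diagnosis_code VARCHAR(20) NOT NULL,
    is_primary     BOOLEAN     NOT NULL DEFAULT false,
    PRIMARY KEY (encounter_id, code_system, diagnosis_code)
);
```

Keeping each fact in exactly one table is what lets the EDW act as the system of record; the foreign keys and check constraints enforce integrity at write time.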
The dimensional model optimizes for analytical queries and reporting:
- Fact Tables
  - Encounters (visits and admissions)
  - Diagnoses (with support for both acute and chronic conditions)
  - Procedures performed
  - Medication orders
  - Clinical observations
  - Care gaps
  - Quality measure results
- Dimension Tables
  - Date (with fiscal period support)
  - Patient (Type 2 SCD)
  - Provider (Type 2 SCD)
  - Organization (Type 2 SCD)
  - Condition (ICD-10/SNOMED)
  - Procedure (CPT/HCPCS)
  - Medication
  - Quality Measures
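As a rough illustration of the dimensional layout, the sketch below defines a date dimension with fiscal periods, a Type 2 patient dimension, and an encounter fact table, followed by a typical aggregate query. All names and columns are hypothetical and may differ from what `phm-kimbal-ddl.sql` creates.

```sql
-- Illustrative star schema sketch; names and columns are assumptions.
CREATE SCHEMA IF NOT EXISTS phm_star;

CREATE TABLE IF NOT EXISTS phm_star.dim_date (
    date_key       INTEGER  PRIMARY KEY,           -- e.g. 20240131
    full_date      DATE     NOT NULL UNIQUE,
    fiscal_year    SMALLINT NOT NULL,
    fiscal_quarter SMALLINT NOT NULL CHECK (fiscal_quarter BETWEEN 1 AND 4)
);

-- Type 2 SCD: each change to a patient produces a new row with its own
-- surrogate key, so facts keep pointing at the version valid at the time.
CREATE TABLE IF NOT EXISTS phm_star.dim_patient (
    patient_key          BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    patient_id           BIGINT      NOT NULL,     -- business key from the EDW
    gender               VARCHAR(20),
    postal_code          VARCHAR(10),
    effective_start_date DATE        NOT NULL,
    effective_end_date   DATE        NOT NULL DEFAULT DATE '9999-12-31',
    is_current           BOOLEAN     NOT NULL DEFAULT true
);

CREATE TABLE IF NOT EXISTS phm_star.fact_encounter (
    encounter_key       BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    date_key            INTEGER NOT NULL REFERENCES phm_star.dim_date (date_key),
    patient_key         BIGINT  NOT NULL REFERENCES phm_star.dim_patient (patient_key),
    length_of_stay_days INTEGER
);

-- Example analytical query: encounter volume per fiscal quarter.
SELECT d.fiscal_year,
       d.fiscal_quarter,
       count(*) AS encounter_count
FROM phm_star.fact_encounter AS f
JOIN phm_star.dim_date AS d ON d.date_key = f.date_key
GROUP BY d.fiscal_year, d.fiscal_quarter
ORDER BY d.fiscal_year, d.fiscal_quarter;
```

The query touches only one fact table and one small dimension, which is the access pattern the dimensional model is meant to make cheap.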
- PostgreSQL 12+ (primary implementation)
- Minimum storage allocation for 1.5M patient records
- Support for concurrent analytical queries
- Partitioning capability for large fact tables
The system includes an ETL framework that:
- Refreshes dimensional data using slowly changing dimension (SCD) Type 2 processing to track historical changes
- Supports both full and incremental loading patterns
- Maintains data lineage and audit trails
- Runs 8 times daily for near-real-time analytics
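The SCD Type 2 refresh can be sketched as a two-step "expire, then insert" pattern. The dimension table, staging table, and tracked attribute below are hypothetical, and the project's own ETL scripts may be structured differently. The sketch avoids `MERGE` because that statement is not available in PostgreSQL 12.

```sql
-- Illustrative SCD Type 2 refresh; table and column names are assumptions.
CREATE SCHEMA IF NOT EXISTS phm_star;

CREATE TABLE IF NOT EXISTS phm_star.dim_provider (
    provider_key    BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    provider_id     BIGINT      NOT NULL,          -- business key
    specialty       TEXT        NOT NULL,          -- tracked attribute
    effective_start TIMESTAMPTZ NOT NULL DEFAULT now(),
    effective_end   TIMESTAMPTZ,                   -- NULL while the row is current
    is_current      BOOLEAN     NOT NULL DEFAULT true
);

-- Guarantee at most one current row per business key.
CREATE UNIQUE INDEX IF NOT EXISTS dim_provider_current_uq
    ON phm_star.dim_provider (provider_id) WHERE is_current;

-- Hypothetical staging table holding the latest snapshot from the EDW.
CREATE TEMP TABLE stg_provider (
    provider_id BIGINT PRIMARY KEY,
    specialty   TEXT NOT NULL
);

BEGIN;

-- Step 1: expire current rows whose tracked attribute changed.
UPDATE phm_star.dim_provider AS d
SET effective_end = now(),
    is_current    = false
FROM stg_provider AS s
WHERE d.provider_id = s.provider_id
  AND d.is_current
  AND d.specialty IS DISTINCT FROM s.specialty;

-- Step 2: insert a new current row for every staged provider that no longer
-- has one. This covers both changed providers (expired in step 1) and
-- providers seen for the first time.
INSERT INTO phm_star.dim_provider (provider_id, specialty)
SELECT s.provider_id, s.specialty
FROM stg_provider AS s
LEFT JOIN phm_star.dim_provider AS d
       ON d.provider_id = s.provider_id
      AND d.is_current
WHERE d.provider_id IS NULL;

COMMIT;
```

Because `now()` returns the transaction start time, the expired row's `effective_end` and the new row's `effective_start` match exactly, leaving no gap or overlap in the history.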
- Table partitioning for large fact tables
- Efficient indexing strategies
- Optimized SCD Type 2 processing for dimension updates
- Support for both full refresh and incremental loading patterns
- Query optimization for common analytical patterns
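As one possible way to combine partitioning and indexing on a large fact table, the sketch below uses declarative range partitioning by date. The table name, columns, and yearly partition boundaries are assumptions made for illustration; the right partition granularity depends on data volume and query patterns.

```sql
-- Illustrative partitioned fact table; names and boundaries are assumptions.
CREATE SCHEMA IF NOT EXISTS phm_star;

CREATE TABLE IF NOT EXISTS phm_star.fact_observation (
    observation_id   BIGINT      NOT NULL,
    patient_key      BIGINT      NOT NULL,
    observation_date DATE        NOT NULL,
    loinc_code       VARCHAR(20) NOT NULL,
    value_numeric    NUMERIC,
    -- A primary key on a partitioned table must include the partition key.
    PRIMARY KEY (observation_id, observation_date)
) PARTITION BY RANGE (observation_date);

-- One partition per calendar year; the upper bound is exclusive.
CREATE TABLE IF NOT EXISTS phm_star.fact_observation_2024
    PARTITION OF phm_star.fact_observation
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

CREATE TABLE IF NOT EXISTS phm_star.fact_observation_2025
    PARTITION OF phm_star.fact_observation
    FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');

-- An index created on the parent is created on every partition as well.
CREATE INDEX IF NOT EXISTS fact_observation_patient_date_idx
    ON phm_star.fact_observation (patient_key, observation_date);

-- With a date filter, the plan should only touch the matching partition.
EXPLAIN
SELECT count(*)
FROM phm_star.fact_observation
WHERE observation_date >= DATE '2025-01-01'
  AND observation_date <  DATE '2025-04-01';
```

Filtering on the partition key lets the planner prune partitions outside the requested range, which is the main benefit for date-bounded analytical queries.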
- Role-based access control (RBAC)
- PHI encryption at rest
- Audit logging of data access and changes
- HIPAA-compliant data handling procedures
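For the role-based access control item, a minimal sketch might separate read-only analysts from the ETL process. The role names are hypothetical, and this addresses only database privileges; encryption at rest, audit logging, and HIPAA procedures need to be handled separately.

```sql
-- Illustrative RBAC setup; role names are assumptions.
CREATE SCHEMA IF NOT EXISTS phm_edw;
CREATE SCHEMA IF NOT EXISTS phm_star;

-- CREATE ROLE has no IF NOT EXISTS clause, so guard it explicitly.
DO $$
BEGIN
    IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = 'phm_analyst') THEN
        CREATE ROLE phm_analyst NOLOGIN;
    END IF;
    IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = 'phm_etl') THEN
        CREATE ROLE phm_etl NOLOGIN;
    END IF;
END
$$;

-- Analysts: read-only access to the star schema, no access to the EDW.
GRANT USAGE ON SCHEMA phm_star TO phm_analyst;
GRANT SELECT ON ALL TABLES IN SCHEMA phm_star TO phm_analyst;
-- Applies to tables created later by the role that runs this statement.
ALTER DEFAULT PRIVILEGES IN SCHEMA phm_star
    GRANT SELECT ON TABLES TO phm_analyst;

-- ETL: reads from the EDW and writes to the star schema.
GRANT USAGE ON SCHEMA phm_edw, phm_star TO phm_etl;
GRANT SELECT ON ALL TABLES IN SCHEMA phm_edw TO phm_etl;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA phm_star TO phm_etl;

-- Individual login users would then be granted membership in a group role:
-- GRANT phm_analyst TO some_login_user;  -- hypothetical user name
```

Granting privileges to group roles rather than to individual users keeps access reviews simple: membership in a role is the only thing that changes when staff join or leave.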
- Database Setup
  ```sql
  -- Create schemas
  CREATE SCHEMA phm_edw;
  CREATE SCHEMA phm_star;
  ```
- Create EDW Tables
  - Execute `phm-edw-ddl.sql` to create the 3NF structure
  - Review and configure security settings
- Create Star Schema
  - Execute `phm-kimbal-ddl.sql` to create dimensional tables
  - Configure table partitioning if needed
- Initialize ETL Process
  - Configure ETL parameters in `ETL_Refresh_Full.sql`
  - Set up scheduling for 8x daily refresh
  - Test incremental load patterns
- Monitor ETL execution logs
- Review table statistics and update as needed
- Manage table partitions
- Archive historical data as needed
- Track ETL execution times
- Monitor fact table growth
- Analyze query performance patterns
- Review and update statistics regularly
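Several of the maintenance and monitoring tasks above can be approximated with queries like the following. The first uses PostgreSQL's built-in statistics view. The second assumes a hypothetical `etl_run_log` table, which is not confirmed to be part of this project.

```sql
-- Row counts, on-disk size, and last ANALYZE time for each table.
SELECT schemaname,
       relname,
       n_live_tup,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
       last_analyze,
       last_autoanalyze
FROM pg_stat_user_tables
WHERE schemaname IN ('phm_edw', 'phm_star')
ORDER BY pg_total_relation_size(relid) DESC;

-- Refresh planner statistics after a large load, for example:
-- ANALYZE phm_star.fact_encounter;  -- hypothetical table name

-- Hypothetical ETL run log used to track execution times.
CREATE SCHEMA IF NOT EXISTS phm_star;

CREATE TABLE IF NOT EXISTS phm_star.etl_run_log (
    run_id      BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    job_name    TEXT        NOT NULL,
    started_at  TIMESTAMPTZ NOT NULL,
    finished_at TIMESTAMPTZ,
    status      TEXT        NOT NULL
                CHECK (status IN ('running', 'succeeded', 'failed'))
);

-- Average and worst-case duration per job over the last 7 days.
SELECT job_name,
       count(*)                       AS runs,
       avg(finished_at - started_at)  AS avg_duration,
       max(finished_at - started_at)  AS max_duration
FROM phm_star.etl_run_log
WHERE status = 'succeeded'
  AND started_at >= now() - INTERVAL '7 days'
GROUP BY job_name
ORDER BY avg_duration DESC;
```

Watching `total_size` over time shows fact table growth, and a rising `avg_duration` is an early sign that a refresh may no longer fit its slot in the 8x daily schedule.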
We welcome contributions to improve the PHM Database Foundation. Please:
- Fork the repository
- Create a feature branch
- Submit a pull request with a detailed description
- Ensure all existing tests pass
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
For detailed technical documentation, please refer to: