
Open Health Data at Carolina

karafecho edited this page Oct 31, 2024 · 1 revision

URL: to be added

Description:

Open Health Data @ Carolina provides access to counts and frequencies (i.e., EHR prevalence) of conditions, procedures, drug exposures, and patient demographics, as well as the co-occurrence frequencies between them. Count and frequency data were derived from UNC Health's OMOP database using a five-year cohort (~6M patients, 2018 through 2022) of all UNC Health patients, including their inpatient and outpatient visit data. A count is the number of patients associated with a given concept, e.g., patients who were diagnosed with a condition, were exposed to a drug, or underwent a procedure. A frequency is the number of unique patients associated with the concept divided by the total number of patients in the dataset, i.e., the prevalence in the electronic health records. To protect patient privacy, all concepts and concept pairs with a count <= 10 were excluded, and counts were randomized using a Poisson distribution.
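The frequency definition and the small-count exclusion described above can be sketched as follows. This is a minimal illustration, not the service's actual pipeline: the function name `prevalence`, the concept names, and all numbers are hypothetical. The Poisson randomization step is omitted because its parameters are not stated here.

```python
from typing import Dict, Optional

MIN_COUNT = 10  # concepts with count <= 10 are excluded, per the description


def prevalence(count: int, total_patients: int) -> Optional[float]:
    """Return count / total_patients, or None when the count is suppressed."""
    if total_patients <= 0:
        raise ValueError("total_patients must be positive")
    if count <= MIN_COUNT:
        return None  # excluded to protect patient privacy
    return count / total_patients


# Hypothetical counts for illustration only; these are not real UNC Health data.
example_counts: Dict[str, int] = {
    "concept_a": 120_000,
    "concept_b": 10,   # at the threshold, so suppressed
    "concept_c": 11,   # just above the threshold, so retained
}
TOTAL_PATIENTS = 6_000_000  # roughly the cohort size quoted above

frequencies = {
    name: prevalence(count, TOTAL_PATIENTS)
    for name, count in example_counts.items()
}
```

In this sketch a suppressed concept maps to `None` so that it cannot be mistaken for a true frequency of zero.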

The count for each concept includes the patients from all of its descendant concepts. For example, the count for ibuprofen (ID 1177480) includes patients with Ibuprofen 600 MG Oral Tablet (ID 19019073), Ibuprofen 400 MG Oral Tablet (ID 19019072), Ibuprofen 20 MG/ML Oral Suspension (ID 19019050), etc. Clinical concepts (e.g., conditions, procedures, drugs) are coded by their standard concept ID in the OMOP Common Data Model.
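One way to picture the descendant roll-up is as a union of patient sets over a concept and everything beneath it. The sketch below assumes that each patient is counted once per concept even when they appear under several descendants. The concept IDs come from the example above, while the flat parent-to-children layout, the patient IDs, and the function name `rolled_up_patients` are hypothetical.

```python
from typing import Dict, List, Set

# Concept IDs are taken from the text; the flat hierarchy is a simplification.
children: Dict[int, List[int]] = {
    1177480: [19019073, 19019072, 19019050],  # ibuprofen -> specific products
}

# Hypothetical patient IDs recorded directly against each concept.
patients: Dict[int, Set[str]] = {
    1177480: {"p1"},
    19019073: {"p1", "p2"},
    19019072: {"p2", "p3"},
    19019050: {"p4"},
}


def rolled_up_patients(
    concept_id: int,
    children: Dict[int, List[int]],
    patients: Dict[int, Set[str]],
) -> Set[str]:
    """Collect unique patients for a concept and all of its descendants."""
    seen: Set[int] = set()
    result: Set[str] = set()
    stack = [concept_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against concepts reachable by more than one path
        seen.add(current)
        result |= patients.get(current, set())
        stack.extend(children.get(current, []))
    return result


ibuprofen_count = len(rolled_up_patients(1177480, children, patients))
```

Taking the union rather than summing per-concept counts keeps a patient who received two different ibuprofen products from being counted twice.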

Example edge (interpretation): to be added

Data source(s): Open Health Data @ Carolina exposes data from UNC Health's OMOP database on a five-year cohort (~6M patients over years 2018 through 2022) of all UNC Health patients, including their inpatient and outpatient visit data.

Key methodologic metrics: Open Health Data @ Carolina provides the following key metrics and statistical measures of association, captured in biolink:StudyResult structures:

  • Raw counts of each concept and concept pair co-occurrence - biolink:ConceptCountAnalysisResult
  • Chi-squared statistic (Bonferroni adjusted p-value) - biolink:ChiSquaredAnalysisResult
  • Relative frequency (99% confidence interval) - biolink:RelativeFrequencyAnalysisResult
  • Observed-expected frequency ratio (99% confidence interval) - biolink:ObservedExpectedFrequencyAnalysisResult
  • Odds ratio and log odds ratio (95% confidence interval)
  • Total sample size
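For readers unfamiliar with these measures, the sketch below shows how several of them are conventionally computed from a pair of concept counts, their co-occurrence count, and the total sample size. The formulas are the standard textbook ones (2x2 chi-squared test with one degree of freedom, Woolf interval for the log odds ratio, Bonferroni multiplication of the p-value). This page does not state which formulas the service uses, so treat them as assumptions. The function name `association_metrics` and its parameters are hypothetical.

```python
import math
from typing import Dict


def association_metrics(count_a: int, count_b: int, count_ab: int,
                        total: int, n_tests: int = 1) -> Dict[str, float]:
    """Textbook association measures for two concepts in a cohort of `total`."""
    # Cells of the 2x2 contingency table.
    a = count_ab                                # patients with both concepts
    b = count_a - count_ab                      # concept A only
    c = count_b - count_ab                      # concept B only
    d = total - count_a - count_b + count_ab    # neither concept
    if min(a, b, c, d) <= 0:
        raise ValueError("this sketch requires all four cells to be positive")

    # Observed-expected frequency ratio under independence of A and B.
    expected = count_a * count_b / total
    obs_exp_ratio = count_ab / expected

    # Odds ratio, log odds ratio, and a 95% Woolf confidence interval.
    odds_ratio = (a * d) / (b * c)
    log_or = math.log(odds_ratio)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z95 = 1.959964  # two-sided 95% normal quantile

    # Pearson chi-squared statistic (1 degree of freedom, no continuity correction).
    chi2 = total * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))    # chi-squared survival function, 1 df
    p_adjusted = min(1.0, p_value * n_tests)    # Bonferroni adjustment

    return {
        "obs_exp_ratio": obs_exp_ratio,
        "odds_ratio": odds_ratio,
        "log_odds_ratio": log_or,
        "log_or_ci_low": log_or - z95 * se,
        "log_or_ci_high": log_or + z95 * se,
        "chi2": chi2,
        "p_adjusted": p_adjusted,
    }


# Hypothetical counts for illustration only.
metrics = association_metrics(count_a=1000, count_b=2000, count_ab=400, total=100_000)
```

Here `n_tests` stands for the number of comparisons being corrected for; how the service chooses that number is not described on this page.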

Regulatory requirement(s) and/or licensing restriction(s): Service is compliant with all federal and institutional regulations.

Additional resources:

  • OHDSI
  • OMOP Common Data Model
  • Athena (OMOP vocabularies, search, concept relationships, concept hierarchy)
  • Atlas (OMOP vocabularies, search, concept relationships, concept hierarchy, concept sets)