
Translator Curated Query Service

karafecho edited this page Sep 11, 2024 · 9 revisions

Description

The Curated Query Service (CQS) was conceptualized by the Translator Clinical Data Committee, but initial development and implementation were conducted under the Standards and Reference Implementation (SRI) component of Translator. The CQS provides a simple mechanism through which KP teams, or any committee, working group, or external team, can apply their expertise and resources to specify how their data are to be used for inference. The CQS thus enables a "conservative ingest" paradigm, in which KP teams directly ingest knowledge sources, and perhaps compute on them, but rely on the CQS to generate desired inferences from this more foundational knowledge. For instance, the CQS uses its templates to generate "treats" predictions based on a set of rules developed by the contributing KPs who expose the primary knowledge source. The Clinical Trials provider, for example, exposes one-hop in_clinical_trials_for edges and directs the CQS to generate a predicted "treats" edge when a clinical trial meets certain criteria, such as being in phase 3 or 4.

What It Does

  1. An SRI Service that provides ARA-like capabilities:

    • generation of ‘predicted’ edges in response to creative queries - based on customizable inference rules

    • linking predictions to their supporting aux graphs

    • attachment of provenance metadata and scores to results

  2. Inference specifications are defined as TRAPI templates, which serve as config files for a custom reasoning service / workflow

    • The specifications include a required field specifying primary and aggregator knowledge sources (e.g., "resource_id": "infores:biothings-explorer", "resource_role": "primary_knowledge_source") and optional fields specifying, for example, workflow parameters such as an "allowlist"

  3. Scoring of individual workflow templates can be customized

What It Enables

  1. Supports manually defined, SMuRF- and SME-evaluated inference workflows contributed by any team, working group, or external group; each workflow is structured as a valid TRAPI query and serves as a CQS template

  2. Provides a simple mechanism through which KPs can apply their expertise and resources to specify how their data are to be used for inference

    • This can enable a "conservative ingest" paradigm, in which KPs ingest what sources directly assert and rely on the CQS to generate desired inferences based on this more foundational knowledge.
    • For example, the CQS is supporting the Biolink Model's "treats" refactoring effort. A KP (e.g., Multiomics Clinical Trials KP) can report the precise entity relationships reported by a given source (e.g., clinicaltrials.gov, biolink:in_clinical_trials_for), and the CQS can then generate a "treats" edge based on a set of rules, defined by the KP team, for when such edges can be elevated to "treats" status. The CQS points to the original edge as an aux graph. In this example, the CQS-predicted "treats" edge carries biolink:knowledge_level prediction and biolink:agent_type computational_model, while the primary edge from the KP carries biolink:knowledge_level knowledge_assertion, biolink:agent_type manual_agent, and biolink:max_research_phase clinical_trial_phase_4.

  3. Allows KP teams such as OpenPredict or Multiomics to avoid dealing with ARA functions such as aux graphs, ARS registration, merging, scoring, normalizing, and adding literature co-occurrence

  4. Facilitates consistent specification and implementation of inference rules, by providing a centralized and transparent place to define, align, and collaborate on inference rules
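
The "treats" elevation described in item 2 can be sketched in Python as follows. This is a minimal illustration, not the CQS implementation: the function name, the flat edge dictionaries, the `support_` naming scheme, and the `clinical_trial_phase_3` value are all assumptions made for the example, and real TRAPI edges carry these properties as attributes rather than as top-level keys.

```python
from typing import Optional

# Assumption: phases that qualify for elevation. The text gives "phase 3 or 4"
# as an example criterion; the actual rule is defined by the KP team.
QUALIFYING_PHASES = {"clinical_trial_phase_3", "clinical_trial_phase_4"}


def elevate_to_treats(edge_id: str, edge: dict) -> Optional[dict]:
    """Return a predicted "treats" edge and its aux graph, or None if no rule fires."""
    if edge.get("predicate") != "biolink:in_clinical_trials_for":
        return None
    if edge.get("max_research_phase") not in QUALIFYING_PHASES:
        return None

    aux_graph_id = f"support_{edge_id}"  # hypothetical naming scheme
    predicted = {
        "subject": edge["subject"],
        "predicate": "biolink:treats",
        "object": edge["object"],
        "knowledge_level": "prediction",
        "agent_type": "computational_model",
        "support_graphs": [aux_graph_id],
    }
    # The aux graph points back to the original KP edge that supports the prediction.
    return {"edge": predicted, "aux_graphs": {aux_graph_id: {"edges": [edge_id]}}}


# Usage with placeholder identifiers (not real CURIEs).
kp_edge = {
    "subject": "EXAMPLE:drug",
    "predicate": "biolink:in_clinical_trials_for",
    "object": "EXAMPLE:disease",
    "knowledge_level": "knowledge_assertion",
    "agent_type": "manual_agent",
    "max_research_phase": "clinical_trial_phase_4",
}
result = elevate_to_treats("e0", kp_edge)
```

The KP edge is left untouched; the prediction is a separate edge that only references it, which mirrors the split between asserted and inferred knowledge described above.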

How to Submit a New Template into the Translator Pipeline

  1. Develop a set of "rules" specifying when a particular KP can contribute to an inferred MVP query.
  2. Apply the rules in (1) via a valid TRAPI query that can serve as a CQS template.
    • Include required specifications such as a field specifying primary and aggregator knowledge sources (see example template).
    • Include an "id" field for n0 in the form of an empty array.
    • Include any additional specifications such as attribute constraints and workflow parameters such as an "allowlist".
  3. Test the CQS template by direct query of the Workflow Runner.
  4. Create a branch in the CQS repo.
    • Create a new template folder within CQS/templates, following the nomenclature specified below.
    • Within that folder, add a thoroughly descriptive README with a POC and select CURIEs to be used for development and testing. The CURIEs should be associated with test assets that the POC has contributed to the test assets repo: https://github.com/NCATSTranslator/Tests.
    • Add a new CQS template structured as a valid TRAPI query.
    • Create a PR.
  5. The new CQS template will then be deployed to DEV, thus entering the Translator pipeline.
  6. After the CQS template is deployed to CI, it will be picked up by the Information Radiator for automated testing. The POC for a given CQS template is responsible for monitoring the testing results.
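
Before opening a PR, the requirements in step 2 can be checked locally. The following Python sketch is an assumption-laden pre-check, not a tool provided by the CQS repo: `check_template` is a hypothetical helper, the node key is assumed to be `"ids"` (the TRAPI 1.x name for the identifier array that the step above calls "id"), and because the exact location of the knowledge-source fields within a template is not stated here, the check simply searches the serialized template for the role names.

```python
import json

# Assumption: TRAPI 1.x query nodes name their identifier array "ids".
ID_KEY = "ids"

# Roles the template is expected to mention somewhere.
REQUIRED_ROLES = ("primary_knowledge_source", "aggregator_knowledge_source")


def check_template(template: dict) -> list:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []

    nodes = template.get("message", {}).get("query_graph", {}).get("nodes", {})
    n0 = nodes.get("n0")
    if n0 is None:
        problems.append("query graph has no node n0")
    elif n0.get(ID_KEY) != []:
        problems.append(f"n0 must include '{ID_KEY}' as an empty array")

    # Crude text search, since the field location is an unknown in this sketch.
    serialized = json.dumps(template)
    for role in REQUIRED_ROLES:
        if role not in serialized:
            problems.append(f"no {role} specified")

    return problems
```

A template could be loaded with `json.load` and passed to `check_template`; any returned messages indicate items to fix before testing against the Workflow Runner. This does not replace validation against the TRAPI schema.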

See https://github.com/NCATSTranslator/OperationsAndWorkflows/tree/main/schema for valid TRAPI operations and workflows.

The nomenclature for CQS templates is as follows:

Human-readable format: MVP# Template # (infores or other short but descriptive name that captures the intent of the template). Example: MVP1 Template 3 (openpredict)

GitHub format: mvp#-template#-infores or mvp#-template#-descriptive-name

Note that MVP2 templates should be named as follows: MVP2-up-gene, MVP2-down-gene, MVP2-up-chemical, MVP2-down-chemical.
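
The two naming formats can be generated from the same three inputs. The helpers below are a hypothetical Python sketch that assumes "#" stands for a plain integer and that descriptive names are lowercased and hyphenated for GitHub; they do not cover the MVP2 convention noted above.

```python
import re


def human_name(mvp: int, template: int, name: str) -> str:
    """Human-readable format: MVP# Template # (name)."""
    return f"MVP{mvp} Template {template} ({name})"


def github_name(mvp: int, template: int, name: str) -> str:
    """GitHub format: mvp#-template#-name, with the name reduced to a lowercase slug."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return f"mvp{mvp}-template{template}-{slug}"


print(human_name(1, 3, "openpredict"))   # MVP1 Template 3 (openpredict)
print(github_name(1, 3, "openpredict"))  # mvp1-template3-openpredict
```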

Example JSON query

Team contacts

Jason Reilly (Exposures Provider)

Kara Fecho (Exposures Provider, SRI)

Max Wang (Exposures Provider, Ranking Agent, SRI)

Source code

https://github.com/TranslatorSRI/CQS/
