Suppose you have constructed a biolink-compliant knowledge graph and want to deploy it as a TRAPI endpoint with limited fuss. Plater is a web server that automatically exposes a Neo4j instance through TRAPI-compliant endpoints, and it brings several tools together to achieve this. It uses Reasoner Pydantic models for frontend validation and Reasoner transpiler for transforming TRAPI to and from Cypher and querying the Neo4j backend. The Neo4j database can be populated using KGX upload, which can consume numerous graph input formats. By pointing Plater at Neo4j we can easily stand up a Knowledge Provider that provides the "lookup" operation and meta_knowledge_graph, and that serves as a platform for distributing common code implementing future operations across any endpoint built using Plater. In addition, with some configuration options (x-trapi parameters, etc.) we can easily register our new instance with SmartAPI.
Another tool that comes in handy with Plater is Automat, which exposes multiple Plater servers at a single public URL and proxies queries to them. Here is an example of a running Automat instance.
Nodes are expected to have the following core structure:
- id: as a neo4j node property with label id
- category: array of biolink types as neo4j node labels. Every node is required to have at least the node label "biolink:NamedThing".
- Additional attributes can be added and will be exposed (more details in the "Matching a TRAPI query" section).
Edges need to have the following property structure:
- subject: as a neo4j edge property with label subject
- object: as a neo4j edge property with label object
- predicate: as the neo4j edge type
- id: as a neo4j edge property with label id
- Additional attributes will be returned in the attributes section of the TRAPI response (more details in the "Matching a TRAPI query" section).
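The node and edge requirements above can be checked before loading a graph. The following is a minimal sketch, not part of Plater: the function names `check_node` and `check_edge` are hypothetical, and nodes and edges are assumed to be available as plain Python values (labels, edge type, and a property dictionary).

```python
# Hypothetical pre-load check; these helpers are not part of Plater.
REQUIRED_NODE_LABEL = "biolink:NamedThing"
REQUIRED_EDGE_PROPERTIES = ("subject", "object", "id")


def check_node(labels, properties):
    """Return a list of problems found in a node's labels and properties."""
    problems = []
    if "id" not in properties:
        problems.append("missing 'id' property")
    if REQUIRED_NODE_LABEL not in labels:
        problems.append(f"missing label {REQUIRED_NODE_LABEL}")
    return problems


def check_edge(edge_type, properties):
    """Return a list of problems found in an edge's type and properties."""
    problems = []
    if not edge_type:
        problems.append("missing edge type (predicate)")
    for key in REQUIRED_EDGE_PROPERTIES:
        if key not in properties:
            problems.append(f"missing '{key}' property")
    return problems
```

An empty list means the node or edge has the core structure described above; any additional properties are simply ignored by the check.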
PLATER matches nodes in neo4j using node labels. It expects nodes in neo4j to be labeled using biolink types, and nodes in neo4j can have multiple labels. When looking up a node from an incoming TRAPI query graph, PLATER extracts the node type(s) and, by traversing the biolink model, uses all subtypes and mixins that go with the query node type(s) to look up nodes.
When encoding node labels in neo4j, it is recommended to use the biolink class genealogy. For instance, a node that is known to be a biolink:SmallMolecule can be assigned all of these classes: ["biolink:SmallMolecule", "biolink:MolecularEntity", "biolink:ChemicalEntity", "biolink:PhysicalEssence", "biolink:NamedThing", "biolink:Entity", "biolink:PhysicalEssenceOrOccurrent"].
With this encoding, an incoming query can be more lax (ask for biolink:NamedThing) or more specific (ask for biolink:SmallMolecule, etc.), and PLATER can use the encoded label information to find the matching node(s).
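To illustrate why encoding the full genealogy helps, here is a small sketch of label-based matching. It is not PLATER's implementation: `node_matches` is a hypothetical helper, and treating a list of query categories as "any of these" is an assumption of the sketch.

```python
# Labels taken from the biolink:SmallMolecule example above.
SMALL_MOLECULE_LABELS = {
    "biolink:SmallMolecule",
    "biolink:MolecularEntity",
    "biolink:ChemicalEntity",
    "biolink:PhysicalEssence",
    "biolink:NamedThing",
    "biolink:Entity",
    "biolink:PhysicalEssenceOrOccurrent",
}


def node_matches(query_categories, node_labels):
    """True if the node carries at least one of the requested categories."""
    return any(category in node_labels for category in query_categories)


# A lax query and a specific query both find the node.
print(node_matches(["biolink:NamedThing"], SMALL_MOLECULE_LABELS))
print(node_matches(["biolink:SmallMolecule"], SMALL_MOLECULE_LABELS))
# An unrelated category does not.
print(node_matches(["biolink:Disease"], SMALL_MOLECULE_LABELS))
```

If the node had been labeled only with biolink:SmallMolecule, the lax query in this sketch would miss it, which is the situation the recommended encoding avoids.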
Similarly, for edges, edge labels in neo4j are used to perform edge lookup. The predicate hierarchy in biolink is consulted to find subclasses of the query predicate type(s), and those are combined with OR to find results.
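The predicate expansion can be pictured with the sketch below. The hierarchy in `TOY_CHILDREN` is hand-written for illustration and is not read from the Biolink Model, the helper names are hypothetical, and the generated pattern is only an example of Cypher's relationship-type alternation, not necessarily the Cypher that Reasoner transpiler emits.

```python
# Hand-written toy hierarchy (an assumption, not the real Biolink Model).
TOY_CHILDREN = {
    "toy:affects": ["toy:affects_activity_of"],
    "toy:affects_activity_of": [
        "toy:increases_activity_of",
        "toy:decreases_activity_of",
    ],
}


def expand_predicate(predicate, children):
    """Return the predicate followed by all of its descendants, depth first."""
    expanded = [predicate]
    for child in children.get(predicate, []):
        expanded.extend(expand_predicate(child, children))
    return expanded


def relationship_pattern(predicates):
    """Build a Cypher relationship pattern that ORs the given edge types."""
    types = "|".join(f"`{p}`" for p in predicates)
    return f"[e:{types}]"


predicates = expand_predicate("toy:affects", TOY_CHILDREN)
print(relationship_pattern(predicates))
```

The backticks are needed because the type names contain a colon; the `|` between types is what gives the OR behavior.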
Plater does subclass inference if subclass edges are encoded in the neo4j graph. For example, let A be a superclass of B and C, and let B and C be related to D and E respectively:
(A) <- biolink:subclass_of - (B) - biolink:decreases_activity_of -> (D)
(A) <- biolink:subclass_of - (C) - biolink:decreases_activity_of -> (E)
Querying for the graph structure (A) - [biolink:decreases_activity_of] -> (?) in TRAPI would give us back nodes D and E.
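The example above can be reproduced on a tiny in-memory edge list. This is only a sketch of the idea, not how Plater queries neo4j; the helper names are hypothetical, and following subclass edges transitively is an assumption of the sketch.

```python
SUBCLASS_OF = "biolink:subclass_of"
DECREASES = "biolink:decreases_activity_of"

# (subject, predicate, object) triples for the example graph.
EDGES = [
    ("B", SUBCLASS_OF, "A"),
    ("C", SUBCLASS_OF, "A"),
    ("B", DECREASES, "D"),
    ("C", DECREASES, "E"),
]


def with_subclasses(node, edges):
    """Return the node together with everything reachable via subclass_of."""
    found = {node}
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in edges:
            if predicate == SUBCLASS_OF and obj == current and subject not in found:
                found.add(subject)
                frontier.append(subject)
    return found


def query_objects(subject, predicate, edges):
    """Answer (subject) -[predicate]-> (?) with subclass inference."""
    subjects = with_subclasses(subject, edges)
    return sorted(o for s, p, o in edges if p == predicate and s in subjects)


print(query_objects("A", DECREASES, EDGES))  # ['D', 'E']
```

A itself has no biolink:decreases_activity_of edge; D and E are returned only because B and C are treated as stand-ins for A.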
Plater tries to resolve attribute types and value types for edges and nodes in the following ways.
- attr_val_map.json: This file has the following structure:
  {
    "attribute_type_map": { "<attribute_name_in_neo4j>": "TRAPI_COMPLIANT_ATTRIBUTE_NAME" },
    "value_type_map": { "<attribute_name_in_neo4j>": "TRAPI_COMPLIANT_VALUE_TYPE" }
  }
  To explain this a little further, suppose we have an attribute called "equivalent_identifiers" stored in neo4j. Our attr_val_map.json would be:
  {
    "attribute_type_map": { "equivalent_identifiers": "biolink:same_as" },
    "value_type_map": { "equivalent_identifiers": "metatype:uriorcurie" }
  }
  When nodes or edges that have equivalent_identifiers are returned, they would have:
  "MONDO:0004969": {
    "categories": [...],
    "name": "acute quadriplegic myopathy",
    "attributes": [
      {
        "attribute_type_id": "biolink:same_as",
        "value": [ "MONDO:0004969" ],
        "value_type_id": "metatype:uriorcurie",
        "original_attribute_name": "equivalent_identifiers",
        "value_url": null,
        "attribute_source": null,
        "description": null,
        "attributes": null
      }
    ]
  }
- In cases where there are attributes in neo4j that are not specified in attr_val_map.json, PLATER will try to resolve a biolink class from the original attribute name using the Biolink model toolkit.
- If the above steps fail, the attribute will be presented with "attribute_type_id": "biolink:Attribute" and "value_type_id": "EDAM:data_0006".
- If there are attributes that are not needed for presentation through TRAPI, Skip_attr.json can be used to specify attribute names in neo4j to skip. KGX loading adds the new attributes provided_by and knowledge_source to nodes and edges respectively, which hold the name of the file used to load the graph. By default, we have included these in the skip list.
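The resolution order described in the list above can be sketched as follows. This is not Plater's code: `resolve_attributes` and its parameters are hypothetical, the Biolink model toolkit step is represented by a caller-supplied function rather than a real toolkit call, the skip list is passed as a plain set because the layout of the skip file is not shown here, and falling back to "EDAM:data_0006" whenever the value type is unmapped is an assumption.

```python
DEFAULT_TYPE = "biolink:Attribute"
DEFAULT_VALUE_TYPE = "EDAM:data_0006"


def resolve_attributes(properties, attr_val_map, skip_names,
                       lookup_biolink=lambda name: None):
    """Turn extra neo4j properties into TRAPI-style attribute dictionaries.

    lookup_biolink stands in for the Biolink model toolkit step: it takes
    the neo4j attribute name and returns a biolink identifier or None.
    """
    type_map = attr_val_map.get("attribute_type_map", {})
    value_map = attr_val_map.get("value_type_map", {})
    attributes = []
    for name, value in properties.items():
        if name in skip_names:
            continue  # listed in the skip list, never presented
        # 1. explicit mapping, 2. toolkit lookup, 3. generic fallback
        type_id = type_map.get(name) or lookup_biolink(name) or DEFAULT_TYPE
        attributes.append({
            "attribute_type_id": type_id,
            "value": value,
            "value_type_id": value_map.get(name, DEFAULT_VALUE_TYPE),
            "original_attribute_name": name,
        })
    return attributes


attr_val_map = {
    "attribute_type_map": {"equivalent_identifiers": "biolink:same_as"},
    "value_type_map": {"equivalent_identifiers": "metatype:uriorcurie"},
}
properties = {
    "equivalent_identifiers": ["MONDO:0004969"],
    "provided_by": "nodes.jsonl",  # hypothetical file name
    "some_unmapped_property": 3,
}
for attribute in resolve_attributes(properties, attr_val_map, {"provided_by", "knowledge_source"}):
    print(attribute)
```

In this sketch `properties` holds only the additional properties; core fields such as id and name are assumed to be handled elsewhere.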
When the PROVENANCE_TAG environment variable is set to something like infores:automat.ctd, PLATER will return provenance information on edges.
To run the web server directly:
cd <PLATER-ROOT>
python<version> -m venv venv
source venv/bin/activate
pip install -r PLATER/requirements.txt
Populate the .env-template file with settings and save it as .env in the repo root dir.
WEB_HOST=0.0.0.0
WEB_PORT=8080
NEO4J_HOST=neo4j
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=<change_me>
NEO4J_HTTP_PORT=7474
NEO4J_QUERY_TIMEOUT=600
PLATER_TITLE='Plater'
PLATER_VERSION='1.5.1'
BL_VERSION='4.1.6'
./main.sh
Or build an image and run it.
cd PLATER
docker build --tag <image_tag> .
cd ../
docker run --env-file .env \
  --name plater \
  -p 8080:8080 \
  --network <network_where_neo4j_is_running> \
  <image_tag>
Clustering with Automat Server [Optional]
You can also serve several instances of Plater through a common gateway (Automat). For specific instructions, please refer to AUTOMAT's readme.
The /about endpoint can be used to present metadata about the current PLATER instance.
This metadata is served from the <repo-root>/PLATER/about.json file. One can edit the contents of this file to suit one's needs. In a containerized environment we recommend mounting this file as a volume.
For example:
docker run -p 0.0.0.0:8999:8080 \
--env NEO4J_HOST=<your_neo_host> \
--env NEO4J_HTTP_PORT=<your_neo4j_http_port> \
--env NEO4J_USERNAME=neo4j \
--env NEO4J_PASSWORD=<neo4j_password> \
--env WEB_HOST=0.0.0.0 \
-v <your-custom-about>:/<path-to-plater-repo-home>/plater/about.json \
--network=<docker_network_neo4j_is_running_at> \
<image_tag>