Using incompatible synonym sets #62

patrickkwang · 2021-08-21T17:43:35Z

We have a shared service that provides CURIE synonyms/equivalences: https://nodenormalization-sri-dev.renci.org/1.1/docs

However, we do not insist that all Translator components use it. There are at least a few components that use synonym sets that are inconsistent, in places, with those provided by the SRI normalizer.

This results in one common but minor issue where an ARA is unable to verify the results it receives from a KP, if the two components use different synonym sets. If the ARA simply trusts the KP, this could result in some odd knowledge states wherein entities are conflated or de-conflated(?) in unexpected ways. That's a bit troubling, but not a deal-breaker.

Recently, however, we've run into a more jarring issue arising from inconsistent synonymization. For computational convenience, TRAPI allows batching queries, typically by providing a list of CURIEs in the "ids" field of a query-graph node. When an ARA sends a batched query to a KP, it must afterward un-batch the response by identifying which results correspond to which sub-queries. This may be impossible if the KP and ARA use different synonymization schemes.

Example:

1. ARA asks for genes associated with CHEBI:24996 or CHEBI:6801 (a batch query with two sub-queries).
2. KP maps CHEBI:24996 to CHEMBL.COMPOUND:CHEMBL330546 and returns results including the latter CURIE.
3. ARA tries to map CHEMBL.COMPOUND:CHEMBL330546 to one of the two input CURIEs and fails: it knows what CHEMBL.COMPOUND:CHEMBL330546 is, and just doesn't believe it to be synonymous with CHEBI:24996.
4. ARA has no choice but to drop all of the CHEMBL.COMPOUND:CHEMBL330546 results.

In this case, potential results were lost because the ARA and KP did not agree on synonym sets.
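
The failure in the example above can be reproduced with a small sketch. Everything here is hypothetical: the `unbatch` helper and the two synonym tables are invented for illustration, and the tables only mirror the disagreement described in the example rather than the contents of any real normalizer.

```python
# Hypothetical synonym tables mirroring the example; not real normalizer output.
KP_SYNONYMS = {
    "CHEBI:24996": {"CHEBI:24996", "CHEMBL.COMPOUND:CHEMBL330546"},
    "CHEBI:6801": {"CHEBI:6801"},
}
ARA_SYNONYMS = {
    "CHEBI:24996": {"CHEBI:24996"},
    "CHEBI:6801": {"CHEBI:6801"},
}


def unbatch(input_curies, result_curies, synonyms):
    """Assign each result CURIE to the single input CURIE it is a synonym of.

    Results that match no input CURIE (or more than one) cannot be
    attributed to a sub-query and are returned separately as dropped.
    """
    assigned = {curie: [] for curie in input_curies}
    dropped = []
    for result in result_curies:
        matches = [
            curie for curie in input_curies
            if result in synonyms.get(curie, {curie})
        ]
        if len(matches) == 1:
            assigned[matches[0]].append(result)
        else:
            dropped.append(result)
    return assigned, dropped


inputs = ["CHEBI:24996", "CHEBI:6801"]
results = ["CHEMBL.COMPOUND:CHEMBL330546", "CHEBI:6801"]

# Un-batching with the ARA's own synonym sets loses the CHEMBL result.
ara_assigned, ara_dropped = unbatch(inputs, results, ARA_SYNONYMS)

# Un-batching with the KP's synonym sets would have kept it.
kp_assigned, kp_dropped = unbatch(inputs, results, KP_SYNONYMS)
```

With the ARA's table the CHEMBL result ends up in the dropped list; with the KP's table it is attributed to CHEBI:24996. The un-batching logic is identical in both cases, so the loss comes entirely from the disagreement between the synonym sets.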

cbizon · 2021-08-23T13:44:19Z

After some discussion with Patrick, it seems like there are three independent ways to fix this:

1. Require that KP responses don't change the input CURIE.
2. Require that all KPs use the same normalization scheme.
3. Modify how TRAPI handles batch requests, e.g. move from a single message with a list of CURIEs to a list of messages, each with a single CURIE.
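
As a rough illustration of the third option, a batched query could be split on the client side into one query per CURIE, so that each response is attributable to its sub-query without any synonym lookup. This is only a sketch: `split_batch` is a hypothetical helper, and the query layout (`message` > `query_graph` > `nodes`, plus the node keys, edge and category) is an assumed shape for the example; only the "ids" field on a query-graph node comes from the discussion above.

```python
import copy


def split_batch(query, node_key):
    """Return one copy of `query` per CURIE listed in the given node's "ids".

    The result maps each input CURIE to a query whose node carries only
    that CURIE, so responses can be matched to sub-queries by construction.
    """
    ids = query["message"]["query_graph"]["nodes"][node_key].get("ids", [])
    singles = {}
    for curie in ids:
        single = copy.deepcopy(query)
        single["message"]["query_graph"]["nodes"][node_key]["ids"] = [curie]
        singles[curie] = single
    return singles


# Assumed query shape, for illustration only.
batched = {
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {"ids": ["CHEBI:24996", "CHEBI:6801"]},
                "n1": {"categories": ["biolink:Gene"]},
            },
            "edges": {"e0": {"subject": "n0", "object": "n1"}},
        }
    }
}

singles = split_batch(batched, "n0")
```

The trade-off is that this gives up the computational convenience that motivated batching in the first place: the KP receives one request per CURIE instead of one request in total.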

patrickkwang added the "bug" label Aug 21, 2021

patrickkwang mentioned this issue Sep 9, 2021

Unambiguous (un)batching NCATSTranslator/ReasonerAPI#293

Closed
