Unambiguous (un)batching #293

patrickkwang · 2021-09-09T16:28:43Z

NCATSTranslator/TranslatorArchitecture#62

We would like to make sure that results can be unambiguously matched to specific query-graph CURIEs/predicates.

patrickkwang · 2021-09-09T16:51:31Z

Idea: do more explicit batching. Do not allow lists of CURIEs/predicates, but allow sending multiple messages per payload.

Pros:

unambiguous linking of results to CURIEs/predicates
allows batching of differently-structured messages, e.g. one-hops with two-hops

Cons:

potentially lots of duplication, big payloads
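
To make the trade-off concrete, here is a minimal sketch of what explicit batching could look like on the client side. It assumes TRAPI-style qnodes with an `ids` list; the `unbatch` helper and the `{"messages": [...]}` envelope are hypothetical and not part of any TRAPI release.

```python
from copy import deepcopy
from itertools import product


def unbatch(query_graph):
    """Split a batched query graph into one message per combination of single CURIEs."""
    # Collect the qnodes that carry a list of CURIEs.
    batched = [
        (key, node["ids"])
        for key, node in query_graph["nodes"].items()
        if node.get("ids")
    ]
    messages = []
    for combo in product(*(ids for _, ids in batched)):
        # Every message repeats the full graph structure (the duplication cost).
        qg = deepcopy(query_graph)
        for (key, _), curie in zip(batched, combo):
            qg["nodes"][key]["ids"] = [curie]
        messages.append({"message": {"query_graph": qg}})
    return messages


batched_qg = {
    "nodes": {
        "n00": {"ids": ["MONDO:123", "MONDO:456"]},
        "n01": {"categories": ["biolink:ChemicalEntity"]},
    },
    "edges": {"e00": {"subject": "n01", "object": "n00"}},
}

# Hypothetical multi-message envelope: one entry per CURIE.
payload = {"messages": unbatch(batched_qg)}
```

Each entry in `payload["messages"]` pins n00 to exactly one CURIE, so any result can be traced back to the CURIE that produced it, but the unpinned nodes and the edges are copied into every message.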

edeutsch · 2021-09-16T07:57:07Z

I think a NodeBinding.original_id could solve the problem reasonably well.
Suppose the client's original query had qnode n00 with ids ['MONDO:123', 'MONDO:456'], and the server had no information on those exact CURIEs but did have information on descendants. The server could put descendants MONDO:9876 and MONDO:8765 in the KG and, when binding them to n00, set NodeBinding.original_id='MONDO:123' to let the client know that, although the CURIE is different, it is binding MONDO:9876 to what the client originally called MONDO:123.
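
A rough server-side sketch of that idea follows. The `DESCENDANTS` table and the `bind_with_original_id` helper are hypothetical stand-ins for whatever ontology lookup a server actually uses, and the pairing of MONDO:8765 with MONDO:456 is assumed purely for illustration.

```python
# Hypothetical lookup: query CURIE -> descendants the server has data for.
DESCENDANTS = {
    "MONDO:123": ["MONDO:9876"],
    "MONDO:456": ["MONDO:8765"],  # assumed pairing, for illustration only
}


def bind_with_original_id(query_ids, descendants):
    """Build node bindings that record which query CURIE each KG node answers."""
    bindings = []
    for query_id in query_ids:
        for kg_id in descendants.get(query_id, []):
            bindings.append({"id": kg_id, "original_id": query_id})
    return bindings


node_bindings = {
    "n00": bind_with_original_id(["MONDO:123", "MONDO:456"], DESCENDANTS)
}
```

The client can then read `original_id` directly instead of walking the ontology to work out which of its CURIEs a descendant belongs to.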

edeutsch · 2021-09-16T17:32:46Z

It seems like there are 3 options:

1. Server always uses the client's CURIEs
2. Use the NodeBinding.original_id as described above
3. Servers should return node normalized CURIEs and client should use Node Normalize to associate with CURIE they asked about

Options 1 and 3 do not solve the problem of subclass inference.
Option 2 may be too heavy-handed to require in all cases.

patrickkwang · 2021-09-16T17:34:10Z

A slight variation on idea 2: rather than an additional field, use a sort of extended qnode id in bindings. For example,

"node_bindings": {
    "n00.MONDO:123": [{"id": "MONDO:9876"}],
...

marcdubybroad · 2021-09-16T17:38:31Z

What about

"node_bindings": { "n00": {"id": {"MONDO:123": "MONDO:9876"}}, ...

vdancik · 2021-09-16T17:41:41Z

Should we use query_id and knowledge_graph_id to make it explicit which id we refer to?

brettasmi · 2021-09-16T17:57:43Z

FYI from the Architecture README:

KPs that implement the Translator Reasoner API must perform the following kinds of reasoning in answering queries:

Making identifiers more specific, e.g. responding to a query involving an entity with information related to a subclass of that entity. In the knowledge_graph portion of the response, the more-specific identifier must be present and linked to the less-specific identifier. In the results portion of the response, the more-specific response node will be bound to the less-specific query node.

edeutsch · 2021-09-16T18:03:56Z

We acknowledge that the above strategy from Architecture can be followed and interpreted successfully for unbatched queries (with a single id). But for batched queries (which seem to be the most popular these days), following the Architecture guideline leaves the client with a very hard time interpreting the results, unless it also does some complex ancestor matching. NodeBinding.query_id seems like a good way to solve that, but it won't appear until TRAPI 1.3. It remains to be seen when we will be able to release TRAPI 1.3 with that fix. We will decide after the relay.

edeutsch · 2021-11-04T17:07:55Z

Current design:

        "node_bindings": {
          "n00": [
            {
              "id": "CHEMBL.COMPOUND:CHEMBL112"
            }
          ],
          "n01": [
            {
              "id": "UniProtKB:P05181"
            }
          ]
        },

First proposal:

        "node_bindings": {
          "n00": [
            {
              "id": "CHEMBL.COMPOUND:CHEMBL112",
              "original_id": "CHEMBL.COMPOUND:CHEMBL100"
            }
          ],
          "n01": [
            {
              "id": "UniProtKB:P05181"
            }
          ]
        },

(in the case where there are two input curies and one is a descendant of another, just have two entries in the list)

Second proposal:

        "node_bindings": {
          "n00": [
            {
              "id": { "CHEMBL.COMPOUND:CHEMBL112": "CHEMBL.COMPOUND:CHEMBL100" }
            }
          ],
          "n01": [
            {
              "id": "UniProtKB:P05181"
            }
          ]
        },

Note that one node in the KG could be a descendant of two input ids.

edeutsch · 2021-11-04T17:28:03Z

Suppose there are two input query ids, 100 and 200. The server returns 212, which is a descendant of both 100 and 200.
First proposal:

    "node_bindings": {
      "n00": [
        {
          "id": "CHEMBL.COMPOUND:CHEMBL212",
          "query_id": "CHEMBL.COMPOUND:CHEMBL100"
        },
        {
          "id": "CHEMBL.COMPOUND:CHEMBL212",
          "query_id": "CHEMBL.COMPOUND:CHEMBL200"
        }
      ],
      "n01": [
        {
          "id": "UniProtKB:P05181",
          "query_id": null
        }
      ]
    },

(in the case where there are two input curies and one is a descendant of another, just have two entries in the list)
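
On the client side, un-batching under the first proposal could be as simple as the sketch below. The `group_by_query_id` helper is hypothetical; it only assumes the binding shape shown above, and it keeps a null `query_id` as the key `None` rather than guessing what it refers to.

```python
from collections import defaultdict


def group_by_query_id(bindings):
    """Map each query CURIE to the KG node ids that were bound on its behalf."""
    grouped = defaultdict(list)
    for binding in bindings:
        # A missing or null query_id is kept under the key None.
        grouped[binding.get("query_id")].append(binding["id"])
    return dict(grouped)


node_bindings = {
    "n00": [
        {"id": "CHEMBL.COMPOUND:CHEMBL212", "query_id": "CHEMBL.COMPOUND:CHEMBL100"},
        {"id": "CHEMBL.COMPOUND:CHEMBL212", "query_id": "CHEMBL.COMPOUND:CHEMBL200"},
    ],
    "n01": [{"id": "UniProtKB:P05181", "query_id": None}],
}

by_query = {
    qnode: group_by_query_id(bindings)
    for qnode, bindings in node_bindings.items()
}
```

Here `by_query["n00"]` has one key per input CURIE, each pointing at CHEMBL212, so no ancestor matching is needed to split the batched result.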

edeutsch · 2021-11-04T17:28:57Z

Suppose there are two input query ids, 100 and 200. The server returns 212, which is a descendant of both 100 and 200.
Second proposal:

    "node_bindings": {
      "n00": [
        {
          "id": { "CHEMBL.COMPOUND:CHEMBL212": "CHEMBL.COMPOUND:CHEMBL100" }
        },
        {
          "id": { "CHEMBL.COMPOUND:CHEMBL212": "CHEMBL.COMPOUND:CHEMBL200" }
        }
      ],
      "n01": [
        {
          "id": { "UniProtKB:P05181": null }
        }
      ]
    },

Note that one node in the KG could be a descendant of two input ids.

edeutsch · 2021-11-04T17:48:13Z

Third proposal:

"node_bindings": {
  "n00": {
      "CHEMBL.COMPOUND:CHEMBL212": {
         "query_id": [ "CHEMBL.COMPOUND:CHEMBL100", "CHEMBL.COMPOUND:CHEMBL200" ]
       }
   },
  "n01": {
      "UniProtKB:P05181": null
  }
},
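
For comparison, the first and third proposals appear to carry the same information in different shapes. The sketch below converts a first-proposal binding list into the third-proposal layout; `to_third_proposal` is a hypothetical helper written only to illustrate the mapping.

```python
def to_third_proposal(bindings):
    """Regroup a list of {id, query_id} bindings into {id: {"query_id": [...]}}."""
    out = {}
    for binding in bindings:
        kg_id = binding["id"]
        query_id = binding.get("query_id")
        if query_id is None:
            # No query CURIE to report: keep the node with a null value.
            out.setdefault(kg_id, None)
        else:
            entry = out.get(kg_id) or {"query_id": []}
            entry["query_id"].append(query_id)
            out[kg_id] = entry
    return out


first_proposal = {
    "n00": [
        {"id": "CHEMBL.COMPOUND:CHEMBL212", "query_id": "CHEMBL.COMPOUND:CHEMBL100"},
        {"id": "CHEMBL.COMPOUND:CHEMBL212", "query_id": "CHEMBL.COMPOUND:CHEMBL200"},
    ],
    "n01": [{"id": "UniProtKB:P05181", "query_id": None}],
}

third_proposal = {
    qnode: to_third_proposal(bindings)
    for qnode, bindings in first_proposal.items()
}
```

The practical difference is that the third layout states each KG node once per qnode, while the first keeps the existing list-of-bindings structure and repeats the node.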

cbizon · 2021-11-04T17:57:47Z

Proposal 4:

"node_bindings": {
  "n00": {
      "CHEMBL.COMPOUND:CHEMBL212": [
          {"query_id": "CHEMBL.COMPOUND:CHEMBL100"},
          {"query_id": "CHEMBL.COMPOUND:CHEMBL200"}
      ]
  },
  "n01": {
      "UniProtKB:P05181": null
  }
},

vgardner-renci · 2021-11-04T17:57:51Z

As of 10:30 11/18
Proposal 1: 6 pts
Proposal 2: -2 pts
Proposal 3: 0 pts
Proposal 4: 3 pts

Heart = 2 pts
Thumbs up = 1 pt
Thumbs down = -1 pt

edeutsch · 2022-09-08T17:35:24Z

addressed in PR #304

colleenXu mentioned this issue Sep 17, 2021

Add rate limit and batch-size limit NCATSTranslator/translator_extensions#11

Merged

kennethmorton mentioned this issue Nov 18, 2021

Prototype potential potential changes for many to many mappings ranking-agent/strider#341

Closed

edeutsch mentioned this issue Dec 1, 2021

Fix NodeBinding description and add NodeBinding.query_id #304

Merged

vdancik added this to the v1.3 milestone Aug 25, 2022

edeutsch closed this as completed Sep 8, 2022
