Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Looks Like CQS Results in the UI are returned by KPs as "Reasoning Agents" #805

Closed
sstemann opened this issue Jun 25, 2024 · 18 comments
Closed
Assignees
Labels

Comments

@sstemann
Copy link

This is one, example for any MVP1 that CQS responds to you will see this behavior:

  1. In UI Test (I saw similar behavior in CI), run MVP1 > Cystic Fibrosis

https://ui.test.transltr.io/main/results?l=Cystic%20Fibrosis&i=MONDO:0009061&t=0&r=0&q=8b6ff95b-2a6a-4d34-a42f-a6a1a01f8ab8

PK: 8b6ff95b-2a6a-4d34-a42f-a6a1a01f8ab8

Notice - In the UI - Reasoning Agent now has options for Service Provider and MolePro - prior to enabling CQS, only ARAs were options in the Reasoning Agent facet

image
  1. Plop that PK in the ARAX GUI and notice CQS responds (but isnt listed in the UI-UI as a Reasoning Agent)
image
  1. It's easiest to find a somewhat isolate result (ie from MolePro but not from other Reasoning Agents) - so Idrevloride
image
  1. Which looks like this in the ARAX GUI - there are two results, 114 and 639
    The Result Graph for 639 with 1 Support Graph
image
  1. The edge support graph
image

I dont know whats going on or where the problem is, but this can't be right.

@karafecho
Copy link

karafecho commented Jun 25, 2024

Thanks for pointing this out, Sarah.

This is a packed issue that needs to get broken down a bit, but for now, I'll note a few things:

  1. I'm tagging @jdr0887 as he is the lead developer of the CQS.
  2. I'm not sure why the UI is displaying Molecular Data Provider and Service Provider TRAPI as "Reasoning Agents" instead of the CQS. We (Matt, Sierra, Max, Jason, me) thoroughly tested the CQS in CI, including checking the structure of the CQS JSON output, and everything seemed fine.
  3. I'm not sure why there are two results for idrevloride, but the predicates in the edges for the support graphs in the 114 result don't look right to me. That's the only duplicate I see, so my guess is that this is not a global issue.
  4. The 639 result and supporting graph look correct to me. (This result is reflective of one of the CQS treats refactor templates.)

@karafecho
Copy link

Per Gus and Andy (Slack), if the CQS is acting like an ARA, then it needs to the primary knowledge source on inferred edges. That is what is causing MolePro and TRAPI Service Provider to show up as reasoning agents.

Looping in @mbrush, as I'm not sure the suggestion above fit with the "treats refactor" model that he had in mind for the CQS. See this G-sheet.

To Sarah's question about where the CQS results are originating, the CQS is currently supporting 8 "treats refactor" templates (see https://github.com/TranslatorSRI/CQS/tree/main/templates/mvp1-templates), so results from the curated templates for which the CQS was conceptualized (e.g., mvp1-template1-clinical-kps) are being drowned out by the "treats refactor" results. We are considering approaches to rectify this unanticipated effect.

We have a CQS HELP desk call on Friday at 1 pm ET, if you all wish to discuss Sarah's issue then.

@gprice1129
Copy link
Collaborator

gprice1129 commented Jun 26, 2024

I see what you're saying @karafecho. This is confusing to me @mbrush. I'm not sure why the primary_knowledge_source for an edge generated by the CQS would ever by anything other than the CQS.

@karafecho
Copy link

karafecho commented Jun 26, 2024

To clarify my last post and hopefully resolve some confusion expressed in Slack, the issue re primary vs aggregator knowledge source is something that Matt and the UI team need to resolve, but the fact that the CQS is returning responses from, e.g., MolePro, is not an issue but rather an expected result. The fact that results from the curated templates for which the CQS was conceptualized (e.g., mvp1-template1-clinical-kps) are being drowned out by the "treats refactor" results also is not an issue per se, but rather a consequence of scoring that I'm not thrilled with (i.e., as with other ARAs, the "treats refactor" results are based on one-hop inferences and so are being scored higher than the more complex curated results). We are developing alternative approaches such as multiple instances of the CQS to address the scoring impact.

@mbrush
Copy link

mbrush commented Jun 28, 2024

why the primary_knowledge_source for an edge generated by the CQS would ever by anything other than the CQS

This was widely discussed and we felt that since it is the KP that does all the work to define when and how to generate these one-hop predictions that resulted from the 'treats' refactor, and they only use the CQS to 'write down their answer' - that the KP should be credited as the primary source. We did not anticipate the implications this would have in the UI.

I don't think anyone felt so strongly about whether the KP or the CQS was the primary source - so it may be that the simple fix to this is to have the CQS make themselves the primary source, and the KP a supporting data source - so they are still credited for their work.

Alternatively, we keep primary source the KP, but the UI implements a rule that displays the CQS in their 'Reasoning Agent' filter for any prediction coming form the CQS service (regardless of what the TRAPI says is the primary source).

@karafecho
Copy link

karafecho commented Jun 28, 2024

Following up from the CQS HELP desk discussion with Matt, Sierra, Gus, and Jason, here are the action items:

  1. Jason will update the "treats refactor" templates such that the CQS is the primary knowledge source and the KPs that provide the inference rules are the supporting data sources, contingent on Matt's confirmation that the contributing KPs agree to this plan
  2. Jason will update CQS results to indicate KL as "prediction" and AT as "computational model"
  3. [Unrelated to this ticket, but worth noting, just to avoid any further confusion] Jason will update CQS code and templates such that it copies over "biolink:publications" and "biolink:max_research_phase" for the "treats refactor" edges within an attribute block, for propagation from foundational edges contributed by supporting data source(s)
,
    "attribute_type_ids": [
      "biolink:publications",
      "biolink:max_research_phase"
    ]
  }
  1. Matt/Sierra will coordinate with relevant KPs regarding non-CQS KP-specific issues re the "treats" refactor (e.g., redundant ChEMBLs)
  • Part of this effort will involve migrating Clinical Trials KP to Automat and deprecating all of the ChEMBL edges
  1. Kara will bring proposal to Architecture for modified CQS scoring approach (e.g., apply weights to scores for various templates, score top X% results from curated templates higher, create multiple instances of the CQS to support different use cases) in order to surface results from the curated templates that are currently being drowned out by the "treats refactor" results, the goal being to surface the curated template results in the UI so that we have an opportunity to solicit broader community feedback

@karafecho
Copy link

karafecho commented Jul 1, 2024

Update:

Action items (1), (2), and (3) from above post were completed on Friday, 6/28, with fixes tested in DEV on Monday, 7/1. An ITRB request to push to CI has been submitted.

Additional update:

Confirmed on Tuesday, 7/2 that ITRB pushed the latest version of the CQS to CI.

@jdr0887
Copy link

jdr0887 commented Jul 3, 2024

As CQS has 1, 2, & 3 rectified from the bullets above, is there any reason the ui.ci.transltr.io app would not now list/include CQS?

@karafecho
Copy link

karafecho commented Jul 3, 2024

@sstemann and others: The CI environment appears to be very unstable right now. I'm receiving either errors without PKs, incomplete results (meaning the UI never finishes running a query), or weird results that seem to reflect the older instance of the CQS (the version before the one that has been confirmed as deployed in CI). According to Mark, Shervin and ITRB are working on some Jaeger deployment issues in CI, which appear to be the source of the UI CI issues. As such, I don't think any additional testing on our end is warranted until @ShervinAbd92 confirms that the CI environment is stable. Thanks, all!

@ShervinAbd92
Copy link

Hi @karafecho, ARS is stable now and deployed on Jaeger. we still have to figure out how to show async tool traces under their initial trace rather than separated on their own. I will let you know in outages channel when we are going to try for that. you can resume testing on ARS.

@karafecho
Copy link

Great! Thanks for the update, Shervin. I did not know there was an outages channel. I just joined.

@karafecho
Copy link

karafecho commented Jul 9, 2024

Update:

With the CI stability issues and an additional ARS issue resolved, I went ahead and reran the cystic fibrosis query at ui.ci.tranlstr.io. Here is my summary of findings:

  • The CQS is not listed as a Reasoning Agent; rather, Molecular Data Provider and Service Provider TRAPI are, even though those should be displayed as supporting data sources for the primary knowledge source infores:cqs
    o As a result, the answers that are returned by those two providers appear as lookups but with non-treats edges (e.g., in clinical trials for)
  • There is variation in the number of complete results returned by the UI in CI, which may not be surprising, but the differences seem large for three queries that were run sequentially
  • PK1=f297522e-7bb1-49fa-9567-6ebd02ea90d2 [complete results from first query]
    o Molecular Data Provider=45 answers, Service Provider TRAPI=44 answers
  • PK2= 9d3c5404-b2f8-44a9-ad52-7d8e4dae498b [complete results from second query]
    o Molecular Data Provider=19 answers, Service Provider TRAPI=10 answers
  • PK3=4f1a90f0-dcff-4aac-a827-7e26c9a42188 [complete results from third query]
    o Molecular Data Provider=45 answers, Service Provider TRAPI=44 answers
  • The ARAX viewer indicates 134 results for the CQS with all three PKs, with the CQS correctly listed as the primary knowledge source for inferred treats edge and supporting graphs, supporting data sources such as service-provider-trapi shown, KL/AT correctly listed as prediction/computational model, and publication attributes such as max_research_phase correctly listed
  • The raw JSON output also does not list the CQS as an aggregator knowledge source, only a primary knowledge source
  • The drug that Sarah highlighted in her initial post (idrevloride) can be found in the ARAX viewer and raw JSON but not the Translator UI results
  • infores:CQS has been added to the Infores Catalog; an open PR exists for assigning KL/AT (Update infores_catalog.yaml biolink/information-resource-registry#16)

I am not exactly sure what's going on, but I think it would be helpful to have the UI team take a look. @gprice1129 @dnsmith124, @Genomewide, @NishadPrakash, perhaps you all can chime in?

@sstemann
Copy link
Author

when i look at the EPC of the same result from CQS and from BTE - edge sources seem to be mapped differently - I believe ARAs populate aggregator_knowledge_source with their ARA. If CQS needs this to be mapped differently, that should be a discussion to change this standard. While I did CQS vs BTE below, you can see that ARAX also follows this standard if you compare tobramycin between CQS and ARAX. Mannitol (one of the many edge support graphs) from ARAGORN goes through MolePro and you can see the difference in EPC modeling between CQS.

If aggregator_knowledge_source is not required on the inferred path - I think this affects all other ARAs.

Inferred Path CQS BTE
Predicate Treats In Clinical Trials For
primary_knowledge_source: infores: cqs infores:chembl
supporting_data_source: service-provider-trapi
aggregator_knowledge_source mychem-info
aggregator_knowledge_source biothings-explorer
Edge Support Path CQS BTE
Predicate In Clinical Trials For None
primary_knowledge_source: chembl
supporting_data_source:
aggregator_knowledge_source arax
aggregator_knowledge_source mychem-info
aggregator_knowledge_source service-provider-trapi
aggregator_knowledge_source cqs

CQS
image

image

BTE
image

@mbrush
Copy link

mbrush commented Jul 10, 2024

Hi. I've said this before and will say it again now - I think we should pause on testing/reporting any issues for MVP1 related to results based on Chembl and their ingest of clinical trials data. It has become clear from talking with KPs and ARAs that teams are at different stages or implementing the treats refactor (e.g. KG@ has yet to implement), and some inconsistencies remain in how it is being done (e.g. SPOKE not reporting clinical Trials ids appropriately). On top of this, we know that these refactored chembl ingests will all be replaced by the CTKP in the next month or so anyway. Trying to QA during this transition period is IMO not a good use of our time - given that the CTKP swap-out will result in things looking very different (in a good way - cleaner, less duplication, less inconsistency) - and many issues that testers are painstakingly documented will be fixed as a result of this swap-out.

I propose that QA be directed to MVP2 over the next month, and we make a concerted push get the CTKP swap out complete ASAP. Once this has been done and made its way into test, we can shift testing focus back to MVP1.

I am happy to help coordinate the moving parts here - KPs needing to swap out their chembl ingests, Gwenlyn/Multiomics in their CTKP implementation and move over to automat, the CQS team in ensuring that 'treats' predictions continue to get generated correctly, and proper EPC reported after the swap.

Maybe we can discuss on the next TACT call - and agree on a plan that will be most efficient here? I copied my comments above into an issue in the TACT repo to ensure we follow up with it there.

@karafecho
Copy link

karafecho commented Jul 10, 2024

+1 to Matt's comments and proposals. I had suggested a small-group meeting in Slack, but I think Matt's suggestion of bringing the issue(s) to TACT makes more sense.

@gprice1129
Copy link
Collaborator

@jdr0887 in our last call I had told you that the primary_knowledge_source on the edges are the only thing that needs to change for the CQS to show up in the facets but that is incorrect (so sorry about this!). The resource_id attached to the analysis also needs to change to infores:cqs

@sierra-moxon
Copy link
Member

@mbrush - Can we work out the CQS provenance issue so that we can disambiguate this issue from the issue with redundant Chembl ingests?

@sierra-moxon sierra-moxon assigned mbrush and unassigned sierra-moxon Jul 11, 2024
@sierra-moxon sierra-moxon added Fugu (Sprint 4) - due July 19 in CI This ticket will be fixed in CI by the end of Fugu/Sprint 4 (July 19) and removed performance labels Jul 11, 2024
@karafecho
Copy link

Jason addressed all CQS provenance-related issues that were raised in this thread. The other issues relate to ChEMBL and the treats refactor. As such, I'm closing this issue.

@sierra-moxon sierra-moxon removed the Fugu (Sprint 4) - due July 19 in CI This ticket will be fixed in CI by the end of Fugu/Sprint 4 (July 19) label Jul 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

9 participants