Give better chemical labels to returned responses #462
Comments
@gaurav - is this something for Name Resolver? |
Tagging the ace team David and Gaurav. |
It isn't clear what the UI team can do about this issue. Is the idea of a "canonical name" available in the attribute server? @newgene |
I think Jenn means that the returned results were not normalized properly. I used Jenn's PK to load the results back in the ARAX CI UI (note that this is an "old" query and the ARAs are failing validation). I retested on test today and the unusual name still appears. I looked up monoclonal antibody AN100226 in the RENCI Name Resolver and found that the two identifiers are properly pooled together. Natalizumab is among the synonyms but is not the label. I do not know the rule for deciding the drug label, but my guess is that the label is decided at the NodeNorm stage, so is this a NodeNorm issue? EDIT:
|
@sandrine-m the UI does not do any normalization, we use the normalization the ARS provides. The ARS relies on the node normalizer so most likely it is an issue with that service @gaurav @cbizon |
From conversation through slack: |
I think we should move away from tickets with open ended definitions of success. "Give better labels" is way too broad and basically can never be finished. It would be better to create tickets with a finite set of items that should be corrected. |
@gprice1129 If I understand correctly, I think what you're pointing out is that we can't implement this until we define what output is expected, and whether it's possible to do it.
|
Re: deciding on the label in NodeNorm: my understanding is that the label NodeNorm chooses is sometimes not the one users prefer. Although this issue cannot be fixed right away (it is a long-term issue, perhaps needing some user surveys, as Jenn points out), I started a test asset sheet for testing chemical names, based on a few searches I made using the system. Please note that this sheet was made back in November 2023, I think, so the system may have changed since then. The MolePro team was particularly interested in looking at chemical labels chosen differently by MolePro and NodeNorm, to see how we can improve our system. |
@jh111 Having a definition for "better chemical labels" would definitely be a good idea. However, even if we had a perfect definition for "chemical label", it's still unclear when the ticket can be closed: are we talking about all the chemical labels in the system right now, or all of them for all time? In my opinion it would be better to constrain tickets of this nature to some finite set of chemical labels, so whoever is working on it has a clear goal. |
I have put in a better title, to reflect the problem/opportunity with the experience for specific users, and the fact that different users might want different names. There are several technical options for how this could be addressed. For the INN, I think the RxNorm ingredient would be a fine level of detail: for example, inFLIXimab, as opposed to inFLIXimab-abda. I don't think we need to use the uppercase (which is designed for prescription safety). |
I think this is a NodeNorm issue. We display whatever the canonical name is. So, @gaurav, can you tell us what the rules are for this? Then maybe @jh111 can see whether there are examples that are not optimized, and whether optimizing those would break other terms. So, the rubric could change. However, I don't think this is a UI issue. |
Another example of suboptimal labeling is using the name "Activated Charcoal" for carbon. The rule being applied is to get the name from each source and then rank the names by the same source priority that Biolink uses to pick the best CURIE. |
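For anyone following along, here is a minimal sketch of the kind of rule described above: take the label attached to each identifier in a clique and return the one whose prefix ranks highest. The function `pick_label`, the priority lists, and the `CHEMBL.COMPOUND:PLACEHOLDER` identifier are all hypothetical and for illustration only; this is not NodeNorm's actual code or configuration.

```python
from typing import Optional

# Hypothetical prefix orderings, highest priority first.
# These are illustrative assumptions, not NodeNorm's real configuration.
CHEMBL_FIRST = ["CHEMBL.COMPOUND", "CHEBI"]
CHEBI_FIRST = ["CHEBI", "CHEMBL.COMPOUND"]


def pick_label(clique: dict[str, str], priority: list[str]) -> Optional[str]:
    """Return the label of the identifier whose prefix ranks highest."""
    rank = {prefix: i for i, prefix in enumerate(priority)}
    best = None  # (rank, label) of the best candidate seen so far
    for curie, label in clique.items():
        prefix = curie.split(":", 1)[0]
        if not label or prefix not in rank:
            continue  # skip identifiers with no label or an unranked prefix
        if best is None or rank[prefix] < best[0]:
            best = (rank[prefix], label)
    return best[1] if best else None


# Toy clique modeled on the carbon example; the CHEMBL identifier is a placeholder.
clique = {
    "CHEBI:27594": "carbon",
    "CHEMBL.COMPOUND:PLACEHOLDER": "CHARCOAL, ACTIVATED",
}

print(pick_label(clique, CHEMBL_FIRST))
print(pick_label(clique, CHEBI_FIRST))
```

Under these assumptions the chosen label depends only on the ordering passed in, so a single reordering changes the label for every clique that contains both prefixes.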
When you say source, do you mean the original sources or each team within Translator? Would it then be useful to collect the name that each source provides and learn a rule (i.e. a set of weights) that best predicts what users prefer (i.e. the desired result in the test asset sheet)? The idea is that some sources have more user-friendly naming strategies than others and would get higher weights. |
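One simple starting point for this idea, short of learning weights, would be to measure how often each source's label matches the preferred label in the test sheet. The sketch below assumes the sheet can be read as rows of (preferred label, labels by source); `source_agreement`, `SOURCE_A`, `SOURCE_B`, and the rows themselves are made-up placeholders, not data from the actual sheet.

```python
from collections import Counter


def source_agreement(rows):
    """Fraction of rows in which each source's label matches the preferred one."""
    hits, totals = Counter(), Counter()
    for preferred, labels_by_source in rows:
        for source, label in labels_by_source.items():
            totals[source] += 1
            # Case-insensitive comparison, since casing differs between sources.
            if label.casefold() == preferred.casefold():
                hits[source] += 1
    return {source: hits[source] / totals[source] for source in totals}


# Made-up rows for illustration only.
rows = [
    ("carbon", {"SOURCE_A": "CHARCOAL, ACTIVATED", "SOURCE_B": "carbon"}),
    ("natalizumab", {"SOURCE_A": "natalizumab", "SOURCE_B": "Natalizumab"}),
]

rates = source_agreement(rows)
ranking = sorted(rates, key=rates.get, reverse=True)
print(rates)
print(ranking)
```

The resulting rates could serve as a first guess at a source ordering before trying anything more elaborate.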
To deal with the simpler issue first, CHEBI:27594 "CHARCOAL, ACTIVATED" still has the wrong label (should be "carbon"). This is because we prefer CHEMBL.COMPOUND labels over others. I think I've seen other examples of CHEMBL labels being suboptimal; I wonder if we should promote CHEBI above it and see if that improves this situation (it should definitely fix this bug). I'm going to look for other reports of this before deciding whether to try this. Now for the more complex issue: UMLS:C0665297 is present twice in NodeNorm Test -- once in a UMLS-only Protein clique, and once in a UMLS+MESH ChemicalEntity clique. These should really be merged into a single clique, but proteins and chemicals are currently produced by independent modules, so there isn't any way to merge those cliques given how NodeNorm is currently architected.
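To find other cases like the UMLS:C0665297 one, a check along the following lines could list every identifier that appears in more than one clique. This is only a sketch: it assumes the cliques can be loaded as a mapping from a clique name to a set of CURIEs, and `find_shared_identifiers`, the clique names, and `MESH:PLACEHOLDER` are hypothetical.

```python
from collections import defaultdict


def find_shared_identifiers(cliques: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each CURIE found in more than one clique to the cliques containing it."""
    membership = defaultdict(list)
    for clique_name, curies in cliques.items():
        for curie in curies:
            membership[curie].append(clique_name)
    return {
        curie: sorted(names)
        for curie, names in membership.items()
        if len(names) > 1
    }


# Toy input modeled on the case above; the MESH identifier is a placeholder.
cliques = {
    "protein_clique": {"UMLS:C0665297"},
    "chemical_clique": {"UMLS:C0665297", "MESH:PLACEHOLDER"},
}

shared = find_shared_identifiers(cliques)
print(shared)
```

A report like this would not merge anything, but it could give a finite list of cross-module duplicates to review.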
|
Is there a way we can gather all of the examples together to look at the flavors we are talking about? @gaurav Do you have a dart board or a stress ball where you keep all of our complaints (or some other place)? I would be interested in seeing how to break these down and then looking at some examples from each group. |
@Genomewide I started this sheet on my side (perhaps to become a set of tests for @gaurav in the future). It does not contain all examples, and surely Gaurav has a lot more. |
How do I find what to put for MolePro? I added asset #25. |
Thank you for adding a row to the sheet. |
Thanks, @sandrine-muller-research! My list is actually much shorter :) I'll start moving your entries over in Hammerhead. |
Just putting these here in case people are unaware of other convos: |
Thank you Colleen! |
The original example is resolved: Natalizumab is displayed, and nothing is returned for the text searches "monoclonal" or "antibody". I ran a query that has Activated Charcoal in it, and there is only one result, not all of the other forms. So these examples look resolved; there are other tickets for other symptoms. Closing this one for Hammerhead.
Search: What drug may treat Multiple Sclerosis?
https://ui.test.transltr.io/results?l=Multiple%20Sclerosis&i=MONDO:0005301&t=0&q=bf9d0342-0966-4cec-8122-8d87187b1ef3
One of the answers that comes up is Monoclonal antibody an100226.
This is the early name/number for natalizumab. It will be much more helpful for users to have this normalized to the current name, natalizumab.
Options: