Add check for database definition cross references to also be a general cross reference #28523

Open
Tracked by #28763
pgaudet opened this issue Jul 17, 2024 · 10 comments · May be fixed by #29161

Comments

@pgaudet
Contributor

pgaudet commented Jul 17, 2024

(To address #28146 (comment), as requested by @sjm41.)

There are many cases where the definition cross references and the general cross references don't match.

At a minimum, this applies to:

  • RHEA
  • EC
  • MetaCyc
  • KEGG
  • UM-BBD
  • TC

We should have a check that any def xref to these databases should ALSO be a general xref.

Thanks, Pascale
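A minimal sketch of what such a check could look like, assuming the ontology is available as OBO-format text in which def xrefs sit inside the square brackets of the `def:` line and general xrefs are separate `xref:` lines. The function name `missing_general_xrefs`, the treatment of sub-prefixes such as `UM-BBD_pathwayID` as part of their database family, and the example IDs are all illustrative assumptions; this is not the check that was eventually implemented in the repository.

```python
import re

# Database families named in this issue. Sub-prefixes such as
# UM-BBD_pathwayID are assumed (not confirmed) to belong to their family.
CHECKED_PREFIXES = ("RHEA", "EC", "MetaCyc", "KEGG", "UM-BBD", "TC")

# def: "text" [XREF, XREF, ...]
DEF_RE = re.compile(r'^def:\s*"(?:[^"\\]|\\.)*"\s*\[([^\]]*)\]')
# xref: XREF (anything after the first token is ignored)
XREF_RE = re.compile(r'^xref:\s*(\S+)')


def _unescape(token):
    """Turn OBO escapes such as '\\,' back into the plain character."""
    return re.sub(r'\\(.)', r'\1', token)


def _is_checked(xref):
    """True if the xref's database prefix is one of the checked families."""
    db = xref.split(":", 1)[0]
    return any(db == p or db.startswith(p + "_") for p in CHECKED_PREFIXES)


def missing_general_xrefs(stanza):
    """Return def xrefs (checked databases only) with no matching xref: line."""
    def_xrefs, general = [], set()
    for line in stanza.splitlines():
        line = line.strip()
        m = DEF_RE.match(line)
        if m:
            # Split on commas that are not escaped with a backslash.
            for part in re.split(r'(?<!\\),', m.group(1)):
                part = part.strip()
                if part:
                    def_xrefs.append(_unescape(part.split()[0]))
            continue
        m = XREF_RE.match(line)
        if m:
            general.add(_unescape(m.group(1)))
    return [x for x in def_xrefs if _is_checked(x) and x not in general]


# Hypothetical stanza: the term ID and xref IDs are made up.
example = r'''[Term]
id: GO:0000001
name: hypothetical example activity
def: "A made-up definition." [MetaCyc:FOO-RXN, EC:1.1.1.1, PMID:12345]
xref: EC:1.1.1.1
'''
print(missing_general_xrefs(example))  # ['MetaCyc:FOO-RXN']
```

In this sketch, references such as PMIDs are ignored because their prefix is not in the checked list, and escaped commas are handled because several MetaCyc IDs contain commas.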

@sjm41
Contributor

sjm41 commented Jul 18, 2024

Hi @pgaudet

> We should have a check that any def xref to these databases should ALSO be a general xref.

Agreed, though I think my #28146 (comment) was for a slightly different check - i.e. a check for terms that have the same MetaCyc/KEGG (or indeed RHEA/EC) definition xref.

But I guess your suggested check will effectively do the same thing - if we ensure that a def xref ALSO appears as a general xref, then any duplicate def xrefs will get flagged up by the existing checks for duplicate general xrefs.

So, all is probably fine!

@pgaudet
Contributor Author

pgaudet commented Jul 19, 2024

Right, this is what I meant - once all def xrefs are also general xrefs then we only need to check general xrefs for duplications.
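The reasoning above can be sketched in a few lines: once every def xref is mirrored as a general xref, a single pass over general xrefs is enough to find any xref shared by more than one term. This is only an illustration under stated assumptions. The function name `duplicate_general_xrefs`, the input shape (a mapping from term ID to its general xrefs), and the IDs are hypothetical, and the repository's existing duplicate check is not shown here.

```python
from collections import defaultdict


def duplicate_general_xrefs(term_xrefs):
    """Map each general xref used by more than one term to those term IDs."""
    terms_by_xref = defaultdict(set)
    for term_id, xrefs in term_xrefs.items():
        for xref in xrefs:
            terms_by_xref[xref].add(term_id)
    return {
        xref: sorted(terms)
        for xref, terms in terms_by_xref.items()
        if len(terms) > 1
    }


# Hypothetical input: term IDs and xref IDs are made up.
term_xrefs = {
    "GO:0000001": {"MetaCyc:FOO-PWY", "EC:1.1.1.1"},
    "GO:0000002": {"MetaCyc:FOO-PWY"},
    "GO:0000003": {"RHEA:00000"},
}
print(duplicate_general_xrefs(term_xrefs))
# {'MetaCyc:FOO-PWY': ['GO:0000001', 'GO:0000002']}
```

A real check would probably need to allow xrefs that are deliberately shared between terms, for example as broadMatch or narrowMatch xrefs.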

@pgaudet
Contributor Author

pgaudet commented Nov 7, 2024

Google spreadsheet with errors:
https://docs.google.com/spreadsheets/d/1jMfBoiSRrT-Ycx6TEA9laLDLL9Tj1cFqp9V9WX3mcyk/edit?gid=0#gid=0

Def xrefs and general cross refs don't match (306 errors)

@pgaudet
Contributor Author

pgaudet commented Nov 7, 2024

Obsolete:

pgaudet added a commit that referenced this issue Nov 13, 2024
pgaudet added a commit that referenced this issue Nov 13, 2024
@sjm41
Contributor

sjm41 commented Nov 14, 2024

Fixes cases where MetaCyc xref on def is obsolete at MetaCyc:

| GO ID | def xref | Notes |
| --- | --- | --- |
| GO:0047527 | MetaCyc:1,3-DIHYDROXYBENZOATE--SERINE-LIGASE-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0047517 | MetaCyc:1,4-BETA-D-XYLAN-SYNTHASE-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0047026 | MetaCyc:1.1.1.213-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0047042 | MetaCyc:1.1.1.50-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0050589 | MetaCyc:1.14.11.19-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0034078 | MetaCyc:125-PWY | Obsolete at MetaCyc - remove from GO def. |
| GO:0047229 | MetaCyc:2.4.1.158-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0047234 | MetaCyc:2.4.1.166-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0047238 | MetaCyc:2.4.1.175-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0047244 | MetaCyc:2.4.1.187-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0047254 | MetaCyc:2.4.1.202-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0050510 | MetaCyc:2.4.1.226-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0047269 | MetaCyc:2.4.1.70-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0047276 | MetaCyc:2.4.1.87-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0047356 | MetaCyc:2.7.8.14-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0047525 | MetaCyc:2'-HYDROXYDAIDZEIN-REDUCTASE-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0047404 | MetaCyc:3.2.1.56-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0047573 | MetaCyc:4-ACETAMIDOBUTYRATE-DEACETYLASE-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0047468 | MetaCyc:5.4.2.5-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0047623 | MetaCyc:ADENOSINE-PHOSPHATE-DEAMINASE-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0047631 | MetaCyc:ADP-RIBOSE-PYROPHOSPHATASE-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0043039 | MetaCyc:Aminoacyl-tRNAs | Obsolete at MetaCyc - remove from GO def. |
| GO:0047763 | MetaCyc:CAFFEATE-O-METHYLTRANSFERASE-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0061681 | MetaCyc:ENTNER-DOUFDOROFF-PWY-III | Obsolete at MetaCyc - remove from GO def. |
| GO:0047895 | MetaCyc:FORMALDEHYDE-DISMUTASE-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0047930 | MetaCyc:GLUCOSAMINATE-AMMONIA-LYASE-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0050417 | MetaCyc:GLUTAMINASE-ASPARAGIN-ASE-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0047725 | MetaCyc:INULOSUCRASE-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0048000 | MetaCyc:ISOFLAVONE-3'-HYDROXYLASE-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0017084 | MetaCyc:PROLINE-MULTI | Obsolete at MetaCyc - remove from GO def. |
| GO:0019517 | MetaCyc:PWY-901 | Obsolete at MetaCyc - remove from GO def. |
| GO:0046499 | MetaCyc:S-ADENOSYLMETHIONINAMINE | Obsolete at MetaCyc - remove from GO def. |
| GO:0050306 | MetaCyc:SUCROSE-1F-FRUCTOSYLTRANSFERASE-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0050309 | MetaCyc:SUGAR-TERMINAL-PHOSPHATASE-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0050316 | MetaCyc:T2-INDUCED-DEOXYNUCLEOTIDE-KINASE-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0050350 | MetaCyc:TRIHYDROXYSTILBENE-SYNTHASE-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0050369 | MetaCyc:TYROSINE-3-MONOOXYGENASE-KINASE-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0050402 | MetaCyc:XYLONO-14-LACTONASE-RXN | Obsolete at MetaCyc - remove from GO def. |
| GO:0047979 | MetaCyc:HEXOSE-OXIDASE-RXN | Obsolete at MetaCyc - remove from GO def. Add MetaCyc:RXN-17229, RXN-13740, RXN-13741 as narrowMatch xref. Also, rename GO:0046562 glucose oxidase activity to beta-D-glucose oxidase activity. |
| GO:0047792 | MetaCyc:CYANOHYDRIN-BETA-GLUCOSYLTRANSFERASE-RXN | Obsolete at MetaCyc - remove from GO def. Add MetaCyc:RXN-743 as dbxref. |
| GO:0047403 | MetaCyc:3.2.1.140-RXN | Obsolete at MetaCyc - remove from GO def. Add MetaCyc:RXN-8731 as xref. |

sjm41 added a commit that referenced this issue Nov 14, 2024
Removing obsolete MetaCyc xrefs as def xrefs. #28523
@sjm41
Contributor

sjm41 commented Nov 14, 2024

Fixing other cases where MetaCyc def xref needs to be removed:

| GO ID | def xref | Notes |
| --- | --- | --- |
| GO:0097294 | MetaCyc:IMP-DEHYDROG-RXN | Remove from GO def. (Already correctly on GO:0003938 MF term) |
| GO:0047266 | MetaCyc:2.4.1.53-RXN | Remove from GO def. (Associated with EC:2.4.1.70 at MetaCyc - correctly an xref on GO:0047269.) |
| GO:0005960 | MetaCyc:GCVMULTI-CPLX | Remove from GO def. (Complex entry at MetaCyc) |
| GO:0034079 | MetaCyc:BUTANEDIOL | Remove from GO def. (Compound entry at MetaCyc) |
| GO:0034077 | MetaCyc:BUTANEDIOL | Remove from GO def. (Compound entry at MetaCyc) |
| GO:0034078 | MetaCyc:BUTANEDIOL | Remove from GO def. (Compound entry at MetaCyc) |
| GO:0019434 | MetaCyc:DIGLUCODIACETYL-DOCOSANOATE | Remove from GO def. (Compound entry at MetaCyc) |
| GO:0006675 | MetaCyc:MIPC | Remove from GO def. (Compound entry at MetaCyc) |
| GO:0019329 | MetaCyc:AMMOXID-PWY | Remove from GO def. (Correctly associated with child GO term.) |
| GO:0019417 | MetaCyc:FESULFOX-PWY | Remove from GO def. (Correctly associated with child GO term.) |
| GO:1901790 | MetaCyc:HCAMHPDEG-PWY | Remove from GO def. (Correctly associated with child GO term.) |
| GO:0019329 | MetaCyc:P303-PWY | Remove from GO def. (Correctly associated with child GO term.) |
| GO:0019329 | MetaCyc:PWY-2242 | Remove from GO def. (Correctly associated with child GO term.) |
| GO:0033388 | MetaCyc:PWY-46 | Remove from GO def. (Correctly associated with child GO term.) |
| GO:0033331 | MetaCyc:PWY-5032 | Remove from GO def. (Correctly associated with child GO term.) |
| GO:0033383 | MetaCyc:PWY-5122 | Remove from GO def. (Correctly associated with child GO term.) |
| GO:0033329 | MetaCyc:PWY-5320 | Remove from GO def. (Correctly associated with child GO term.) |
| GO:0033302 | MetaCyc:PWY-5321 | Remove from GO def. (Correctly associated with child GO term.) |
| GO:0046494 | MetaCyc:PWY-761 | Remove from GO def. (Correctly associated with child GO term.) |
| GO:0004779 | MetaCyc:SULFATE-ADENYLYLTRANS-RXN | Remove from GO def. (Correctly associated with child GO term.) |
| GO:0004779 | MetaCyc:SULFATE-ADENYLYLTRANSFERASE-ADP-RXN | Remove from GO def. (Correctly associated with child GO term.) |
| GO:0019417 | MetaCyc:SULFUROX-PWY | Remove from GO def. (Correctly associated with child GO term.) |
| GO:0033321 | MetaCyc:PWY-1186 | Remove from GO def. (Correctly associated with child GO term.) Change name of the child term (GO:0033322) to "L-homomethionine..." to match MetaCyc |
| GO:0035446 | MetaCyc:RXN1G-4 | Remove from GO def. (Correctly associated with different GO term.) |
| GO:0019665 | MetaCyc:Fermentation | Remove from GO def. (Correctly associated with parent GO term.) |
| GO:0019662 | MetaCyc:Fermentation | Remove from GO def. (Correctly associated with parent GO term.) |
| GO:0033386 | MetaCyc:PWY-5120 | Remove from GO def. (single-step - correctly associated with GO:0004311 MF term.) |
| GO:0033385 | MetaCyc:PWY-5120 | Remove from GO def. (single-step - correctly associated with GO:0004311 MF term.) |
| GO:0033499 | MetaCyc:PWY-3821 | Remove from GO def. (Term already has correct xref of PWY-6317) |

sjm41 added a commit that referenced this issue Nov 14, 2024
Removing incorrect MetaCyc xrefs as def xrefs. #28523
@sjm41
Contributor

sjm41 commented Nov 14, 2024

Fixing cases where MetaCyc xref in def needs to be added as dbxref:

| GO ID | def xref | Notes |
| --- | --- | --- |
| GO:1901794 | MetaCyc:PWY0-1277 | Add as broadMatch xref |
| GO:1901791 | MetaCyc:HCAMHPDEG-PWY | Add as broadMatch xref. |
| GO:1901782 | MetaCyc:PWY-5273 | Add as dbxref |
| GO:1901815 | MetaCyc:PWY-5288 | Add as dbxref |
| GO:1901812 | MetaCyc:PWY-5943 | Add as dbxref |
| GO:1901827 | MetaCyc:PWY-5944 | Add as dbxref |
| GO:1901830 | MetaCyc:PWY-6288 | Add as dbxref |
| GO:0090638 | MetaCyc:PWY-6825 | Add as dbxref |
| GO:1901780 | MetaCyc:PWY-6915 | Add as dbxref |
| GO:1901774 | MetaCyc:PWY-6955 | Add as dbxref |
| GO:1901802 | MetaCyc:PWY-6992 | Add as dbxref |
| GO:1901158 | MetaCyc:PWY-7016 | Add as dbxref |
| GO:0090640 | MetaCyc:PWY-7470 | Add as dbxref |
| GO:1990055 | MetaCyc:RXN-8990 | Add as dbxref |
| GO:1901833 | MetaCyc:PWY-6809 | Add as dbxref (and rename/redefine/change LD to match current MetaCyc "neoxanthin biosynthesis") |
| GO:1901866 | MetaCyc:PWY-5174 | Add as dbxref (PWY-5174 = capsanthin and capsorubin biosynthesis, so should be broadMatch on just GO:1901866 capsorubin biosynthetic process and GO:1901809 capsanthin biosynthetic process) |
| GO:1901809 | MetaCyc:PWY-5174 | Add as dbxref (PWY-5174 = capsanthin and capsorubin biosynthesis, so should be broadMatch on just GO:1901866 capsorubin biosynthetic process and GO:1901809 capsanthin biosynthetic process) |
| GO:0019287 | MetaCyc:PWY-922 | Add as dbxref, and remove from GO:0010142 |
| GO:0033389 | MetaCyc:PWY-40 | Add as dbxref & tweak LD. |
| GO:0033390 | MetaCyc:PWY-43 | Add as dbxref & tweak LD. |
| GO:1901824 | MetaCyc:PWY-5946 | Add as dbxref. Also need to redefine/rename/change LD to reflect MetaCyc ("alpha carotene") |
| GO:0033387 | MetaCyc:PWY-46 | Add as dbxref. Also need to redefine/rename/change LD to reflect MetaCyc ("from arginine, via ornithine") |
| GO:0006601 | MetaCyc:GLYCGREAT-PWY | Add as dbxref. Remove from GO:0019463 and obsolete that term (no annotations) |
| GO:1901788 | MetaCyc:PWY-1361 | Add as narrow dbxref |
| GO:1901788 | MetaCyc:P321-PWY | Add as narrow dbxref, remove from GO:0018874 |
| GO:1901788 | MetaCyc:CENTBENZCOA-PWY | Add as narrow dbxref, remove from GO:0018875 |
| GO:0008153 | MetaCyc:PWY-6543 | Add as narrowMatch dbxref (and rename to match MetaCyc - "4-aminobenzoate biosynthesis"). Also add PWY-8276 and PWY-8281 as narrowMatch. |
| GO:0019664 | MetaCyc:FERMENTATION-PWY | Add this as dbxref, remove from GO:0006113 |
| GO:0006113 | MetaCyc:Fermentation | Add this as dbxref, remove MetaCyc:FERMENTATION-PWY |

@sjm41
Contributor

sjm41 commented Nov 15, 2024

Fixing remaining MetaCyc xref issues:

| GO ID | def xref | Notes |
| --- | --- | --- |
| GO:0090439 | MetaCyc:RXN-12183 | Change def xref to MetaCyc:RXN-16407 |
| GO:0006601 | MetaCyc:PWY-6158 | One-step reaction - add as relatedMatch on GO:0004111 (MF) |
| GO:0052776 | MetaCyc:PWY-6855 | PWY-6855 is "chitin degradation I (archaea)" and is currently a narrowMatch xref on GO:0006032 chitin catabolic process. So, make GO:0052776 a child of GO:0006032 and edit xrefs as appropriate. Also need to redefine/rename/change LD of GO:0052776 to match current PWY-6855 description. |
| GO:1901864 | MetaCyc:PWY-5174 | Remove from GO def (PWY-5174 = capsanthin and capsorubin biosynthesis, so should be broadMatch on just GO:1901866 capsorubin biosynthetic process and GO:1901809 capsanthin biosynthetic process) |
| GO:1901865 | MetaCyc:PWY-5174 | Remove from GO def (PWY-5174 = capsanthin and capsorubin biosynthesis, so should be broadMatch on just GO:1901866 capsorubin biosynthetic process and GO:1901809 capsanthin biosynthetic process) |
| GO:1901807 | MetaCyc:PWY-5174 | Remove from GO def (PWY-5174 = capsanthin and capsorubin biosynthesis, so should be broadMatch on just GO:1901866 capsorubin biosynthetic process and GO:1901809 capsanthin biosynthetic process) |
| GO:1901808 | MetaCyc:PWY-5174 | Remove from GO def (PWY-5174 = capsanthin and capsorubin biosynthesis, so should be broadMatch on just GO:1901866 capsorubin biosynthetic process and GO:1901809 capsanthin biosynthetic process) |
| GO:0033359 | MetaCyc:DAPLYSINESYN-PWY- | Remove trailing hyphen |

sjm41 added a commit that referenced this issue Nov 15, 2024
Fixing MetaCyc def xrefs as db xrefs. #28523
@sjm41
Contributor

sjm41 commented Nov 15, 2024

GO:0140859 | RHEA:35811 | RHEA:35811 is a specific reaction; remove from def xref and replace with current dbxref RHEA:35407.

sjm41 added a commit that referenced this issue Nov 15, 2024
sjm41 added a commit that referenced this issue Nov 15, 2024
@sjm41
Contributor

sjm41 commented Nov 15, 2024

Fixing UM-BBD def xrefs:

| GO ID | def xref | Notes |
| --- | --- | --- |
| GO:0008695 | UM-BBD_enzymeID:e0307 | Replace current UM-BBD dbxref with this one. |
| GO:0009390 | UM-BBD_enzymeID:r0207 | This is a complex - replace with PMID:21357619 and PMID:3280546 |
| GO:0018901 | UM-BBD_pathwayID:2,4d | Remove UM-BBD xrefs from this term and move to catabolic child (GO:0046300) - dbxref should be '2,4-d' |
| GO:0019490 | UM-BBD_pathwayID:abs | Add as dbxref; replace def xref with MetaCyc:2ASDEG-PWY. Remove UM-BBD_pathwayID:abs from parent GO:0018868 |
| GO:0019381 | UM-BBD_pathwayID:atr | Add as dbxref; replace def xref with MetaCyc:PWY-5724. Remove UM-BBD_pathwayID:atr from parent GO:0018873 |
| GO:0019383 | UM-BBD_pathwayID:cam | Add as dbxref; replace def xref with MetaCyc:P601-PWY. Remove UM-BBD_pathwayID:cam from parent GO:0018882 |
| GO:0018983 | UM-BBD_pathwayID:car | Replace this def xref with dbxref of UM-BBD_pathwayID:pao |
| GO:0042201 | UM-BBD_pathwayID:cpm | Remove from def - already correctly a def xref & dbxref on child. |
| GO:0042199 | UM-BBD_pathwayID:cya | Remove from def - already correctly a def xref & dbxref on child. |
| GO:0019497 | UM-BBD_pathwayID:ghch | Add as narrowMatch dbxref |
| GO:0019497 | UM-BBD_pathwayID:hch | Add as narrowMatch dbxref |
| GO:0042212 | UM-BBD_pathwayID:mcr | Remove from def - already correctly a dbxref on child. |
| GO:0019339 | UM-BBD_pathwayID:pthn | Add as dbxref; replace def xref with MetaCyc:PARATHION-DEGRADATION-PWY. Remove UM-BBD_pathwayID:pthn from parent GO:0018952 |
| GO:0018970 | UM-BBD_pathwayID:tol | Remove this def xref and add as def xref and dbxref on GO:0042203 toluene catabolic process |
| GO:0019119 | UM-BBD_reactionID:r0496 | Add as narrowMatch dbxref |
| GO:0019118 | UM-BBD_reactionID:r0535 | Add as narrowMatch dbxref |
| GO:0019118 | UM-BBD_reactionID:r0536 | Add as narrowMatch dbxref |
| GO:0019119 | UM-BBD_reactionID:r0560 | Add as narrowMatch dbxref |
| GO:0018594 | UM-BBD_reactionID:r0615 | Add as dbxref |
| GO:0018591 | UM-BBD_reactionID:r1023 | Add as dbxref |
| GO:0018755 | UM-BBD_reactionID:r1414 | Add as dbxref |

sjm41 added a commit that referenced this issue Nov 15, 2024
sjm41 added a commit that referenced this issue Nov 15, 2024