Synonym sync: acronym case exception #671
Open
Overview
Addressed an issue where we preferred the Mondo capitalization even when it was incorrect, i.e. cases where the synonym is an acronym that is capitalized in the source but not in Mondo.
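The exception described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the function name `choose_synonym_case` and the example strings are hypothetical, and "acronym" is approximated as "all cased characters are upper case" via `str.isupper()`.

```python
def choose_synonym_case(source: str, mondo: str) -> str:
    """Pick which capitalization to keep for a synonym.

    Hypothetical helper: Mondo's casing wins by default, except when the two
    strings differ only by case and the source looks like an acronym
    (all upper case) while Mondo's version does not.
    """
    # The exception only makes sense when the strings differ by case alone.
    same_text = source.casefold() == mondo.casefold()
    # str.isupper() is True when every cased character is upper case
    # and there is at least one cased character.
    if same_text and source.isupper() and not mondo.isupper():
        return source
    return mondo


# Made-up example values, for illustration only.
print(choose_synonym_case("SCID", "Scid"))
print(choose_synonym_case("Night blindness", "night blindness"))
```

With this rule, an ordinary mixed-case source synonym still defers to Mondo; only an all-caps source paired with a non-all-caps Mondo value flips the preference.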
Pre-merge checklist
Documentation
Was the documentation added/updated under docs/?
QC
Was the full pipeline run before submitting this PR using sh run.sh make build-mondo-ingest on this branch (after docker pull obolibrary/odkfull:dev), and no errors occurred?
Build:
New Packages
Were any new Python packages added?
Were any other non-Python packages added?
PR Review and Conversations Resolved
Has the PR been sufficiently reviewed by at least 1 team member of the Mondo Technical team and all threads resolved?
Additional information
Context
@twhetzel and I discussed this at our last 1:1. Sabrina had done some recent curation and noticed that sometimes we would actually prefer the source's capitalization rather than Mondo's. I looked at the Google sheet, at all of the values where Use Source Case (Curator Review) == Source, and saw that these were all acronyms. They were all cases where the source was all caps and Mondo was not.
Results
I ran a before/after comparison, using DO as my test case, and looked at the differences in the outputs. There were differences in doid.synonyms.confirmed.robot.tsv and doid.synonyms.updated.robot.tsv. I examined them and the outputs are as I expected. Here are the diffs (FYI I added column headers at the top):