GregHydeDartmouth

Added a benchmark for OrphaNet containing rare-disease-to-gene edges. This includes a parser script that generates the data file from either the raw XML file (en_product6.xml) or from http://www.orphadata.org/data/xml/en_product6.xml. We added a library (xmltodict) to setup.py and requirements.txt to make it easier to parse XML into Python objects. data.tsv contains the output benchmark edges that map to our templates.
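For readers who want a rough picture of what such a parser does, here is a minimal sketch. It is not the script from this PR: it uses the standard-library `xml.etree.ElementTree` instead of xmltodict, and the element names (`Disorder`, `OrphaCode`, `Gene`, `Symbol`), the `ORPHA:` prefix, the TSV column names, and the sample records are all assumptions made for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for en_product6.xml.
# Element names and values are placeholders, not real Orphanet records.
SAMPLE_XML = """<JDBOR>
  <DisorderList>
    <Disorder>
      <OrphaCode>1</OrphaCode>
      <Name lang="en">Example disorder A</Name>
      <DisorderGeneAssociationList>
        <DisorderGeneAssociation><Gene><Symbol>GENE_A</Symbol></Gene></DisorderGeneAssociation>
        <DisorderGeneAssociation><Gene><Symbol>GENE_B</Symbol></Gene></DisorderGeneAssociation>
      </DisorderGeneAssociationList>
    </Disorder>
    <Disorder>
      <OrphaCode>2</OrphaCode>
      <Name lang="en">Example disorder B</Name>
      <DisorderGeneAssociationList>
        <DisorderGeneAssociation><Gene><Symbol>GENE_C</Symbol></Gene></DisorderGeneAssociation>
      </DisorderGeneAssociationList>
    </Disorder>
  </DisorderList>
</JDBOR>"""


def extract_edges(xml_text):
    """Return (disease_id, gene_symbol) pairs, one per disorder-gene association."""
    root = ET.fromstring(xml_text)
    edges = []
    for disorder in root.iter("Disorder"):
        code = disorder.findtext("OrphaCode")
        # Collect every gene symbol nested anywhere under this disorder.
        for symbol in disorder.iterfind(".//Gene/Symbol"):
            edges.append((f"ORPHA:{code}", symbol.text))
    return edges


def edges_to_tsv(edges):
    """Serialize edges as tab-separated text with an assumed two-column header."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["disease", "gene"])  # column names are an assumption
    writer.writerows(edges)
    return buf.getvalue()


if __name__ == "__main__":
    print(edges_to_tsv(extract_edges(SAMPLE_XML)), end="")
```

The layout of the real data.tsv depends on the benchmark templates, so the two-column format above should be read as a placeholder only.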

GregHydeDartmouth and others added 2 commits February 15, 2023 12:43
…to be used for benchmarking. Also updated requirements.txt and setup.py to include xmltodict library because XML is a pain and I would rather work with json :)
Collaborator

@maximusunc maximusunc left a comment

Hi Greg, if you're still interested in getting this benchmark merged, could you please resolve the current merge conflicts with the main branch? I'd also like to ask that you remove the additional Python requirement (xmltodict) from requirements.txt and setup.py. Data generation scripts are specific to their datasets, and someone running the benchmarks doesn't need xmltodict to run the tests. Perhaps we could add a requirements-dev.txt listing any requirements that individual dataset generation scripts need?
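One way a generation script could cope with xmltodict living only in a dev requirements file is to import it lazily and fail with a pointer to that file. The sketch below is only an illustration of that idea: the helper name `require_dev_dependency` is hypothetical, and requirements-dev.txt is the file proposed above, which does not exist yet.

```python
import importlib


def require_dev_dependency(module_name, requirements_file="requirements-dev.txt"):
    """Import an optional module, or raise an ImportError that says how to install it.

    Both the helper name and the default requirements file are assumptions.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is needed only for dataset generation; "
            f"install it with: pip install -r {requirements_file}"
        ) from exc
```

A generation script would then call `xmltodict = require_dev_dependency("xmltodict")` at the point where it parses the XML, so that running the benchmarks themselves never triggers the import.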
