Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new grpc endpoint to get genome id from accession id #63

Merged
merged 9 commits into from
Dec 18, 2024
Merged

Conversation

jyobhai
Copy link
Contributor

@jyobhai jyobhai commented Dec 9, 2024

Description

To resolve rapid urls with GCA accession, we need an endpoint to fetch this detail from grpc. After consulting with Bilal, we decided to use GenomeBySpecificKeywordRequest. This grpc endpoint returns an array of results from which we are picking up genome id and the release version. Release version is used in resolver to sort and find the latest genome id

Related JIRA Issue(s)

https://www.ebi.ac.uk/panda/jira/browse/ENSWBSITES-2830

Review App URL(s)

NA

Knowledge Base

NA

Checklist

  • Black formatting
  • Tests

Dependencies

No

@@ -148,3 +149,12 @@ def get_dataset_attributes(
genome_uuid=genome_uuid, dataset_type=dataset_type
)
return self.stub.GetAttributesValuesByUUID(dataset_attributes)

def get_genome_by_specific_keyword(self, assembly_accession_id: str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that you are looking up genome id specifically by assembly accession id. If so, you probably shouldn't say by_specific_keyword in the function name, and in the class name in models/genome.py.

Both thoas and the grpc service allow various fields to be used as a "keyword"; but here you are only using a specific one.

@@ -238,3 +239,19 @@ async def get_genome_dataset_attributes(
except Exception as ex:
logging.error(ex)
return response_error_handler({"status": 500})

@router.get("/genomeid/{assembly_accession_id}", name="genome_id")
Copy link
Contributor

@azangru azangru Dec 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not look like a great pathname. Something like /assembly/{assembly_accession_id}/genome_id/ would be closer to what is happening here. We may want to discuss this collectively.

@@ -238,3 +239,21 @@ async def get_genome_dataset_attributes(
except Exception as ex:
logging.error(ex)
return response_error_handler({"status": 500})

@router.get("/genome")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't we say that since genome in other routes has a different shape, this route should be called something other than genome?

Copy link
Contributor

@bilalebi bilalebi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I like how you handeled fetching the latest genome without much code changes


@router.get("/genomeid")
async def get_genome_by_keyword(request: Request, assembly_accession_id: str):
if (assembly_accession_id == ""):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if assembly accession ID doesn’t exist in the response?

Copy link
Contributor

@veidenberg veidenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to update the OpenAPI spec (and maybe even unit tests) to account for the new endpoint but code looks good.

if (genome_by_keyword_object.release_version > latest_genome_by_keyword_object.release_version):
latest_genome_by_keyword_object = genome_by_keyword_object
if (latest_genome_by_keyword_object.genome_uuid):
return latest_genome_by_keyword_object
Copy link
Contributor

@veidenberg veidenberg Dec 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it return a response (e.g. responses.JSONResponse) instead of a pydantic model instance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was doing it at the resolver end. Makes sense to do it here. Will update.

@@ -283,10 +283,41 @@ paths:
example: 'Could not find details for genome 0cbc227a-7179-43c2-8061-a50d94423c4d and dataset homologies.'
'500':
$ref: '#/components/responses/500InternalServerError'
/api/metadata/genomeid/:
get:
summary: Returns a list of genome uuids and its release version
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code returns a single GenomeIDObject, OpenAPI spec expects a list of objects. Which one is correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah true. It sorts and returns one. Good point. Thanks

@jyobhai jyobhai merged commit d4a0be1 into main Dec 18, 2024
@jyobhai jyobhai deleted the new-endpoint branch December 18, 2024 13:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants