Add a prototype of Sample::developmental_stage backfill script #3461
base: dev
Looking good, a couple comments about approach.
Looking forward, I think we will want to update _apply_harmonized_metadata_to_sample so that it handles updates to existing samples separately from new samples. This depends on what the science team says is appropriate.
@@ -94,6 +94,10 @@ def __str__(self):
    created_at = models.DateTimeField(editable=False, default=timezone.now)
    last_modified = models.DateTimeField(default=timezone.now)

    # Auxiliary field for tracking latest metadata update time.
    # Originally added to support Sample::developmental_stage values backfilling.
    last_refreshed = models.DateTimeField(auto_now=True, null=True)
- We will probably want last_refreshed on Experiment as well, since a sample could belong to more than one experiment.
- We probably want to add last_refresh_failure as a timestamp on both as well, to help with re-running.
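To make the re-run idea concrete, here is a minimal sketch in plain Python rather than the actual Django models. Only last_refreshed and last_refresh_failure come from the comment above; RefreshTracked, mark_refreshed, mark_refresh_failed, and needs_refresh are hypothetical names. It assumes a record should be retried when it has never been refreshed, or when its latest failure is newer than its latest success.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class RefreshTracked:
    """Stand-in for a Sample or Experiment row (hypothetical, not the Django model)."""

    accession_code: str
    last_refreshed: Optional[datetime] = None
    last_refresh_failure: Optional[datetime] = None

    def mark_refreshed(self, now: datetime) -> None:
        # Record a successful metadata refresh.
        self.last_refreshed = now

    def mark_refresh_failed(self, now: datetime) -> None:
        # Record a failed attempt so a later run can pick this record up again.
        self.last_refresh_failure = now

    def needs_refresh(self) -> bool:
        # Never refreshed: always a candidate.
        if self.last_refreshed is None:
            return True
        # Refreshed and never failed: nothing to redo.
        if self.last_refresh_failure is None:
            return False
        # Retry only if the most recent attempt was the failure.
        return self.last_refresh_failure > self.last_refreshed


# Usage: a failure followed by a later success clears the retry condition.
start = datetime(2024, 1, 1, tzinfo=timezone.utc)
sample = RefreshTracked("SAMPLE_1")  # placeholder accession code
sample.mark_refresh_failed(start)
sample.mark_refreshed(start + timedelta(hours=1))
```

One caveat for the real model: a DateTimeField declared with auto_now=True is overwritten by Django on every save(), so setting it explicitly as sketched here would need a plain DateTimeField.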
    logger.info(f"Refreshing metadata for a sample {sample.accession_code}")
    try:
        _, sample_metadata = SraSurveyor.gather_all_metadata(sample.accession_code)
        SraSurveyor._apply_harmonized_metadata_to_sample(sample_metadata)
_apply_harmonized_metadata_to_sample takes the sample as its first argument, so sample needs to be passed here ahead of sample_metadata.
One more consideration: after updating the sample, we will want to update the cached values on the experiment, for example:
refinebio/foreman/data_refinery_foreman/surveyor/external_source.py
Lines 207 to 211 in 07d3759
    # Update our cached values
    experiment.update_num_samples()
    experiment.update_sample_metadata_fields()
    experiment.update_platform_names()
    experiment.save()
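Putting the two comments on this hunk together, the flow might look roughly like the sketch below. It uses stand-in classes instead of the real models and surveyor: FakeExperiment, FakeSample, and refresh_sample are hypothetical, and gather_metadata and apply_metadata are injected callables standing in for the SraSurveyor methods. The cache-update calls mirror the external_source.py lines quoted above. It also assumes every experiment the sample belongs to should be refreshed; with the real model, the experiments would presumably be reached through a related manager rather than a plain list.

```python
from typing import Any, Callable, Dict, List


class FakeExperiment:
    """Stand-in for Experiment that records which cache updates ran."""

    def __init__(self, accession_code: str) -> None:
        self.accession_code = accession_code
        self.calls: List[str] = []

    def update_num_samples(self) -> None:
        self.calls.append("update_num_samples")

    def update_sample_metadata_fields(self) -> None:
        self.calls.append("update_sample_metadata_fields")

    def update_platform_names(self) -> None:
        self.calls.append("update_platform_names")

    def save(self) -> None:
        self.calls.append("save")


class FakeSample:
    """Stand-in for Sample; `experiments` is a plain list here."""

    def __init__(self, accession_code: str, experiments: List[FakeExperiment]) -> None:
        self.accession_code = accession_code
        self.experiments = experiments
        self.metadata: Dict[str, Any] = {}


def refresh_sample(
    sample: FakeSample,
    gather_metadata: Callable[[str], Dict[str, Any]],
    apply_metadata: Callable[[FakeSample, Dict[str, Any]], None],
) -> None:
    metadata = gather_metadata(sample.accession_code)
    # The sample goes first, then the harmonized metadata.
    apply_metadata(sample, metadata)
    # Update the cached values on every experiment that contains the sample.
    for experiment in sample.experiments:
        experiment.update_num_samples()
        experiment.update_sample_metadata_fields()
        experiment.update_platform_names()
        experiment.save()


# Usage with trivial stand-ins for the surveyor calls.
experiments = [FakeExperiment("EXP_A"), FakeExperiment("EXP_B")]
sample = FakeSample("SAMPLE_1", experiments)
refresh_sample(
    sample,
    gather_metadata=lambda code: {"developmental_stage": "adult"},
    apply_metadata=lambda s, m: s.metadata.update(m),
)
```

Injecting the two callables keeps the sketch runnable without network access and makes the argument order explicit.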
Issue Number
#3438
Purpose/Implementation Notes
This is a draft/prototype of a Foreman command to use for the Sample::developmental_stage backfill process. The code is untested and supports the SRA source DB only.
Types of changes
What types of changes does your code introduce?
Checklist