-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bioformats2raw metadata support #174
base: master
Are you sure you want to change the base?
Conversation
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## master #174 +/- ##
==========================================
- Coverage 84.79% 83.90% -0.89%
==========================================
Files 13 14 +1
Lines 1473 1591 +118
==========================================
+ Hits 1249 1335 +86
- Misses 224 256 +32
☔ View full report in Codecov by Sentry. |
Pointing ome_zarr at a non-root node will now walk-up the hierarchy until the root node is discovered.
This allows the definition of node specs outside of this repository. Primary driver is for experimenting with specs which depend on ome_types, etc. These may eventually be pulled into the mainline.
See https://github.com/ome/ome-zarr-metadata/releases/tag/0.1.0 for an example of an entrypoint. After creating a fake .zgroup under the output of
Notice:
|
found.append(Well(self)) | ||
self.specs.append(found[-1]) | ||
|
||
for key, value in entrypoints.get_group_named("ome_zarr.spec").items(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this do? I see no file named "ome_zarr.spec"
and I've not used entrypoints
before. I don't get much idea from https://github.com/takluyver/entrypoints as to what it does, only that "This package is in maintenance-only mode. New code should use the importlib.metadata module
".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this do?
It's essentially a namespace for the particular entrypoint. It's used at
https://github.com/ome/ome-zarr-metadata/blob/main/setup.cfg#L81
@will-moore, you approve of getting the siblings (assuming napari can be fixed)? If so, I'll look why it's only for the one row. Also, where do you get the labels for this plate? Is this the one you had a script for? |
@will-moore, I've reverted the upwards parsing. It seemed like a good strategy but there are currently too many edge cases. I don't have labels on plates for testing at the moment, but I think with the current state along with ome/ome-zarr-metadata@08e12f7#diff-0bb17e0ecb4ac83835ee3800a1af71a12f644b0ce782c623ba97f8917916250eR54 all the following should be true:
The only other change I can think of is if you pass a group that previously did nothing, it will likely try to load the contents. |
In discussing today with @dgault, @sbesson, @jburel and @melissalinkert, there was a case made for at least adding the flag ( |
To improve the codecov results, see https://github.com/zarr-developers/numcodecs/pull/300/files#diff-bc37cd9860eec1facdc18a47798e8a1a2c0ef5dabd999deee049de4a48a5d35fR1 for an option of in-repo testing of entrypoints. |
@joshmoore To help address the "don't have labels on plates for testing", I created https://gist.github.com/will-moore/0f4cb6b1fdd60a255fcbb956a54a645e which adds labels to a plate (currently assumes images axes are I don't know if I'm missing something, maybe not using |
fd64b1d
to
9ad2e5a
Compare
see a quick use of this functionality: |
Migrated the bf2raw implementation from https://github.com/ome/ome-zarr-metadata :
|
Initially prepared as a plugin in https://github.com/ome/ome-zarr-metadata this is being moved to the main repository since it is a part of the main spec.
_logger.info("found %s", series) | ||
subnode = node.add(series) | ||
if subnode: | ||
subnode.metadata["ome-xml:index"] = idx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably the biggest lingering question I have since this basically becomes public API e.g. used in napari-ome-zarr. Thoughts welcome. cc: @will-moore @sbesson et al.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder where this public API is documented?
This new module should be added to https://github.com/ome/ome-zarr-py/blob/master/docs/source/api.rst but I think these changes in parsing also need to be described elsewhere, maybe at https://ome-zarr.readthedocs.io/en/stable/python.html#reading-ome-ngff-images
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Big 👍 but first do you think what's there is usable, understandable, extensible, etc.?
subnode = node.add(series) | ||
if subnode: | ||
subnode.metadata["ome-xml:index"] = idx | ||
subnode.metadata["ome-xml:image"] = image |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you add the following here, then you don't need ome/napari-ome-zarr#47
subnode.metadata["name"] = image.name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But then plugins would/could overwrite the main metadata, right? Which means it becomes a question of the order that the plugins run in. Would:
subnode.plugins["bioformats2raw"].metadata["name"]
be better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As probably transpired from #140, I am also unclear on the contract and expectation for the node.metadata
field. My impression was that this API has been largely driven by the napari use-case.Reading this, it looks like another goal is to introduce some form of metadata consolidation so that a consumer does not have to go up and down the hierarchy to assemble pieces of metadata, is that correct?
If so, I agree what is missing is some form of namespace to differentiate the metadata injected at different levels in the hierarchy. Taking advantage of the fact this is a Python dictonary, another proposal would be to group all the metadata under a top-level key e.g. subnode.metadata["bioformats2raw"] = {"index": idx, "image": image", "name", image.name}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.Reading this, it looks like another goal is to introduce some form of metadata consolidation so that a consumer does not have to go up and down the hierarchy to assemble pieces of metadata, is that correct?
Maybe. But now that we have the json-schema, I could also see just attaching objects at the right node level to prevent re-reading the file. That's essentially what would happen if the ome-types OME
object becomes part of the bioformats2raw plugin's public API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my mind I might expect something like Josh's original idea or subnode.metadata["ome.xml"] = {"index": idx, "image": image", "name", image.name}
since this is metadata coming from the ome.xml
?
The reason I like subnode.metadata["name"] = image.name
is that any consumer doesn't have to know about bioformats2raw
or xml etc to have the correct image name.
But if it's just napari-ome-zarr
and we already have ome/napari-ome-zarr#47 then 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is that any consumer doesn't have to know about bioformats2raw or xml etc to have the correct image name.
This possibly gets back to the question of whether or not this bf2raw parsing is core or not. If it is, then you're probably right that we should just encode the rules for what "wins" directly. If this is a plugin, though, then how would we handle misbehavior in the plugins?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is a plugin here? If it's a plugin, then it's not installed by default and you get to choose if you want to add it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This possibly gets back to the question of whether or not this bf2raw parsing is core or not
Trying to answer this, this is where I would draw the line:
- if a specification is published in https://ngff.openmicroscopy.org/, I consider it as
core
(even iftransitional
) and my expectation is that this library should support it - a contrario, any specification not defined in the upstream specification should be seen as an extension and should be handled by some plugin/third-party mechanism. This might drive the discussions on the extensibility of this library which is a good thing.
Should we also discuss the name of the module itself? |
We added an |
Other than the string |
Ah - yes, too late to change the |
This pull request has been mentioned on Image.sc Forum. There might be relevant details there: https://forum.image.sc/t/intermission-ome-ngff-0-4-1-bioformats2raw-0-5-0-et-al/72214/1 |
This pull request has been mentioned on Image.sc Forum. There might be relevant details there: https://forum.image.sc/t/saving-volumetric-data-with-voxel-size-colormap-annotations/85537/24 |
This pull request has been mentioned on Image.sc Forum. There might be relevant details there: https://forum.image.sc/t/constructing-zarr-for-napari-ome-zarr/65398/4 |
Part of the investigation of metadata in ome/ngff#104. This "implicit" group is the cheapest form of collection imaginable.
Currently, only groups within the given group (and not arrays or explicit files) will be further parsed.