[WIP] Support plate loading for varying well sizes (ref #240) #241

jluethi · 2023-01-19T10:42:21Z

Addresses #240, see there for a description of the issue

This PR adds a second code path to the get_pyramid_lazy function. Its goal is to load the metadata of each well to be able to check the pyramid sizes. It then handles varying well sizes by padding every well to the max well size.

I kept the old code path in for the moment and added a loading parameter to the get_pyramid_lazy where the user can specify "new" or "default" to make it easier to compare the two workflows. It's set to "new" at the moment, so this PR follows the new behavior by default for testing.

What this PR should achieve:

Handle wells with different xy dimensions
Handle wells with different Z levels
Adds some padding between wells

Handles, but not optimal:

It handles wells with varying channel numbers. That is not really an intended behavior and may be a bit confusing, because it does not know which channel is which and just pads them together if the channel numbers vary. That was a side-effect of writing the axis handling very general, i.e. not making assumptions on the number of dimensions our images would have (besides x & y being the last 2).

Downsides

The initial loading of a plate gets much slower with this PR. It introduces an overhead of ~1s per well in my setup. As such, if the assumptions hold that all wells have the same dimensions, this PR is a downgrade and thus should not be the default way to load plates.
2 thoughts on this:

Can we provide alternate loading modes and expose them to the user somehow e.g. in napari? Any idea how we'd do this?
This use case would argue for an (optional) consolidation of metadata at the plate level to allow for faster loading, see also discussions here: consolidate Plate to preview image ngff#141 If that were possible, the performance hit may become negligible. It should be easy to refactor this PR to load from a different place, as all the well spec loading takes place in get_plate_well_specs, which is only called once.
What this part would require is access to consolidate info about the image pyramid shapes.

for more information, see https://pre-commit.ci

codecov · 2023-01-19T10:47:04Z

Codecov Report

Base: 84.46% // Head: 82.94% // Decreases project coverage by -1.52% ⚠️

Coverage data is based on head (20c35e6) compared to base (db0f5dc).
Patch coverage: 72.72% of modified lines in pull request are covered.

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #241      +/-   ##
==========================================
- Coverage   84.46%   82.94%   -1.52%     
==========================================
  Files          13       13              
  Lines        1474     1548      +74     
==========================================
+ Hits         1245     1284      +39     
- Misses        229      264      +35

Impacted Files	Coverage Δ
ome_zarr/reader.py	`80.61% <72.72%> (-5.44%)`	⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

will-moore · 2023-01-19T11:37:04Z

Thanks for the PR: not having looked at the code yet, but one thought on how to choose the optimal loading mode...

You could imagine an optional flag in the plate.zarr/.zattrs to indicate "well_dimensions_not_equal": true, and if you find this is true then you use the new loading.
This wouldn't be a breaking change, since the current assumption is that all well dimensions are equal.
cc @sbesson @chris-allan @melissalinkert

jluethi · 2023-01-23T08:27:02Z

Thanks, great idea @will-moore on the "well_dimensions_not_equal": true attribute. If the overall approach is something you'd consider in that way, happy to implement that and clean things up a bit more.

Any idea why I would have tests failing on python 3.8 and 3.9 for parts that are not related to code changes I've made? Do you also see this in other places or should I dig through those tests to understand if there is any dependency (i.e. them using the plate writer somewhere?)

sbesson · 2023-01-23T09:59:37Z

From my side, there is definitely an argument for consolidating some metadata at the top-level especially for complex structures like plates. This echoes what was done for the wells field for instance as a lot of this data could be theoretically recomputed on-the-fly at the expense of several attribute reads.

You could imagine an optional flag in the plate.zarr/.zattrs to indicate "well_dimensions_not_equal": true, and if you find this is true then you use the new loading.

My primary issue is that this name is highly unspecific. Which dimensions are we talking about: all of the dimensions of each multiscale image? XY(Z) only? the channels? the number of fields of view per well?

jluethi · 2023-01-23T10:45:36Z

From my side, there is definitely an argument for consolidating some metadata at the top-level especially for complex structures like plates. This echoes what was done for the wells field for instance as a lot of this data could be theoretically recomputed on-the-fly at the expense of several attribute reads.

Good point. What I currently see is that if we can consolidate the pyramid shape informations on the plate level for each well, then I would not need to loop through all the wells and this implementation should be about as fast as the original one, just more general (i.e. handles the case when any of the dimensions aren't equal by padding).

My primary issue is that this name is highly unspecific. Which dimensions are we talking about: all of the dimensions of each multiscale image? XY(Z) only? the channels? the number of fields of view per well?

Fair point. I'd see this as transitionary until we have the pyramid shape metadata at the plate level and thus this could become default loading behavior. I actually implement the checks for all the dimensions of an image (tczyx). If any well dimension is potentially unequal, one needs to load this metadata beforehand to figure it out. The expensive part is loading the additional metadata, then one can address any dimension that varies with padding. As such, it is actually a somewhat general solution to this.

It doesn't address loading multi-field of view, as this also isn't a part of the current implementation for other reasons (and we switched to using single images for all field of views combined, see discussion here: ome/ngff#137).

imagesc-bot · 2023-03-16T23:12:30Z

This pull request has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/napari-napari-ome-zarr-plugin-is-not-registered/78482/16

imagesc-bot · 2023-03-17T09:01:09Z

This pull request has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/napari-napari-ome-zarr-plugin-is-not-registered/78482/18

joshmoore · 2023-05-15T11:27:40Z

@jluethi : What's your plan with this PR? I moved it to milestone 0.8.0 to signify that it wasn't slotted for 0.7.0 but I don't want to put promises into your mouth :)

jluethi · 2023-05-16T11:13:22Z

Hey @joshmoore
For this to really work well, we'd need to aggregate some metadata on the sizes of individual wells on the plate level. I now tested it with a 384 well plate with irregularly shaped wells and it takes ~15 minutes (in this rough implementation) to collect all the metadata. This clearly isn't an acceptable trade-off. Not sure yet when I'll get to pushing the idea of aggregated metadata on the plate level.

It does actually load the images though, instead of just crashing during loading. I see two options of how a cleaned-up version could actually be proposed to be merged: Either via a environment env flag set or with an optional flag set at the plate level.

What's the rough 0.8.0 timeline anyway?

joshmoore · 2023-05-16T11:21:23Z

There's no hard timeline for 0.8.0. It's more of on a need basis. But the metadata aggregation could require a spec change, no?

jluethi · 2023-05-16T11:23:46Z

But the metadata aggregation could require a spec change, no?

Yes, I assume that would require a spec change. So I think a more performant version of this would only come after such a spec change.
The current, slow loading may still be preferable to just failing, so I could see a point for making this an optional path that can be used, if there's interest for it. And maybe there's a bit of optimization room to make it not quite as slow, not sure.

will-moore · 2023-05-16T11:57:07Z

Seems like an environment flag is the easiest option (it doesn't need to be added to the plate data itself - good for old plates). Also it won't clash or be confused with future spec changes.

jluethi and others added 5 commits January 18, 2023 16:09

[BROKEN] Initial refactor work to load wells with varying shapes

8b312a9

Add padding of wells to max size

6d08f1d

Reestablish plate layout

e9cc3a4

Cleanup and add back old code path as optional way

eba8622

[pre-commit.ci] auto fixes from pre-commit.com hooks

20c35e6

for more information, see https://pre-commit.ci

jluethi mentioned this pull request Jan 20, 2023

OME-Zarr visualization with varying levels of Z stacks fractal-analytics-platform/fractal-tasks-core#52

Closed

will-moore mentioned this pull request Feb 17, 2023

FOV names and array names are hard-coded for the HCS dataset reader #248

Closed

jluethi mentioned this pull request Mar 21, 2023

Collections Specification ome/ngff#31

Open

joshmoore modified the milestone: 0.8.0 May 3, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Support plate loading for varying well sizes (ref #240) #241

[WIP] Support plate loading for varying well sizes (ref #240) #241

jluethi commented Jan 19, 2023

codecov bot commented Jan 19, 2023

will-moore commented Jan 19, 2023

jluethi commented Jan 23, 2023

sbesson commented Jan 23, 2023

jluethi commented Jan 23, 2023

imagesc-bot commented Mar 16, 2023

imagesc-bot commented Mar 17, 2023

joshmoore commented May 15, 2023

jluethi commented May 16, 2023

joshmoore commented May 16, 2023

jluethi commented May 16, 2023

will-moore commented May 16, 2023

[WIP] Support plate loading for varying well sizes (ref #240) #241

Are you sure you want to change the base?

[WIP] Support plate loading for varying well sizes (ref #240) #241

Conversation

jluethi commented Jan 19, 2023

What this PR should achieve:

Handles, but not optimal:

Downsides

codecov bot commented Jan 19, 2023

Codecov Report

will-moore commented Jan 19, 2023

jluethi commented Jan 23, 2023

sbesson commented Jan 23, 2023

jluethi commented Jan 23, 2023

imagesc-bot commented Mar 16, 2023

imagesc-bot commented Mar 17, 2023

joshmoore commented May 15, 2023

jluethi commented May 16, 2023

joshmoore commented May 16, 2023

jluethi commented May 16, 2023

will-moore commented May 16, 2023