Add Xarray sub-package #1013

vincentsarago · 2024-10-29T21:18:25Z

Takes over from #1007

To Do

reviews
add Docs

vincentsarago · 2024-10-29T21:20:29Z

src/titiler/xarray/titiler/xarray/main.py

+    md.router,
+    prefix="/md",
+    tags=["Multi Dimensional"],
+)


TODO: remove this before merging

titiler.xarray should be seen as a plug-in to titiler and not as an application itself. We will add an example showing how to build an application using the endpoint factory.

vincentsarago · 2024-10-29T21:21:23Z

src/titiler/xarray/titiler/xarray/io.py

+    if "x" not in da.dims and "y" not in da.dims:
+        try:
+            latitude_var_name = next(
+                x for x in ["lat", "latitude", "LAT", "LATITUDE", "Lat"] if x in da.dims


do we need to support other variable names?
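
One way to cover more spellings without growing the hard-coded list would be a case-insensitive match. This is only a sketch: `find_dim` and the candidate tuples are hypothetical names, not part of this PR, and a plain tuple stands in for `da.dims`.

```python
from typing import Iterable, Optional

# Candidate names, lowercase; hypothetical lists for illustration only.
LATITUDE_NAMES = ("lat", "latitude")
LONGITUDE_NAMES = ("lon", "lng", "longitude")


def find_dim(dims: Iterable[str], candidates: Iterable[str]) -> Optional[str]:
    """Return the first dimension whose lowercased name is a candidate, else None."""
    wanted = set(candidates)
    return next((d for d in dims if d.lower() in wanted), None)


# A plain tuple standing in for `da.dims`.
dims = ("time", "LAT", "Lon")
lat_name = find_dim(dims, LATITUDE_NAMES)
lon_name = find_dim(dims, LONGITUDE_NAMES)
```

Returning `None` instead of raising `StopIteration` lets the caller decide how to report a dataset with no recognizable spatial dimensions.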

vincentsarago · 2024-10-29T21:23:13Z

src/titiler/xarray/titiler/xarray/dependencies.py

+
+
+@dataclass(init=False)
+class CompatXarrayParams(DefaultDependency):


this is not used directly in titiler.xarray but could be useful in a Tiler that wants to support both GDAL and Xarray datasets

vincentsarago · 2024-10-29T21:23:39Z

src/titiler/xarray/titiler/xarray/factory.py

+
+
+@define(kw_only=True)
+class TilerFactory(BaseTilerFactory):


By sub-classing titiler.core.factory.TilerFactory we avoid re-writing code

vincentsarago · 2024-10-29T21:24:10Z

src/titiler/xarray/titiler/xarray/factory.py

+
+    # remove some attribute from init
+    img_preview_dependency: Type[DefaultDependency] = field(init=False)
+    add_preview: bool = field(init=False)


we remove those 2 attributes because we don't support /preview endpoints
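
The same idea can be illustrated with the standard library. Note this sketch swaps attrs (used in the PR) for `dataclasses`, and `BaseFactory` / `NoPreviewFactory` are hypothetical stand-ins rather than the real titiler classes. Redeclaring a field with `init=False` removes it from the generated `__init__`, so callers can no longer switch it on.

```python
from dataclasses import dataclass, field


@dataclass
class BaseFactory:
    """Stand-in for a parent factory that exposes a /preview switch."""

    router_prefix: str = ""
    add_preview: bool = True


@dataclass
class NoPreviewFactory(BaseFactory):
    """Child factory: `add_preview` is fixed and no longer accepted by __init__."""

    add_preview: bool = field(init=False, default=False)


factory = NoPreviewFactory(router_prefix="/md")

# Passing the removed argument is rejected by the generated __init__.
try:
    NoPreviewFactory(add_preview=True)
    rejected = False
except TypeError:
    rejected = True
```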

vincentsarago · 2024-10-29T21:25:19Z

src/titiler/xarray/pyproject.toml

+    "aiohttp",
+    "pandas",
+    "httpx",
+]


todo: update rio-tiler to >=7.1

vincentsarago · 2024-10-29T21:26:46Z

src/titiler/xarray/titiler/xarray/factory.py

+            return Response(content, media_type=media_type)
+
+    # custom /statistics endpoints (remove /statistics - GET)
+    def statistics(self):


☝️ IMO having a full-dataset /statistics endpoint is a bit dangerous (as with the /preview endpoints), which is why we only support GeoJSON statistics

vincentsarago · 2024-10-29T21:28:15Z

you can try this with

uvicorn titiler.xarray.main:app --port 8080 --reload

maxrjones

Thanks for all your work here, @vincentsarago!

This is a very opinionated take, but I think titiler-xarray would be best off with two separate routes, each with its own set of optional dependencies. The first route would be zarr, which would open Zarr and virtual Zarr datasets using xarray.open_zarr. The second route would be md, which would open any dataset readable by xarray.open_dataset.

The primary reason I think we should do this is that it would enable us to incentivize virtualizing datasets into zarr, which would lead to much faster tile generation. We could do this by:

Having all query parameters in the zarr route only relevant for open_zarr, simplifying API usage.
Automatically detecting virtual datasets, removing the need for the reference parameter.
Lightening the image size for titiler-xarray deployments only using zarr because other readers would not be installed (and eventually obstore and/or icechunk could be used instead of the fsspec dependencies)

This would also simplify non-zarr usage for the following reasons:

Zarr specific parameters (e.g., group, consolidated) would not be included in endpoints in the md route
We could use xarray's automatic backend detection rather than writing our own in titiler/xarray/io.py

I also think isolating Zarr usage would simplify the eventual support of the GeoZarr and multiscales specifications.

src/titiler/xarray/tests/fixtures/pyramid.zarr/.zgroup

vincentsarago · 2024-10-31T09:03:07Z

Thanks @maxrjones 🙏

I see what you're saying. The goal of having a single Reader was to handle all the non-COG datasets, so splitting into two separate readers/sets of endpoints would not meet the goal.

This is a very opinionated take, but I think titiler-xarray would be best off with two separate routes, each with its own set of optional dependencies. The first route would be zarr, which would open Zarr and virtual Zarr datasets using xarray.open_zarr. The second route would be md, which would open any dataset readable by xarray.open_dataset.

We can absolutely use xarray.open_zarr instead of xarray.open_dataset here when reading a zarr

We could use xarray's automatic backend detection rather than writing our own in titiler/xarray/io.py

How so? https://github.com/developmentseed/titiler/pull/1013/files#diff-dd6fab5d1e55a1d860ff8bd2190f145f2574100f734d382cee56c48bd7a7f1f5R43-R49 ?

If I follow your thinking, it seems we would need both a titiler.multidim and a titiler.zarr package 🤷

What if we make the dependencies optional? I'm going to open a PR on top of this one to try some things
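
A rough sketch of what optional dependencies could look like at startup, using only the standard library. The route names come from the discussion above, but `ROUTE_REQUIREMENTS`, `is_installed`, and `available_routes` are hypothetical, and the module listed for each route is an assumption for illustration.

```python
from importlib.util import find_spec
from typing import Dict, List, Tuple

# Hypothetical mapping of route name -> top-level modules that route would need.
ROUTE_REQUIREMENTS: Dict[str, Tuple[str, ...]] = {
    "zarr": ("zarr",),
    "md": ("h5netcdf",),
}


def is_installed(module: str) -> bool:
    """True when the top-level `module` can be found in the current environment."""
    return find_spec(module) is not None


def available_routes(requirements: Dict[str, Tuple[str, ...]]) -> List[str]:
    """Return the routes whose required modules are all importable."""
    return [
        route
        for route, modules in requirements.items()
        if all(is_installed(m) for m in modules)
    ]


# Only routes whose readers are installed would be registered on the app.
routes = available_routes(ROUTE_REQUIREMENTS)
```

With this shape, a deployment that installs only the Zarr reader would expose only the zarr route, which is the lighter image described above.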

hrodmn

This is great! The concept of creating pyramids in a zarr store was new to me, then I googled around and found @maxrjones's notebook 😆.

It is great to have the io methods standardized here so we can import them in titiler.cmr and other applications.

src/titiler/xarray/tests/test_factory.py

hrodmn · 2024-10-31T15:05:38Z

src/titiler/xarray/titiler/xarray/io.py

+        else:
+            da = da.isel(time=0)
+
+    assert len(da.dims) in [2, 3], "titiler.xarray can only work with 2D or 3D dataset"


Suggested change

-    assert len(da.dims) in [2, 3], "titiler.xarray can only work with 2D or 3D dataset"
+    if len(da.dims) in [2, 3]:
+        raise ValueError("titiler.xarray can only work with 2D or 3D dataset")

I guess you want this to be if not
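
With the negation applied, the check could look like the sketch below. `check_dims` is a hypothetical helper and a tuple stands in for `da.dims`. Unlike `assert`, which is skipped when Python runs with `-O`, an explicit `raise` always fires.

```python
from typing import Sequence


def check_dims(dims: Sequence[str]) -> None:
    """Raise ValueError unless there are exactly 2 or 3 dimensions."""
    if len(dims) not in (2, 3):
        raise ValueError("titiler.xarray can only work with 2D or 3D dataset")


check_dims(("y", "x"))          # 2D: passes silently
check_dims(("band", "y", "x"))  # 3D: passes silently

# A 4D input is rejected with the message from the original assertion.
try:
    check_dims(("time", "band", "y", "x"))
    error = None
except ValueError as exc:
    error = str(exc)
```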

hrodmn · 2024-10-31T15:12:51Z

src/titiler/xarray/titiler/xarray/io.py

+    if crs == "epsg:4326" and (da.x > 180).any():
+        # Adjust the longitude coordinates to the -180 to 180 range
+        da = da.assign_coords(x=(da.x + 180) % 360 - 180)
+
+        # Sort the dataset by the updated longitude coordinates
+        da = da.sortby(da.x)


I wonder if there are more CRS definitions we would want to apply this fix to. Maybe there is a way to tell if we want to adjust coordinates based on some other CRS properties besides an exact name match.

rasterio doesn't have a CRS.is_geographic method yet (rasterio/rasterio#3218) but once it's available we could check if the CRS is geographic and then run those fixes
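
For reference, the arithmetic of the longitude fix can be checked in plain Python. `wrap_longitudes` is a hypothetical stand-in for the `assign_coords` plus `sortby` pair in the diff, operating on a list instead of a DataArray.

```python
from typing import List


def wrap_longitudes(xs: List[float]) -> List[float]:
    """Map longitudes from the 0..360 convention to -180..180 and sort them."""
    return sorted((x + 180) % 360 - 180 for x in xs)


lons = [0.0, 90.0, 190.0, 359.0]

# Same trigger as `(da.x > 180).any()` in the diff.
needs_wrap = any(x > 180 for x in lons)

wrapped = wrap_longitudes(lons)  # [-170.0, -1.0, 0.0, 90.0]
```

Values already within range are unchanged, except that exactly 180 maps to -180, so the sort step is what restores monotonic coordinates.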

sketch

13351bb

vincentsarago added 2 commits October 30, 2024 13:18

add tests

faaaa3e

add pyramid tests

d9ea7d2

remove multiscale option

80f3350

vincentsarago marked this pull request as ready for review October 30, 2024 15:05

maxrjones reviewed Oct 30, 2024

hrodmn reviewed Oct 31, 2024

Update src/titiler/xarray/tests/test_factory.py

691eeed

Co-authored-by: Henry Rodman
