Optimize idxmin, idxmax with dask #9800

dcherian · 2024-11-19T23:05:17Z

Closes #9425 ("idxmin / idxmax is not parallel friendly")
Tests added
User visible changes (including notable bug fixes) are documented in whats-new.rst
New functions/methods are listed in api.rst

cc @phofl: here we need to index a NumPy array with a dask array (commonly a much larger one) in a sane manner.

We now preserve chunk sizes for the following example:

import numpy as np
import xarray as xr

# create some dummy data and chunk it
x, y, t = 1000, 1000, 57
rang = np.arange(t * x * y)
da = xr.DataArray(
    rang.reshape(t, x, y),
    coords={"time": range(t), "x": range(x), "y": range(y)},
)
da = da.chunk(dict(time=-1, x=256, y=256))
da.idxmin("time")
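
For readers less familiar with the problem, here is a minimal sketch of the underlying operation, using plain `dask.array` rather than xarray. It is not the code in this PR: it only illustrates the general idea that `idxmin` amounts to a positional `argmin` followed by a lookup into an in-memory coordinate, and that doing the lookup block by block keeps the chunking of the index array. The array sizes, the `time_coord` labels, and all variable names are assumptions chosen for illustration.

```python
import numpy as np
import dask.array as dsa

# Small stand-in for the example above: dims (time, x, y), chunked along x and y only.
t, x, y = 5, 8, 8
rng = np.random.default_rng(0)
values = rng.random((t, x, y))
time_coord = np.arange(t) * 10  # hypothetical in-memory NumPy coordinate labels

data = dsa.from_array(values, chunks=(-1, 4, 4))

# Step 1: positional argmin along time; the result is a lazy (x, y) integer array.
indices = data.argmin(axis=0)

# Step 2: look up the coordinate label block by block.
# Each block of `indices` arrives as a NumPy integer array, so ordinary fancy
# indexing into the small NumPy coordinate applies, and the output keeps the
# chunk structure of `indices`.
labels = dsa.map_blocks(
    lambda idx: time_coord[idx],
    indices,
    dtype=time_coord.dtype,
)

result = labels.compute()
```

The point of the sketch is the chunk structure: `labels.chunks` equals `indices.chunks`, so the lookup adds one task per block instead of rechunking or loading the index array into memory.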



dcherian added 3 commits November 19, 2024 16:04

Optimize idxmin, idxmax with dask

eb35563

Closes pydata#9425

use map_blocks instead

271dbec

small edits

33ae848

dcherian added the topic-chunked-arrays Managing different chunked backends, e.g. dask label Nov 20, 2024

dcherian marked this pull request as ready for review November 20, 2024 03:52

dcherian added 2 commits November 19, 2024 20:53

fix typing

28aeea0

try again

e2547f1

headtr1ck reviewed Nov 20, 2024

xarray/tests/test_dataarray.py (outdated, resolved)

dcherian marked this pull request as draft November 20, 2024 17:39

Migrate to DaskIndexingAdapter

9152b6f

dcherian force-pushed the vindex-idxminmax branch from a4ba2bc to 9152b6f Compare November 21, 2024 03:43

dcherian and others added 2 commits November 20, 2024 20:43

Update xarray/tests/test_dataarray.py

2e68d12

Co-authored-by: Michael Niklas <[email protected]>

[pre-commit.ci] auto fixes from pre-commit.com hooks

4431b2e

for more information, see https://pre-commit.ci
