Chunking support #1783
Thanks for opening this issue. There has been some work on adding support for the zarr storage format within ASDF. This is implemented via the asdf-zarr extension (https://github.com/asdf-format/asdf-zarr). It's a new package, so please let me know if it's something you plan to use "in production" so we can give it another review. Also feel free to give it a try and open issues if you find anything. The extension offers a few options:
The use of zarr also opens up a second place where compression can be controlled (which can get a bit confusing).
@braingram Nice! We are currently discussing storage formats, and both ASDF and Zarr are contenders that have various advantages and disadvantages. On the surface, using Zarr chunking with ASDF single-file storage seems like an excellent choice. I will have a look.
When a large ndarray is stored as a single compressed binary block, the whole block (or at least its beginning) needs to be read and decompressed even when only a small subarray is read. "Chunking" remedies this: instead of storing an ndarray as a single binary block, it is stored as a set of smaller blocks that are compressed and stored independently.
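To make the idea concrete, here is a minimal sketch in plain Python. It is not ASDF or zarr code: the helper names `write_chunked` and `read_slice`, the chunk length, and the use of zlib on a 1-D array of float64 values are all assumptions chosen for illustration. It only shows that reading a small slice touches just the chunks that overlap it.

```python
import zlib
from array import array

CHUNK_LEN = 1024  # elements per chunk; an arbitrary illustrative choice


def write_chunked(values, chunk_len=CHUNK_LEN):
    """Compress each run of `chunk_len` float64 values independently."""
    chunks = []
    for start in range(0, len(values), chunk_len):
        raw = array("d", values[start:start + chunk_len]).tobytes()
        chunks.append(zlib.compress(raw))
    return chunks


def read_slice(chunks, start, stop, chunk_len=CHUNK_LEN):
    """Return (values[start:stop], number of chunks decompressed)."""
    if start >= stop:
        return [], 0
    out = array("d")
    first = start // chunk_len
    last = (stop - 1) // chunk_len
    for idx in range(first, last + 1):
        # Only chunks overlapping [start, stop) are decompressed.
        block = array("d")
        block.frombytes(zlib.decompress(chunks[idx]))
        lo = max(start - idx * chunk_len, 0)
        hi = min(stop - idx * chunk_len, len(block))
        out.extend(block[lo:hi])
    return list(out), last - first + 1


data = [float(i) for i in range(10_000)]
chunks = write_chunked(data)
values, touched = read_slice(chunks, 2040, 2060)
```

Here the 20 requested values straddle a chunk boundary, so two of the ten chunks are decompressed and the other eight are never touched. With a single compressed block, everything up to element 2060 would have to be decompressed first.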
Are there plans to support this? Can this be implemented as an extension?
One simple approach would be to introduce a new YAML tag
core/chunked-ndarray
that consists of a YAML map that maps offsets to ndarrays.
Has there been any work in this direction?
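As a rough illustration of the offset-to-ndarray map, the structure could be modelled in memory like this. This is only a sketch under assumptions: the `shape` and `chunks` keys, the 2x2 chunk size, and the helpers `assemble` and `get_item` are hypothetical and are not part of any ASDF schema.

```python
# Hypothetical model of a "chunked-ndarray": the overall shape plus a map
# from the offset of each chunk's first element to the chunk's data.
chunked = {
    "shape": (4, 4),
    "chunks": {
        (0, 0): [[0, 1], [4, 5]],
        (0, 2): [[2, 3], [6, 7]],
        (2, 0): [[8, 9], [12, 13]],
        (2, 2): [[10, 11], [14, 15]],
    },
}


def assemble(chunked):
    """Rebuild the full 2-D array by pasting each chunk at its offset."""
    rows, cols = chunked["shape"]
    full = [[None] * cols for _ in range(rows)]
    for (r0, c0), block in chunked["chunks"].items():
        for dr, row in enumerate(block):
            for dc, value in enumerate(row):
                full[r0 + dr][c0 + dc] = value
    return full


def get_item(chunked, r, c):
    """Read one element by locating the single chunk that contains it."""
    for (r0, c0), block in chunked["chunks"].items():
        if r0 <= r < r0 + len(block) and c0 <= c < c0 + len(block[0]):
            return block[r - r0][c - c0]
    raise IndexError((r, c))


full = assemble(chunked)
item = get_item(chunked, 2, 3)
```

In a real file each map value would be an ordinary ndarray backed by its own independently compressed binary block, so `get_item` would only need to load one block.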