Problem Statement
Float-to-integer quantization with scale/offset transformation is a common pattern for reducing storage size while maintaining precision in Earth observation and climate data. However, there is currently no stable, standardized Zarr codec for this operation, leading to ecosystem fragmentation and interoperability issues.
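For readers unfamiliar with the pattern, here is a minimal pure-Python sketch of the round trip. It assumes the numcodecs `FixedScaleOffset` convention (encode as `round((x - offset) * scale)`, decode as `q / scale + offset`) and an int16 storage type. The function names, the sample values, and the explicit range check are illustrative choices for this sketch, not part of any existing codec.

```python
def encode(values, offset, scale, lo=-32768, hi=32767):
    """Quantize floats to integers as round((x - offset) * scale).

    lo/hi default to the int16 range (an assumption of this sketch).
    """
    encoded = []
    for x in values:
        q = round((x - offset) * scale)
        if not lo <= q <= hi:
            # This sketch rejects out-of-range values instead of wrapping.
            raise OverflowError(f"{x} does not fit the target integer range")
        encoded.append(q)
    return encoded


def decode(quantized, offset, scale):
    """Invert the quantization: q / scale + offset."""
    return [q / scale + offset for q in quantized]


# Illustrative reflectance-like values, stored with 1e-4 resolution.
values = [0.0, 0.1234, 0.56789, 1.0]
offset, scale = 0.0, 10000

encoded = encode(values, offset, scale)
decoded = decode(encoded, offset, scale)

# The round-trip error is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(values, decoded))
print(encoded, max_error <= 0.5 / scale + 1e-12)
```

The storage saving comes from writing 2-byte integers instead of 4- or 8-byte floats, at the cost of a bounded error of `0.5 / scale` per value.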
Current Fragmented State
Multiple incompatible implementations exist:
- Zarr V2/V3: `scale_factor` and `add_offset` attributes (inherited from CF/netCDF conventions, handled implicitly by xarray)
- Zarr V3: `numcodecs.fixedscaleoffset` codec (implemented in Python numcodecs, but not consistently available in JavaScript libraries like zarrita.js or numcodecs.js)
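The two approaches also parameterize the same affine transform differently, which is one reason a shared definition matters. The sketch below assumes the CF convention decodes as `packed * scale_factor + add_offset` and the numcodecs codec decodes as `packed / scale + offset`, so `scale` is the reciprocal of `scale_factor`. The helper names and the sample values are hypothetical.

```python
import math


def cf_decode(packed, scale_factor, add_offset):
    """CF/netCDF-style unpacking: multiply by scale_factor, then add add_offset."""
    return packed * scale_factor + add_offset


def fso_decode(packed, scale, offset):
    """numcodecs FixedScaleOffset-style decoding: divide by scale, then add offset."""
    return packed / scale + offset


def cf_to_fso(scale_factor, add_offset):
    """Translate CF attributes to FixedScaleOffset parameters (hypothetical helper)."""
    return {"scale": 1.0 / scale_factor, "offset": add_offset}


# Illustrative values only.
packed = 1234
cf_attrs = {"scale_factor": 0.0001, "add_offset": -0.1}
fso_params = cf_to_fso(**cf_attrs)

same = math.isclose(
    cf_decode(packed, **cf_attrs),
    fso_decode(packed, **fso_params),
    rel_tol=1e-9,
    abs_tol=1e-12,
)
print(fso_params, same)
```

A reader that understands only one of the two conventions has to perform this translation itself, and passing `scale_factor` where `scale` is expected silently produces wrong values rather than an error.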
Real-World Impact
This fragmentation prevents data from being accessible across the ecosystem. During development of the EOPF Sentinel Zarr Explorer for ESA, we discovered that Sentinel-2 data encoded with scale/offset at native resolutions (10m, 20m, 60m) cannot be visualized in web contexts using OpenLayers because the codec is unavailable in JavaScript implementations.
Request
The Zarr community should converge on a stable, registered extension for the fixedscaleoffset transformation that:
- Works consistently across Zarr V2 and V3 and is implementation-independent
- Has reference implementations in both Python and JavaScript
- Defines clear semantics for the transformation parameters (offset, scale, dtype, astype), and settles the open question of whether it is registered as a codec or a filter
- Follows the extension registration process outlined in this repository
Related Discussions
- numcodecs.js issue: "Support for fixedscaleoffset" (manzt/numcodecs.js#49)
- zarrita.js PR (stalled): "Implement fixedscaleoffset codec" (manzt/zarrita.js#312)
- EOPF data-model conversion issue: "FixedScaleOffset codec not preserved during zarr v2 to v3 conversion" (EOPF-Explorer/data-model#106)
cc @manzt @d-v-b @ahocevar @vincentsarago @vdumoul @maxrjones @abarciauskas-bgse @j08lue