Use ContextManager to support multi S3 object access #626
emmanuelmathot asked this question in Q&A (Unanswered, 1 comment)
-
@emmanuelmathot in theory you could pass a session to rasterio.Env:

    import rasterio
    from rasterio.session import AWSSession

    with rasterio.Env(
        session=AWSSession(
            aws_access_key_id="MyDevseedId",
            aws_secret_access_key="MyDevseedKey",
        )
    ):
        ...  # open and read the dataset here

so you could make a custom version of _get_asset_info (rio-tiler/rio_tiler/io/stac.py, lines 280 to 313 in 584ecdb):

    def _get_asset_info(self, asset: str) -> AssetInfo:
        """Validate asset names and return the asset's info.

        Args:
            asset (str): STAC asset name.

        Returns:
            AssetInfo: STAC asset url, metadata and GDAL env options.

        """
        if asset not in self.assets:
            raise InvalidAssetName(f"{asset} is not valid")

        asset_info = self.item.assets[asset]
        extras = asset_info.extra_fields

        info = AssetInfo(
            url=asset_info.get_absolute_href(),
            metadata=extras,
            env={},
        )

        # Hint GDAL about the header size so it reads it in one request
        if head := extras.get("file:header_size"):
            info["env"].update({"GDAL_INGESTED_BYTES_AT_OPEN": head})

        # Reuse pre-computed statistics only when every band provides them
        if bands := extras.get("raster:bands"):
            stats = [
                (b["statistics"]["minimum"], b["statistics"]["maximum"])
                for b in bands
                if {"minimum", "maximum"}.issubset(b.get("statistics", {}))
            ]
            if len(stats) == len(bands):
                info["dataset_statistics"] = stats

        # Per-asset GDAL option derived from the storage extension
        if extras.get("storage:requester_pays", None):
            info["env"].update({"AWS_REQUEST_PAYER": "requester"})

        return info
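A minimal sketch of how such an override could add per-bucket options, written against a plain dict so that it runs without rio-tiler installed. The helper name add_s3_env, the ENDPOINTS_BY_BUCKET mapping, and the bucket names and endpoints are all hypothetical; only the GDAL option names (AWS_S3_ENDPOINT, AWS_VIRTUAL_HOSTING) are real. In a custom reader, the idea would be to build the asset info as above and pass it through a helper like this before returning it.

```python
from urllib.parse import urlparse

# Hypothetical mapping: bucket name -> GDAL /vsis3/ config options.
ENDPOINTS_BY_BUCKET = {
    "bucket-a": {"AWS_S3_ENDPOINT": "s3.provider-a.example.com"},
    "bucket-b": {
        "AWS_S3_ENDPOINT": "s3.provider-b.example.com",
        "AWS_VIRTUAL_HOSTING": "FALSE",
    },
}


def add_s3_env(info: dict) -> dict:
    """Return a copy of an AssetInfo-like dict with per-bucket options merged into "env"."""
    parsed = urlparse(info["url"])
    # Copy the existing env so the caller's dict is not mutated.
    env = dict(info.get("env") or {})
    if parsed.scheme == "s3":
        # For s3:// URLs the bucket name is the network location.
        env.update(ENDPOINTS_BY_BUCKET.get(parsed.netloc, {}))
    return {**info, "env": env}
```

Options already present in env (for example AWS_REQUEST_PAYER) are kept, and non-S3 URLs pass through with their env unchanged.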
-
We have use cases where we need to access data on multiple S3 object stores. In the past, I added the possibility to configure a custom S3 endpoint for STAC (#394), but now we need to configure it dynamically, according to the S3 URL of each asset.
Typically, you could have a STAC Item with multiple assets stored with different S3 providers.
Each asset would need a different S3 configuration, set through the proper environment variables for the GDAL /vsis3/ driver, so that the image can be read. One option would be to have a set of URL patterns, each with its specific access config.
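A rough sketch of the URL-pattern idea, using only the standard library. The patterns, endpoints, and the names S3_CONFIGS and gdal_env_for are made up for illustration; the option names are GDAL config options for /vsis3/. The first matching pattern wins, so more specific patterns should come first.

```python
from fnmatch import fnmatchcase

# Hypothetical ordered rules: (glob pattern on the asset URL, GDAL options).
S3_CONFIGS = [
    ("s3://provider-a-*/*", {"AWS_S3_ENDPOINT": "s3.provider-a.example.com"}),
    (
        "s3://provider-b-*/*",
        {"AWS_S3_ENDPOINT": "s3.provider-b.example.com", "AWS_HTTPS": "YES"},
    ),
    ("s3://*", {}),  # fallback: default AWS configuration
]


def gdal_env_for(url: str) -> dict:
    """Return the GDAL options of the first pattern matching `url`, or {}."""
    for pattern, options in S3_CONFIGS:
        if fnmatchcase(url, pattern):
            return dict(options)  # copy so callers cannot alter the rules
    return {}
```

The resulting dict could then be passed as keyword arguments to rasterio.Env around the read.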
We could also leverage the options set through the storage STAC extension.
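As an illustration of the storage extension route, the sketch below maps extension fields found on an asset to GDAL options. It assumes the v1 field names storage:requester_pays and storage:region; the function name env_from_storage_extension is hypothetical and not part of rio-tiler.

```python
def env_from_storage_extension(extra_fields: dict) -> dict:
    """Translate STAC storage extension fields (v1 names assumed) into GDAL options."""
    env = {}
    # Requester-pays buckets need the payer to be declared on each request.
    if extra_fields.get("storage:requester_pays"):
        env["AWS_REQUEST_PAYER"] = "requester"
    # Region of the bucket, when the asset declares one.
    if region := extra_fields.get("storage:region"):
        env["AWS_REGION"] = region
    return env
```

A custom endpoint is not covered here, since it would have to come from some other field or from a lookup table.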
I saw that rio-tiler uses contextlib, and I was wondering whether there is a good entry point to set the GDAL environment variables dynamically at a low level, so that the config applies not only to STAC assets (rio-tiler/rio_tiler/io/base.py, line 497 in 584ecdb) but also to any dataset URL (rio-tiler/rio_tiler/io/rasterio.py, line 95 in 584ecdb).
Please let me know if this is the right path for such a functionality and, if so, how to inject environment variables for GDAL in this context.
Thank you.