Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using gcps with rio.reproject #339

Closed
oarcher opened this issue May 19, 2021 · 25 comments
Closed

Using gcps with rio.reproject #339

oarcher opened this issue May 19, 2021 · 25 comments
Labels
proposal Idea for a new feature.

Comments

@oarcher
Copy link

oarcher commented May 19, 2021

Hi, this is my first question here:

I'd like to be able to use gcps for reproject, like rasterio is able to do.

My goal is to plot a sentinel-1 image on a map, with geoviews.

My first guess was to use:

rasterize(gv.load_tiff(filename))

But this doesn't works, because the tiff has no transform. However, it has some gcps, and it is correctly mapped in qgis.

I was not able to find any solution to do the reprojection with rioxarray, so I've ended in a pure rasterio solution, converting to rioxarray at the end:

# 100% rasterio
src = rasterio.open(filename)
gcps, src_crs = src.get_gcps()
destination = np.zeros((5000,5000)) 
dest, transform = rasterio.warp.reproject(src.read(1), destination, gcps=gcps, src_crs=src_crs, dst_crs=src_crs)

# to rioxarray
dn = xr.DataArray(dest, dims=['y','x'])
dn = dn.rio.write_crs(src_crs)
dn = dn.rio.write_transform(transform)
dn = dn.assign_coords(x=(transform * (dn.x, dn.y[0]))[0] , y=(transform * (dn.x[0], dn.y))[1]) 

# plot with geoviews
rasterize(gv.Image(dn))

My input image is pretty big (25000 * 15000), so ideally, the reprojection should be done with dask (#119)

@oarcher oarcher added the proposal Idea for a new feature. label May 19, 2021
@snowman2
Copy link
Member

snowman2 commented May 20, 2021

gcps support could definitely be improved.

Using the transform.from_gcps method in open_rasterio might be a path forward on that.

@snowman2 snowman2 changed the title using gcps for rioxarray.raster_array.RasterArray.reproject Using gcps with rio.reproject May 20, 2021
@snowman2
Copy link
Member

@mraspaud also had interest in using gcps with rioxarray.

@mraspaud
Copy link
Contributor

Yes I do, also for sentinel 1 data

@mraspaud
Copy link
Contributor

To be clear, I need the gcps both for reading and writing. The software I'm developing for processing the Sentinel1 SAR data (satpy) needs to be able to write the gcps to the resulting geotiff file. At the moment I we do that by passing the gcps to the rasterio.open call done for writing.

@snowman2
Copy link
Member

@oarcher the image link gives a 404.

@snowman2
Copy link
Member

needs to be able to write the gcps to the resulting geotiff file

Is there a reason you need the gcps versus the transform? I ask because it would be relatively simple to get the transform/crs from the gcps and then use that within rioxarray.

Portential method for rioxarray to get transform from gcps in the file in open_rasterio:

import rasterio

file_path = "s1a-iw-grd-vh-20160121t001321-20160121t001346-009585-00df26-002.tiff"
with rasterio.open(file_path) as rds:
    gcps, crs = rds.get_gcps()
    transform = rasterio.transform.from_gcps(gcps)

Current method for writing transform from gcps:

import rioxarray

rds = ...
gcps = [...]
transform = rasterio.transform.from_gcps(gcps)
rds.rio.write_transform(transform, inplace=True)

@oarcher
Copy link
Author

oarcher commented May 20, 2021

@snowman2 , I've found that using rasterio.transform.from_gcps is not very accurate on big image with many gcps.

For example, we can have a 600 meters errors when applying the transform to the same gcps (for a 10m pixel, and a image 15000*25000). Probably because the transorm is affine, and very small error on coefficients gave large error on large image.

Using bi-linear interpolation from the gcps give more accurate results.

@oarcher
Copy link
Author

oarcher commented May 20, 2021

@oarcher the image link gives a 404.

oops, fixed.

@snowman2
Copy link
Member

That is good to know. I wonder how to accurately generate coordinates for a data array from gcps?

@scottyhq
Copy link
Contributor

Generally gdalwarp with the thin plate spline option (-tps) will give better geocoding results for Sentinel1 images (but still approximate). I don't think -tps is exposed in rasterio... Another workaround is gdalwarp -of VRT tps -srcnodata 0 -dstnodata 0 INPUT.grd INPUT_wgs84.vrt and then open the VRT with rioxarray.

It would be nice though to have some sort of from_gcps keyword or convience function to simplify the process for GCP fit -> rioxarray transform.

@mraspaud
Copy link
Contributor

IIRC, the Sentinel1 documentation recommends bilinear interpolation to expand the GCPs to full lon/lat arrays. We do this in satpy using a daskified version of scipy's interp2d function right now.

However, Sentinel 1 is not the only use case for GCPs and the GDAL documentation says:

The GDAL data model does not imply a transformation mechanism that must be generated from the GCPs … this is left to the application. However 1st to 5th order polynomials are common.

( https://gdal.org/user/raster_data_model.html#gcps )

So in my opinion, rioxarray should expose the GCPs but leave the interpolation scheme to the application. How to do the interpolation (what coordinate system, which order, which interpolation) is highly dependent both on the data at hand and the application the user has, so we would be opening a can of worms here if this was to be implemented in rioxarray. Just my 2 cents.

@snowman2
Copy link
Member

Have y'all tried WarpedVRT for resampling datasets with gcps?

from rasterio.enums import Resampling
from rasterio.vrt import WarpedVRT
import rasterio

with rasterio.open("m_3907617_ne_18_1_20130924.tif") as src:
    with WarpedVRT(src, crs=src.crs, resampling=Resampling.bilinear) as vrt:
        rds = rioxarray.open_rasterio(vrt)

@oarcher this should enable loading in the image lazily with dask. Here is an example: ref.

@snowman2
Copy link
Member

Nevermind. Seems like WarpedVRT doesn't support gcps. Might be worthwhile to open up an issue with rasterio about WarpedVRT and gcps/tps suppport. Then, we can update rioxarray to the changes made by rasterio.

@mraspaud
Copy link
Contributor

@snowman2 would it be possible then to also get access to untransformed gcps from rioxarray?

@snowman2
Copy link
Member

snowman2 commented May 25, 2021

would it be possible then to also get access to untransformed gcps from rioxarray?

Yes, that will also be something I think would be important. Changes that will likely be needed:

  • rio.write_gcps() - write an array of GCPS to the grid_mapping coordinate attrs. I am thinking json would be the best way to go.
  • rio.get_gcps() - read GCPS from json in the grid_mapping coordinate attrs and convert to rasterio GCP objects.
  • rio.reproject() - pass in GCPS automatically where applicable.
  • open_rasterio() - write GCPS and don't generate coordinates or transform.

@mraspaud
Copy link
Contributor

If I am not mistaken, GCPs are directly linked to given lines and columns in the raster, and as such are a subset of a full array of say lons and lats. While geojson would certainly work, I'm wondering if it would be an idea to express the GCPs as lons, lats, and z coordinates of the original raster array somehow? In turn they would have coordinates that are a subset of lines and columns. Or is that too complex?

@snowman2
Copy link
Member

I'm wondering if it would be an idea to express the GCPs as lons, lats, and z coordinates of the original raster array somehow? In turn they would have coordinates that are a subset of lines and columns. Or is that too complex?

What benefit do you see storing GCPs that way?

Original GCPs:

[GroundControlPoint(row=0.0, col=0.0, x=-1.3712442422120068, y=48.818122749035496, z=1.9444833537563682, id='1', info=''), GroundControlPoint(...]

Coordinate Structure

Dimensions: ( gcp: ...)
Coordinates:
  gcp_x (gcp): ...
  gcp_y (gcp): ...
  gcp_z (gcp): ...
  gcp_row (gcp): ...
  gcp_col (gcp): ...
  gcp_id (gcp): ...
  gcp_info (gcp): ...

JSON Structure

Grid mapping coordinate:

Attributes:
    crs_wkt: ....
    gcps: '[{"row": 0.0, "col": 0.0, "x": -1.3712442422120068, "y": 48.818122749035496, "z": 1.9444833537563682, "id": "1","info": ""}, {...]'

@scottyhq
Copy link
Contributor

Might be worthwhile to open up an issue with rasterio about WarpedVRT and gcps/tps suppport. Then, we can update rioxarray to the changes made by rasterio.

I can open up an issue, should have done it while back while exploring this very use case ;)

I realize GCPs show up for all kinds of datasets, but I wanted to loop in @alexamici and @aurghs who have been putting thought into about how to best represent GCPs in xarray for sentinel-1: see https://github.com/bopen/xarray-sentinel#open-gcp-dataset and bopen/xarray-sentinel#54

@mraspaud
Copy link
Contributor

What benefit do you see storing GCPs that way?

I was thinking making is more xarray-ish. Indeed the geojson is more compact, but I was thinking the gcps are really an auxiliary coordinate. However, I'm not even sure xarray would let me do that now...

@snowman2
Copy link
Member

snowman2 commented Jun 3, 2021

Related: rasterio/rasterio#2193

@snowman2
Copy link
Member

snowman2 commented Jun 3, 2021

With #351:

from rasterio.enums import Resampling
from rasterio.vrt import WarpedVRT
import rasterio
import rioxarray

with rasterio.open("s1a-iw-grd-vh-20160121t001321-20160121t001346-009585-00df26-002.tiff") as src:
    gcps, crs = rds.get_gcps()
    with WarpedVRT(src, src_crs=crs, resampling=Resampling.bilinear) as vrt:
        rds = rioxarray.open_rasterio(vrt)

Once this is added to rasterio:

with rasterio.open("s1a-iw-grd-vh-20160121t001321-20160121t001346-009585-00df26-002.tiff") as src:
    with WarpedVRT(src, resampling=Resampling.bilinear) as vrt:
        rds = rioxarray.open_rasterio(vrt)

# specify dimensions
with rasterio.open("s1a-iw-grd-vh-20160121t001321-20160121t001346-009585-00df26-002.tiff") as src:
    with WarpedVRT(src, width=5000, height=5000, resampling=Resampling.bilinear) as vrt:
        rds = rioxarray.open_rasterio(vrt)

@snowman2
Copy link
Member

snowman2 commented Jun 3, 2021

This mapbox issue ref will enable the TPS option gdal docs:

SRC_METHOD: GCP_TPS

@snowman2
Copy link
Member

snowman2 commented Jul 1, 2021

@mraspaud did you have any more thoughts on how to store the GCPS into an xarray object?

@snowman2
Copy link
Member

snowman2 commented Jul 6, 2021

@oarcher see #370

@snowman2
Copy link
Member

snowman2 commented Jul 7, 2021

With #370, considering this issue addressed for now. GCP storage discussion moved to #376

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proposal Idea for a new feature.
Projects
None yet
Development

No branches or pull requests

4 participants