Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for reading and writing tiff gcps #460

Merged
merged 15 commits into from
Feb 9, 2022

Conversation

mraspaud
Copy link
Contributor

@mraspaud mraspaud commented Feb 4, 2022

This PR adds support for reading and writing tiff ground control points

@mraspaud
Copy link
Contributor Author

mraspaud commented Feb 4, 2022

So, one thing that I don't really know how to handle is that the gcps attribute from rasterio is actually a tuple of (GCPs, CRS), where CRS is the CRS of the GCPs obviously. Now, I see there is another integration test where gcps are used, and they are not in EPGS:4326 (WGS84) (My Sentinel 1 test file has gcps in wgs84).
Problem is that the geojson spec expects coordinates to be in WGS84 and does not allow passing a custom CRS attribute anymore, see here https://datatracker.ietf.org/doc/html/rfc7946#section-4
My approach right now is to set the CRS as a separate attribute called gcp_crs, because I'm imagining it could collide with the spatial_ref coord... what do you think?

@mraspaud
Copy link
Contributor Author

mraspaud commented Feb 4, 2022

I'm not understanding why tests are failing... they pass locally and I didn't touch anything there afaics

@snowman2
Copy link
Member

snowman2 commented Feb 4, 2022

My approach right now is to set the CRS as a separate attribute called gcp_crs, because I'm imagining it could collide with the spatial_ref coord... what do you think?

I was actually thinking about writing the GCP attributes in the spatial_ref coord so that it persists better. From what I can tell, when there are GCPs, there is no CRS of the raster. So, it shouldn't conflict. Does this make sense?

@snowman2
Copy link
Member

snowman2 commented Feb 4, 2022

The failures are unrelated:

ERROR test/integration/test_integration__io.py::test_cint16_dtype[None] - Typ...
ERROR test/integration/test_integration__io.py::test_cint16_dtype[complex_int16]
ERROR test/integration/test_integration__io.py::test_cint16_dtype_nodata - Ty...
ERROR test/integration/test_integration__io.py::test_cint16_dtype_masked[None]
ERROR test/integration/test_integration__io.py::test_cint16_dtype_masked[complex_int16]
ERROR test/integration/test_integration__io.py::test_cint16_promote_dtype - T...

Seems some dependency update broke things. I wouldn't worry about those for now.

@mraspaud
Copy link
Contributor Author

mraspaud commented Feb 4, 2022

I was actually thinking about writing the GCP attributes in the spatial_ref coord so that it persists better. From what I can tell, when there are GCPs, there is no CRS of the raster. So, it shouldn't conflict. Does this make sense?

I think... 😄 I'm just not sure what to do of the x and y coord arrays, they will be totally unrelated to the gcp CRS

@snowman2
Copy link
Member

snowman2 commented Feb 4, 2022

I'm just not sure what to do of the x and y coord arrays, they will be totally unrelated to the gcp CRS

I think it would be a good idea to not load them if in there are GCPs. This can be accomplished by setting parse_coordinates=False inside of open_rasterio if GCPs are found.

@mraspaud
Copy link
Contributor Author

mraspaud commented Feb 4, 2022

Ok, I'm confused: to implement rio.get_gcps() , I need to get access to the underlying rasterio object, but the XRasterBase class doesn't have access to it, right? so should I have the untouched gcps also as an attribute to the dataarray?

@snowman2
Copy link
Member

snowman2 commented Feb 4, 2022

Ok, I'm confused: to implement rio.get_gcps() , I need to get access to the underlying rasterio object, but the XRasterBase class doesn't have access to it, right? so should I have the untouched gcps also as an attribute to the dataarray?

This is what I am thinking:

  • rio.write_gcps should perform _convert_gcps_to_geojson(gcps) and add it as attributes on the spatial_ref coordinate.
  • rio.get_gcps should perform the inverse of _convert_gcps_to_geojson(gcps) from the attributes on the spatial_ref coordinate and return List[GroundControlPoint] to the user.

Side note: rio._manager.acquire() will give you access to the underlying rasterio dataset reader. However, this is only there right after opening the file and can be easily lost after performing xarray operations.

@mraspaud
Copy link
Contributor Author

mraspaud commented Feb 4, 2022

Ooooh, ok! I hadn't understood that write_transform wasn't actually writing to disk...

@snowman2
Copy link
Member

snowman2 commented Feb 4, 2022

With #461 merged you should be able to rebase from latest and have more tests passing.

@mraspaud
Copy link
Contributor Author

mraspaud commented Feb 7, 2022

@snowman2 Ok, I think the writing of gcps with rasterio is now implemented. So that leaves the reprojection part, which I have no real clue how to do...

@snowman2
Copy link
Member

snowman2 commented Feb 7, 2022

So that leaves the reprojection part, which I have no real clue how to do...

It has a gcps kwarg: https://rasterio.readthedocs.io/en/latest/api/rasterio.warp.html#rasterio.warp.reproject

Looks like you can just pass that in here:

rasterio.warp.reproject(

@snowman2
Copy link
Member

snowman2 commented Feb 7, 2022

This may also have to be accounted for ref

@snowman2
Copy link
Member

snowman2 commented Feb 7, 2022

Related: #370

@mraspaud mraspaud marked this pull request as ready for review February 7, 2022 17:40
@mraspaud
Copy link
Contributor Author

mraspaud commented Feb 7, 2022

Thanks a lot for the pointers! I think I got the reprojection to work now too.

@snowman2
Copy link
Member

snowman2 commented Feb 7, 2022

Perfect, I think that the initial logic looks pretty good. The next step is to make the CRS coordinate more generic as spatial_ref is not guaranteed to be the name.

  • Reading: for rio.get_gcps you will need to use rio.grid_mapping instead of spatial_ref to lookup. An example is here. self._obj.coords[self.grid_mapping] instead of self._obj.spatial_ref.
  • Writing: for rio.write_gcps you will need to add the grid_mapping_name kwarg. An example implementation is here.

@mraspaud
Copy link
Contributor Author

mraspaud commented Feb 8, 2022

Fixed! I see my changes are breaking the linting runs though, should I move some things out to a new module?

rioxarray/rioxarray.py Outdated Show resolved Hide resolved
@snowman2
Copy link
Member

snowman2 commented Feb 8, 2022

should I move some things out to a new module?

Hmmm, it is getting long, but we're on the boundary here. For now, you can adjust the line length in the pylintrc file for now just enough to pass. If you feel motivated to think about how to better organized things, you are welcome to.

@mraspaud
Copy link
Contributor Author

mraspaud commented Feb 8, 2022

should I move some things out to a new module?

Hmmm, it is getting long, but we're on the boundary here. For now, you can adjust the line length in the pylintrc file for now just enough to pass. If you feel motivated to think about how to better organized things, you are welcome to.

I refactored a few lines, let's hope it helped

rioxarray/rioxarray.py Outdated Show resolved Hide resolved
rioxarray/rioxarray.py Outdated Show resolved Hide resolved
@snowman2 snowman2 requested a review from justingruca February 8, 2022 20:23
rioxarray/rioxarray.py Outdated Show resolved Hide resolved
@snowman2
Copy link
Member

snowman2 commented Feb 9, 2022

Thanks for your patience and work on this @mraspaud 👍. Would you like to squash your commits or do you mind if I do it on merge?

@mraspaud
Copy link
Contributor Author

mraspaud commented Feb 9, 2022

You can do it on the merge, no problem. But I see a bunch of tests failing?

@mraspaud
Copy link
Contributor Author

mraspaud commented Feb 9, 2022

Strange thing is that tests are passing locally... do you know what this is about @snowman2 ?

@snowman2
Copy link
Member

snowman2 commented Feb 9, 2022

But I see a bunch of tests failing?

They should pass if you rebase from the master branch.

@snowman2 snowman2 merged commit abb66c2 into corteva:master Feb 9, 2022
@snowman2
Copy link
Member

snowman2 commented Feb 9, 2022

Thanks @mraspaud 👍

@snowman2
Copy link
Member

snowman2 commented Feb 9, 2022

Hmmm, I must have mixed up the failures with other failures. Looks like there may be some other things to check in previous versions of rasterio. I will open up a new issue to investigate.

@snowman2
Copy link
Member

snowman2 commented Feb 9, 2022

#464

@mraspaud mraspaud deleted the feature-add-gcps branch February 9, 2022 17:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: Storing GCPs in xarray
3 participants