Sampling #313
Comments
The sampling process could also allow "extracting" all valid points by setting a parameter. This would cover the case where we select some polygons (filter_spatial) and want to keep all the pixels inside them for training. Some questions that come to mind:
The sampling will be performed using a new process called polygon_to_points:
And lastly, @clausmichele, I'm a bit confused about: "takes as input [...] and returns the same structure with point coordinates". This sounds like you'd extract points along the polygon borders, but I assume you meant that it samples from the whole polygon interior, right?
Sorry for not being clear: by "same structure" I was referring to the GeoJSON or vector-cube structure/data format, not the content itself.
I'm not sure, actually. The percentage could be tricky, since I would not know how to define the maximum number of points (100%) that could be extracted from a vector layer. In theory, there could be infinitely many!
Thanks for the clarification, @clausmichele. I thought about basing percentages around the pixel centers. You should have a known number of pixels and could create a list of points for the pixel centers, right?
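To make the pixel-center idea concrete, here is a minimal sketch in plain Python. It is not part of any openEO implementation: the function names, the lower-left grid origin, and the square pixel size are all assumptions made for illustration. It enumerates the pixel centers that fall inside a polygon ring, which gives a finite "100%", and then draws a percentage of them.

```python
import math
import random


def point_in_ring(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def pixel_centers_in_polygon(ring, origin_x, origin_y, resolution):
    """List the centers of grid pixels that fall inside the ring.

    Assumption: the grid starts at (origin_x, origin_y) in the lower-left
    corner and uses square pixels with edge length `resolution`.
    """
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    col_min = math.floor((min(xs) - origin_x) / resolution)
    col_max = math.ceil((max(xs) - origin_x) / resolution)
    row_min = math.floor((min(ys) - origin_y) / resolution)
    row_max = math.ceil((max(ys) - origin_y) / resolution)
    centers = []
    for row in range(row_min, row_max):
        for col in range(col_min, col_max):
            cx = origin_x + (col + 0.5) * resolution
            cy = origin_y + (row + 0.5) * resolution
            if point_in_ring(cx, cy, ring):
                centers.append((cx, cy))
    return centers


def sample_percentage(centers, percentage, seed=None):
    """Draw `percentage` percent of the pixel centers without replacement."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    rng = random.Random(seed)
    k = round(len(centers) * percentage / 100)
    return rng.sample(centers, k)
```

With this approach, 100% simply returns every pixel center inside the polygon, which would also cover the "keep all the pixels" case mentioned earlier in the thread.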
From the last meeting I understood that the process should take geometries (GeoJSON) as input and also output geometries to be used in aggregate_spatial. That's why I initially called it "polygon_to_points". The output would be a list of points which can be used in aggregate_spatial to create a vector-cube for training the ML model. @mattia6690 @jdries can you comment on this? Do I remember correctly?
Ah, okay! Then I misunderstood or remembered incorrectly. I can change that. It makes the process simpler, since it only creates points from a shape without directly combining them with the values from the raster cube. That's also fine for me, I just didn't fully understand the idea then. Stay tuned for an updated version... :-)
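As a rough illustration of the "geometries in, geometries out" idea, the sketch below takes a GeoJSON FeatureCollection of polygons and returns a FeatureCollection of random points inside them. It is only a sketch under stated assumptions: `sample_points_in_polygons` is a hypothetical helper, not the actual openEO process; it handles only `Polygon` geometries, uses rejection sampling in the bounding box, and copies the source properties to each point (a design choice assumed here so that labels survive for training).

```python
import random


def _in_ring(x, y, ring):
    """Ray-casting point-in-ring test for a GeoJSON linear ring."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i][:2]
        x2, y2 = ring[(i + 1) % n][:2]
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside


def _in_polygon(x, y, rings):
    """First ring is the exterior, any further rings are holes."""
    if not _in_ring(x, y, rings[0]):
        return False
    return not any(_in_ring(x, y, hole) for hole in rings[1:])


def sample_points_in_polygons(feature_collection, n_per_polygon,
                              seed=None, max_tries=10000):
    """Hypothetical helper: return a FeatureCollection of Point features.

    Draws up to `n_per_polygon` random points inside each Polygon feature
    by rejection sampling within the polygon's bounding box.
    """
    rng = random.Random(seed)
    points = []
    for feature in feature_collection["features"]:
        geometry = feature["geometry"]
        if geometry["type"] != "Polygon":
            continue  # assumption: this sketch ignores other geometry types
        rings = geometry["coordinates"]
        xs = [c[0] for c in rings[0]]
        ys = [c[1] for c in rings[0]]
        found = tries = 0
        while found < n_per_polygon and tries < max_tries:
            tries += 1
            x = rng.uniform(min(xs), max(xs))
            y = rng.uniform(min(ys), max(ys))
            if _in_polygon(x, y, rings):
                points.append({
                    "type": "Feature",
                    "geometry": {"type": "Point", "coordinates": [x, y]},
                    # Copy the polygon's properties so labels are kept.
                    "properties": dict(feature.get("properties") or {}),
                })
                found += 1
    return {"type": "FeatureCollection", "features": points}
```

The resulting point collection could then be passed as the geometries argument of aggregate_spatial, as described above, to extract raster values at those locations.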
I am working on an implementation of
This is my implementation of it: https://github.com/SARScripts/openeo_odc_driver/blob/d1383aa872bdef5a8bde37f5d48b1f7a56cdd57e/openeo_odc_driver.py#L566 There's a lot of room for improvement; the properties are not kept, for example. I will also provide an example using vector_to_*_points soon.
Thanks a lot, that's great! :)
Co-authored-by: clausmichele <[email protected]>
For several use cases such as ML training, one or more sampling processes could be useful.
This came up in the discussions around #295
Related work:
Alternative: