Random Forest: Training/Regression, Classifier/Predicting... #295
Comments
We'll need training as well, as the saved model formats may be specific to the implementation used?
Ok good. I wasn't sure whether this would be provided through file upload but that's actually not yet a thing in Platform.
I guess that depends a lot on what the individual processes for training, classification and regression would look like afterwards. If they have a lot of parameters, they should probably be separate; otherwise you end up in a mess with schemas. If they are just "choose a method and a file" or so, we might be able to merge them into a generic one. Let's see, I still need to do more research as I don't have a lot of experience with all this, unfortunately...
Recap of today's meeting on the randomForest process:
For more information, I put the presentation here. This is a kickstarter for the UC8 implementation.
Some feedback based on internal discussion at VITO:
Thanks, helpful! Here is a high-level sketch of the process(es) as I see them (for pixel-wise ML methods, such as RF). Following ML terminology, I use labels for the response (e.g. crop type; either a class variable or a continuous variable) and features for the predictors (e.g. the bands, or bands x time, from which an RF predicts a class given a model). As @mattia6690 notes, there are two separate steps: A train model, B predict on new features.
A Train model
See below for how we get to these input data, e.g. from polygons.
B Predict (classify, regress)
Data for A: train model
Typical steps needed before we can train the model (A3) are:
Note that step A1.2 + A2: for a set of polygons and a raster (cube), return the raster pixel centers and all the associated pixel values, is a very common operation; in R it is usually called |
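To make that "polygons + raster -> pixel centers and values" operation concrete, here is a minimal pure-Python sketch. It is only an illustration, not an openEO process definition: the names `point_in_polygon` and `extract_training_table`, the nested-list raster layout, and the toy labels are all assumptions made for this example. It tests each pixel center against each labelled polygon (ray casting) and returns one row of features plus label per matching pixel, which is the kind of table step A3 would train on.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(pt: Point, ring: Sequence[Point]) -> bool:
    """Ray-casting test; points exactly on the boundary may go either way."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def extract_training_table(raster, origin, pixel_size, polygons):
    """Return (pixel center, band values, label) for every pixel center in a polygon.

    raster:     raster[row][col] is a list of band values (hypothetical layout)
    origin:     (x, y) of the upper-left raster corner
    pixel_size: edge length of a square pixel
    polygons:   list of (label, ring) pairs, ring = list of (x, y) vertices
    """
    x0, y0 = origin
    table = []
    for r, row in enumerate(raster):
        for c, bands in enumerate(row):
            center = (x0 + (c + 0.5) * pixel_size, y0 - (r + 0.5) * pixel_size)
            for label, ring in polygons:
                if point_in_polygon(center, ring):
                    table.append((center, list(bands), label))
                    break  # first matching polygon wins
    return table


# Toy 3 x 3 raster with two bands per pixel and two labelled polygons.
raster = [[[r * 10 + c, 100 + r * 10 + c] for c in range(3)] for r in range(3)]
polygons = [
    ("wheat", [(0, 1), (2, 1), (2, 3), (0, 3)]),
    ("maize", [(2, 0), (3, 0), (3, 1), (2, 1)]),
]
table = extract_training_table(raster, origin=(0.0, 3.0), pixel_size=1.0, polygons=polygons)
for center, features, label in table:
    print(center, features, label)
```

With bands x time as features, the same idea applies; the feature list per pixel just gets longer.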
Nice overview! For prediction (B1/B2), instead of having special cases of apply/reduce dimension, could a prediction process also simply be a callback? Wouldn't that integrate better in the whole processes framework?
Yes, that's actually what we discussed yesterday but Edzer did not mention it explicitly. So to visualize it with a bit of JS-like pseudo-code for B1:
Not fully fleshed out yet, but to give an idea...
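For readers who want something runnable, here is a small Python sketch of the "prediction as a callback" idea. It is not the pseudo-code referred to above and not an openEO API: `reduce_bands` is a hypothetical stand-in for reducing the bands dimension with a callback, `make_predict_callback` is a made-up helper, and the three hand-written "trees" are placeholders for a trained forest.

```python
from collections import Counter
from typing import Callable, List, Sequence

Pixel = Sequence[float]              # feature vector of one pixel, e.g. its band values
Predictor = Callable[[Pixel], int]   # maps a feature vector to a class label


def reduce_bands(cube: List[List[Pixel]], reducer: Predictor) -> List[List[int]]:
    """Collapse the band dimension of a rows x cols x bands cube via a callback."""
    return [[reducer(pixel) for pixel in row] for row in cube]


def make_predict_callback(trees: Sequence[Predictor]) -> Predictor:
    """Wrap a list of per-tree predictors as one reducer using a majority vote."""
    def predict(pixel: Pixel) -> int:
        votes = Counter(tree(pixel) for tree in trees)
        return votes.most_common(1)[0][0]
    return predict


# Hand-written stand-ins for trained trees (assumption: binary classes 0 and 1).
trees = [
    lambda p: 1 if p[0] > 0.5 else 0,
    lambda p: 1 if p[1] > 0.3 else 0,
    lambda p: 1 if p[0] + p[1] > 1.0 else 0,
]

cube = [
    [[0.9, 0.8], [0.1, 0.2]],
    [[0.6, 0.1], [0.2, 0.9]],
]
predicted = reduce_bands(cube, make_predict_callback(trees))
print(predicted)
```

The point of the design is that the reducer neither knows nor cares that the callback is a model; swapping in a regression model would only change the callback's return type.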
We need two (or one?) new processes for Random Forest that support classification and regression.
Would training happen outside of openEO for now?
Implementations:
PS: That's a lot of parameters, wow!
-> Related: save_model / load_model with GLMLC metadata: #300