Support other stateful ML artifacts like transformers #179
Comments
Hi @gbolmier
Do you mean models with a Transformer architecture, or transformation functions that process the data?
https://github.com/kleveross/ormb/blob/master/pkg/model/format.go
The format is defined here. You can add a new format, and contributions are welcome!
Hi @gaocegege, thanks a lot for the prompt answer.
I'm referring to the second (e.g. standard scaler, PCA, TF-IDF vectorizer). These transformers are closely tied to the model: they often have hyperparameters that affect the model's performance and, like models, they hold state that is updated while processing the training data. The model's performance on unseen data depends on the transformers used during training. That's why stateful transformers are persisted: they can then process unseen data the same way they processed the training data.
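To make "stateful" concrete, here is a minimal sketch in plain Python. The class `MeanStdScaler` is a hypothetical stand-in for a standard scaler, not a class from any library, and the numbers are made up for illustration. The point is that the mean and standard deviation are learned from the training data and then reused unchanged on unseen data.

```python
import math


class MeanStdScaler:
    """Hypothetical stand-in for a standard scaler (illustration only)."""

    def fit(self, values):
        # Learn the state (mean and standard deviation) from training data.
        n = len(values)
        self.mean_ = sum(values) / n
        variance = sum((v - self.mean_) ** 2 for v in values) / n
        # Fall back to 1.0 so constant data does not divide by zero.
        self.std_ = math.sqrt(variance) or 1.0
        return self

    def transform(self, values):
        # Apply the learned state; nothing is re-estimated here.
        return [(v - self.mean_) / self.std_ for v in values]


train = [2.0, 4.0, 6.0, 8.0]
scaler = MeanStdScaler().fit(train)

# Unseen data is scaled with the statistics of the training data,
# which is why the fitted scaler has to be stored next to the model.
unseen = [5.0, 11.0]
scaled_unseen = scaler.transform(unseen)
```

If the scaler were refit on the unseen data instead, the model would receive inputs on a different scale than the one it was trained on.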
Thanks a lot for the pointer, this looks pretty straightforward. A follow-up question: say I want to share and publish some transformers tied to my ML model. Do I have to create a similar tree structure for each transformer alongside the model's?
If that's the case, could we make it more convenient in practice?
What structure would you prefer? As you know, OCI supports layer-based storage like Docker images, so maybe we could discuss it further.
Actually, it's not really the structure which is inconvenient, it's more about writing the
/kind feature
What happened:
ML models often require stateful transformers to process data for them (e.g. standard scaler). Unfortunately, this kind of artifact isn't supported as of now.
Also, some ML frameworks aren't supported yet, especially frameworks that don't use a specific serialization format but rely on e.g. the pickle protocol.
I'm not familiar with OCI or the internals of registries. What are the process and the effort required to add support for new frameworks or new serialization formats?
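For context, a pickle-based artifact is just an opaque byte blob. The sketch below uses only the Python standard library and says nothing about how ormb handles formats. The `state` dictionary is a hypothetical fitted transformer state, and the `sha256:` prefix only mirrors the digest notation that OCI registries use to address content.

```python
import hashlib
import pickle

# Hypothetical fitted state of a transformer, kept as plain built-in types.
state = {"kind": "standard_scaler", "mean": 5.0, "std": 2.0}

# Serialize the state to bytes with the pickle protocol.
blob = pickle.dumps(state, protocol=pickle.HIGHEST_PROTOCOL)

# Content digest of the blob, written in the "algorithm:hex" notation.
digest = "sha256:" + hashlib.sha256(blob).hexdigest()

# Restore the state; only unpickle data that comes from a trusted source.
restored = pickle.loads(blob)
```

Since the blob carries no framework-specific layout, anything that can store and retrieve bytes could in principle hold it. The open question is what metadata has to describe it.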
What you expected to happen:
Extended support to broader kinds of ML artifacts.
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?: