Revise and expand artifact types to include more frameworks, remove torch.compile #31
Comments
As for the other proposed types, I agree. We need more non-PyTorch examples. The examples don't have to be an exhaustive list. (See lines 325 to 366 in 25bef80.)
I'm not saying that torch.compile is deprecated or anything; I'm saying it was never and currently is not used for saving model artifacts, so it shouldn't be listed as an artifact type. It is a very new and novel tool built for a different purpose: optimizing eager code, not saving out full model graphs (model artifacts). I am confident that nobody is producing model artifacts with torch.compile, since it is by design unable to do this. The torch docs state this. torch.export and torch.compile share optimization code paths (TorchInductor), but only torch.export produces model artifacts.
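For illustration, here's a minimal sketch of that distinction (assumes PyTorch >= 2.1 and a working torch.compile backend; the SimpleNet module, example input, and file name are placeholders, not anything from the spec or the docs): torch.compile only returns an optimized callable, while torch.export produces something you can actually serialize and reload.

```python
import torch
import torch.nn as nn

class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x):
        return self.linear(x)

model = SimpleNet().eval()
example_input = torch.randn(1, 4)

# torch.compile: returns an optimized callable for eager execution in this process.
# Nothing is written to disk, so there is no artifact to describe.
compiled = torch.compile(model)
_ = compiled(example_input)

# torch.export: captures the full graph as an ExportedProgram that can be saved
# to a .pt2 file and reloaded elsewhere, i.e. an actual model artifact.
exported_program = torch.export.export(model, (example_input,))
torch.export.save(exported_program, "simple_net.pt2")
reloaded = torch.export.load("simple_net.pt2")
_ = reloaded.module()(example_input)
```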
I disagree.
Because ...
Here's the example you linked from TorchServe, which discusses packaging both model code (that is optimized with torch.compile) and a state_dict (.pt), which is the model artifact file.
From that TorchServe doc:
In this example, torch.compile needs to be called twice: at train time and at inference time. This is because the artifact (model.pt) that is saved does not bake in the optimizations from torch.compile. Therefore, torch.compile is not used to produce model artifacts; the model can be deployed without torch.compile. torch.compile is a detail about the runtime dependencies of models that depend on eager PyTorch code.
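A minimal sketch of that pattern (not the actual TorchServe example code; the tiny nn.Linear model and the "model.pt" path are placeholders, and a working torch.compile backend is assumed):

```python
import torch
import torch.nn as nn

# --- train time ---
model = nn.Linear(4, 2)
compiled_model = torch.compile(model)        # parameters are shared with `model`
# ... training loop runs through compiled_model ...
torch.save(model.state_dict(), "model.pt")   # the saved artifact is plain weights

# --- inference time ---
model = nn.Linear(4, 2)
model.load_state_dict(torch.load("model.pt"))
model.eval()
compiled_model = torch.compile(model)        # must be applied again: nothing from the
outputs = compiled_model(torch.randn(1, 4))  # earlier compile step is inside model.pt
```

The artifact itself (model.pt) loads and runs fine without torch.compile at all; compiling is purely an optional runtime optimization.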
This is true, but it only happens in eager mode because there's no way to bake these torch.compile optimizations into a saved model artifact. To do that, you need torch.export.
To summarize the above: torch.compile optimizes a model's graph, but doesn't handle saving the model graph + state_dict weights. I think it'd be great to have a place in the MLM extension for calling out what hardware-specific optimizations have been applied to an inference pipeline that depends on nn.Module or similar ML framework constructs, but I don't think the model artifact section is the right place to do that.
My intuition would be that ... The idea of specifying ... Now, I'm not saying ...
torch.compile doesn't affect the state_dict at all, though. It has no bearing on the contents of the file from torch.save. I don't think we should guide users to call torch.compile an artifact type, because it has no bearing on the contents of artifacts saved with torch.save. Also, it is not the convention to save state_dict weights as ".pt2". Of course it is possible, but the torch docs state it is convention to only use .pt2 for torch.export. I could call a torch.export file .pt, just like I could call a .tiff file a .txt, but that'd be misleading and go against convention.
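A quick way to check that claim (assumes PyTorch >= 2.0 and a working backend; the nn.Linear model is a placeholder):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
before = {k: v.clone() for k, v in model.state_dict().items()}

compiled = torch.compile(model)      # wraps the module; parameters are shared, not copied
_ = compiled(torch.randn(1, 4))      # trigger compilation

after = model.state_dict()
assert before.keys() == after.keys()
assert all(torch.equal(before[k], after[k]) for k in before)
# torch.save(model.state_dict(), ...) therefore writes the same content before or after compiling
```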
The same docs also say:
and
Therefore, the convention is not really well-established, and I cannot in good conscience force users toward it. As mentioned previously, I have no issue with adding more references to explanations/recommendations for using a preferred approach over another in the best-practices for better reproducibility, but I do not think this is a sufficient reason to remove torch.compile.
Maybe this is the core issue: are we overloading artifact_type with two concepts, "content-type" and "intended use"? I think so, and I think it will confuse users looking at MLM metadata.
This seems like a lot to infer. I'd rather have a separate field that marks what inference optimizations are made in the inference pipeline that don't affect the content type. Then it would be clear that the content-type field indicates how to load the model, and the other, optimization-focused field dictates the suggested inference code path to use at inference time. I also have no issue with adding more recommendations, and I don't want to remove mention of torch.compile. I'm fine even with removing torch.export until the API is solidified and stable. I also don't think torch.compile is out of date or deprecated, or that it's an either/or between torch.compile and torch.export; they serve different purposes. I just don't want users to think that we are hinting that it defines a unique content type.
My motivation for being more explicit here wrt content-type vs. properties of the inference pipeline is that earlier this year I was very confused going through the PyTorch documentation on how to produce a compiled model artifact that had no runtime dependencies other than PyTorch. I think it is unfortunate that the docs do not make this clear. I spent a good amount of time thinking torch.compile would produce the model artifact I needed when it wasn't the right tool. I don't want to mislead users into doing the same.
Somewhat (and maybe even more than 2 concepts). The "content-type" should really be indicated by the asset's media type. Note that https://github.com/stac-extensions/mlm#artifact-type-enum also explicitly indicates that names like torch.save ... The reason a single ... If usage is not obvious from the ... Personally, I think this is such a specific edge case that we would be better off by just adding a ...
PyTorch, TensorFlow, and JAX all provide mechanisms for JIT- and AOT-compiled models. JIT and AOT models have very different deployment environments and require different levels of effort from those looking to run the models. I think it is important that the MLM capture this variation; it's probably something users even want to search on, because "level of effort to try the model out" is often a first-order concern. Examples of AOT and JIT options exist besides PyTorch, and I think if we try to describe the artifact_types for JAX and TensorFlow models this would come up again. Therefore, I'd like to be more explicit that the artifact_type is the method of how a model was saved. We're already doing this to some extent, with the exceptions being ...
I think this gets around the problem of a .pt file not being very informative. This would also contain information on whether a model was AOT compiled or not. For those who find it useful, users can look up the framework-specific methods listed here and learn more about them when choosing how to save and describe their models.
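To make that concrete, here's a hedged sketch of the kind of mapping I have in mind, with the save method acting as the artifact_type value (the keys, values, and extensions below are illustrative suggestions, not spec language):

```python
# Illustrative only: framework -> the save call that produced the artifact,
# which is what an artifact_type value would name under this proposal.
candidate_artifact_types = {
    "scikit-learn":    "joblib.dump",          # e.g. model.joblib / model.pkl
    "pytorch (eager)": "torch.save",           # e.g. model.pt state_dict
    "pytorch (AOT)":   "torch.export.save",    # e.g. model.pt2 exported graph
    "onnx":            "torch.onnx.export",    # e.g. model.onnx (or another exporter)
    "tensorflow":      "tf.saved_model.save",  # SavedModel directory
    "xgboost":         "Booster.save_model",   # e.g. model.json / model.ubj
}
```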
I think the MLM README is already too lengthy and difficult to navigate, and I've had this reaction from some folks who have gone through it at Wherobots. IMO the spec is too open to interpretation and we are trying to fill in those gaps with recommendations. Rather than you and I providing more paragraph recommendations to plug gaps, I think we could make the spec more helpful by providing the options to accurately describe a model without ambiguity. I'm down to do the work here wrt describing the artifact_type options, and how to represent whether the model source code asset has a JIT compile step; this could be an additional boolean field that indicates the presence of JIT compilation somewhere.
Ideally I would want there to be clear IANA media types, but since there are not, I don't see authors of MLM metadata using this if there are no conventional options. And as you noted, it wouldn't be informative enough on its own to understand how to load the model because of the .pt ambiguity.
Belatedly coming back to your earlier comment:
Agree, I can do this.
I agree. This is why I proposed adding these details in the best-practices. The full definition and table in the current https://github.com/stac-extensions/mlm#artifact-type-enum could actually be entirely in a specific best-practices section. From the point of view of the https://github.com/stac-extensions/mlm#model-asset, the ...
I think we need to refine this definition. Rather than ... There is also a ...
After chatting, we will add an mlm:compiled_method field and remove the torch.compile artifact_type. Good suggestion! I'll add docs on these fields with examples for a few frameworks.
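For the record, a hedged sketch of what a model asset might look like once that lands (written as a Python dict for readability; the href, media type, and values are placeholders, and mlm:compiled_method is the proposed field whose final name and allowed values aren't settled here):

```python
model_asset = {
    "href": "https://example.com/weights/model.pt",
    "type": "application/octet-stream",   # media type: how the bytes are encoded
    "roles": ["mlm:model"],
    "mlm:artifact_type": "torch.save",    # how the artifact was produced/saved
    "mlm:compiled_method": "aot",         # proposed field; "aot" vs "jit" values are illustrative
}
```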
🚀 Feature Request
Currently we only suggest PyTorch artifact types.
And torch.compile is not an artifact type. From the PyTorch docs: https://pytorch.org/docs/stable/export.html#existing-frameworks
So I think we should remove torch.compile from the list since no one should be using this to specify a model artifact type.
Here's an initial list that includes more frameworks:

- Scikit-learn (Python)
- TensorFlow (Python)
- ONNX (Open Neural Network Exchange, language-agnostic): .onnx
- PyTorch (Python)
- Other frameworks:
  - XGBoost (framework-specific binary format)
  - LightGBM (framework-specific binary format)
  - PMML (Predictive Model Markup Language, XML)
  - R: .rds, .rda (???)
  - Julia: JLD, JLD2, BSON (???)
The above list was partially LLM-generated, so take it with some salt. I can look into and confirm usage if we decide to move forward with this and provide a more exhaustive set of options.
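As one concrete end-to-end example from the list above, here's a hedged sketch of the ONNX path: exporting a PyTorch model to an .onnx artifact and loading it with ONNX Runtime, with no PyTorch needed at inference time (the model, input name, and path are placeholders; assumes the torch, numpy, and onnxruntime packages):

```python
import numpy as np
import torch
import torch.nn as nn
import onnxruntime as ort

# Produce a framework-agnostic artifact from a (placeholder) PyTorch model.
model = nn.Linear(4, 2).eval()
example_input = torch.randn(1, 4)
torch.onnx.export(model, example_input, "model.onnx", input_names=["input"])

# Load and run the artifact with ONNX Runtime only.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
outputs = session.run(None, {"input": np.random.rand(1, 4).astype(np.float32)})
```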
🔉 Motivation
We'd like users outside of the PyTorch ecosystem to understand how to describe their model artifacts, so that it is easier to know which part of a framework should be used to load the model and which runtime dependencies are involved, since different artifact types have different runtime dependencies.
Francis raised that the list is currently overly specific to PyTorch, and I agree: crim-ca/dlm-extension#2 (comment)
📡 Alternatives
Keep this part of the spec focused on PyTorch and loose, with no further recommendations.
📎 Additional context
I'm down to expand this suggestion. I think in terms of validation, we can be lax about the artifact_type value that is passed here.