
draft migration path doc from ML Model to MLM Extension #19

Open · wants to merge 1 commit into `deprecate`
Conversation

@rbavery (Collaborator) commented Aug 7, 2024

This adds a migration doc that describes similarities and differences in capabilities between the ML Model Extension and the Machine Learning Model (MLM) Extension. It also highlights how some field names have changed, enums have been expanded, and some fields have been removed. It notes that the ML Model Extension is not maintained and its primary public-facing implementation is deprecated.

If this is approved and merged I think we should then approve and merge #16. Once the repo ownership of https://github.com/crim-ca/mlm-extension/tree/main is changed to STAC Extensions, then I can update the links in this migration document to point to the new repo url.

I think I will need assistance from the STAC Extension maintainers to make a request to CRIM to initiate a move of the entire repo since we want to preserve stars, github history, issues, etc. cc @fmigneault

@gadomski (Member) left a comment

This is not a strongly held opinion, but would we make lives easier on ourselves if we migrated the mlm-extension repo to the stac-extensions organization first, then updated the text here? To me it feels funny to merge this PR knowing that we'll have to go through and update a bunch of URLs shortly thereafter.

Another nice-to-have would be a migration script that we could point folks to ... maybe as part of the stac-model repo?

Other than those two points, LGTM.

@rbavery (Collaborator, Author) commented Aug 14, 2024

Thanks for the comments @gadomski

Unfortunately I don't have the bandwidth to provide a migration script right now. There might be some nontrivial decisions to handle in the migration. But I'm happy to review and work with others on PRs. stac-model is maintained alongside the spec here https://github.com/crim-ca/mlm-extension/blob/main/README_STAC_MODEL.md

I'm happy to merge this PR after the migration. If STAC maintainers initiate that @fmigneault can you accept the invite to do so for the https://github.com/crim-ca org? I'm not sure how full repo migrations work and who initiates.

@gadomski (Member)

> I'm not sure how full repo migrations work and who initiates.

I've done it a few times (for https://github.com/stactools-packages/) but it's been a minute. I believe @fmigneault can go to https://github.com/crim-ca/mlm-extension/transfer and then choose stac-extensions as the target organization — I'd try that first, at least.


## Shared Goals

Both the ML Model Extension and the Machine Learning Model (MLM) Extension aim to provide a standard way to catalog machine learning (ML) models that work with Earth observation (EO) data. Their main goals are:
Collaborator
I suggest adding something like "but not limited to" regarding the mention of working with EO data. This is one of the main goals of generalization provided by MLM, although using STAC often implies EO data is used somewhere.

Both the ML Model Extension and the Machine Learning Model (MLM) Extension aim to provide a standard way to catalog machine learning (ML) models that work with Earth observation (EO) data. Their main goals are:

1. **Search and Discovery**: Helping users find and use ML models.
2. **Describing Inference Requirements**: Making it easier to run these models by describing input requirements and outputs.
Collaborator

Consider replacing "inference" with "execution" or "usage", or adding "training" explicitly.
Technically, training requirements can be defined as well as inference ones, using `mlm:training-runtime`.


- The MLM Extension covers more details at both the Item and Asset levels, making it easier to describe and use model metadata.
- The MLM Extension covers Runtime requirements within the [Container Asset](https://github.com/crim-ca/mlm-extension?tab=readme-ov-file#container-asset), while the ML Model Extension records [similar information](./README.md#inferencetraining-runtimes) in the `ml-model:inference-runtime` or `ml-model:training-runtime` asset roles.
- The MLM extension has a corresponding Python library, [`stac-model`](https://pypi.org/project/stac-model/) which can be used to create and validate MLM metadata. An example of the library in action is [here](https://github.com/crim-ca/mlm-extension/blob/main/stac_model/examples.py#L14). The ML Model extension does not support this and requires the JSON to be written manually by interpreting the JSON Schema or existing examples.
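To make the metadata shape concrete, a minimal sketch of MLM-style Item properties as plain JSON-style dicts. The field values and the `check_required` helper are illustrative only; they are not part of `stac-model` or `pystac`.

```python
# Illustrative sketch: MLM properties as a plain dict, with a tiny helper
# to check for required fields. Values are placeholders, not a real model.
import json


def check_required(properties: dict, required: list) -> list:
    """Return the required MLM fields that are missing from the properties."""
    return [field for field in required if field not in properties]


mlm_properties = {
    "mlm:name": "example-classifier",       # required model name
    "mlm:architecture": "ResNet-50",        # Papers With Code identifier
    "mlm:framework": "pytorch",
    "mlm:framework_version": "2.3.0",
    "mlm:tasks": ["classification"],
    "mlm:pretrained": True,
}

missing = check_required(mlm_properties, ["mlm:name", "mlm:architecture"])
print(json.dumps(mlm_properties, indent=2))
print("missing required fields:", missing)  # → missing required fields: []
```

In practice `stac-model` builds and validates this structure for you; the point here is only the flat `mlm:`-prefixed property layout.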
Collaborator
Adding a mention of its direct interoperability with common utilities (ie: pydantic and pystac) would be a great justification to further motivate its use.

| `ml-model:type` | N/A | No direct equivalent, it is implied by the `mlm` prefix in MLM fields and directly specified by the schema identifier. |
| `ml-model:learning_approach` | `mlm:tasks` | Removed in favor of specifying specific `mlm:tasks`. |
| `ml-model:prediction_type` | `mlm:tasks` | `mlm:tasks` provides a more comprehensive enum of prediction types. |
| `ml-model:architecture` | `mlm:architecture` | The MLM provides specific guidance on using Papers With Code - Computer Vision identifiers for model architectures. No guidance is provided in ML Model. |
Collaborator
Add `*` around "Papers With Code - Computer Vision".

| `ml-model:learning_approach` | `mlm:tasks` | Removed in favor of specifying specific `mlm:tasks`. |
| `ml-model:prediction_type` | `mlm:tasks` | `mlm:tasks` provides a more comprehensive enum of prediction types. |
| `ml-model:architecture` | `mlm:architecture` | The MLM provides specific guidance on using Papers With Code - Computer Vision identifiers for model architectures. No guidance is provided in ML Model. |
| `ml-model:training-processor-type` | `mlm:accelerator` | MLM defines more choices for accelerators in an enum and specifies that this is the accelerator for inference (the focus of the MLM extension is inference). ML Model only accepts `cpu` or `gpu` but this isn't sufficient today where we have models optimized for different CPU architectures, CUDA GPUs, Intel GPUs, AMD GPUs, Mac Silicon, and TPUs. |
Collaborator
I suggest removing the focus on inference part.

Many examples use pretrained models in inference for simplicity, but there shouldn't be anything limiting the definition for training. If anything is missing, I would rather motivate users to submit issues for improvements of future MLM revisions to better handle training (if anything is actually missing).
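Since a migration script was floated earlier in this thread, a minimal sketch of what one could look like for the field mapping above. Everything here is illustrative: `FIELD_MAP`, `DROPPED`, and `migrate_properties` are hypothetical helpers based on this document's table, not an official tool, and fields without a direct MLM equivalent are simply dropped for manual review.

```python
# Hypothetical field-name migration from ml-model:* to mlm:* properties,
# following the mapping table in this document. Not an official script.
FIELD_MAP = {
    "ml-model:architecture": "mlm:architecture",
    "ml-model:prediction_type": "mlm:tasks",
    "ml-model:training-processor-type": "mlm:accelerator",
}

# Fields with no direct MLM equivalent; flagged for manual review.
DROPPED = {"ml-model:type", "ml-model:learning_approach"}


def migrate_properties(props: dict) -> dict:
    """Rename ml-model:* keys to their mlm:* equivalents."""
    migrated = {}
    for key, value in props.items():
        if key in DROPPED:
            continue  # no direct MLM equivalent
        new_key = FIELD_MAP.get(key, key)
        # ml-model:prediction_type was a single value; mlm:tasks is a list
        if new_key == "mlm:tasks" and not isinstance(value, list):
            value = [value]
        migrated[new_key] = value
    return migrated


old = {"ml-model:architecture": "unet", "ml-model:prediction_type": "segmentation"}
print(migrate_properties(old))
# → {'mlm:architecture': 'unet', 'mlm:tasks': ['segmentation']}
```

A real script would also need to handle enum translation (e.g. `gpu` to a specific accelerator value), which requires human judgment.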

Comment on lines +59 to +72
- **`mlm:name`**: A required name for the model.
- **`mlm:framework`**: The framework used to train the model.
- **`mlm:framework_version`**: The version of the framework. Useful in case a container runtime asset is not specified or if the consumer of the MLM wants to run the model outside of a container.
- **`mlm:memory_size`**: The in-memory size of the model.
- **`mlm:total_parameters`**: Total number of model parameters.
- **`mlm:pretrained`**: Indicates if the model is derived from a pretrained model.
- **`mlm:pretrained_source`**: Source of the pretrained model by name or URL if it is less well known.
- **`mlm:batch_size_suggestion`**: Suggested batch size for the given accelerator.
- **`mlm:accelerator_constrained`**: Indicates if the model requires a specific accelerator.
- **`mlm:accelerator_summary`**: Description of the accelerator. This might contain details on the exact accelerator version (TPUv4 vs TPUv5) and their configuration.
- **`mlm:accelerator_count`**: Minimum number of accelerator instances required.
- **`mlm:input`**: Describes the model's input shape, dtype, and normalization and resize transformations.
- **`mlm:output`**: Describes the model's output shape and dtype.
- **`mlm:hyperparameters`**: Additional hyperparameters relevant to the model.
Collaborator
I suggest using a table just to keep sections consistent and more readable.

Suggested change
- **`mlm:name`**: A required name for the model.
- **`mlm:framework`**: The framework used to train the model.
- **`mlm:framework_version`**: The version of the framework. Useful in case a container runtime asset is not specified or if the consumer of the MLM wants to run the model outside of a container.
- **`mlm:memory_size`**: The in-memory size of the model.
- **`mlm:total_parameters`**: Total number of model parameters.
- **`mlm:pretrained`**: Indicates if the model is derived from a pretrained model.
- **`mlm:pretrained_source`**: Source of the pretrained model by name or URL if it is less well known.
- **`mlm:batch_size_suggestion`**: Suggested batch size for the given accelerator.
- **`mlm:accelerator_constrained`**: Indicates if the model requires a specific accelerator.
- **`mlm:accelerator_summary`**: Description of the accelerator. This might contain details on the exact accelerator version (TPUv4 vs TPUv5) and their configuration.
- **`mlm:accelerator_count`**: Minimum number of accelerator instances required.
- **`mlm:input`**: Describes the model's input shape, dtype, and normalization and resize transformations.
- **`mlm:output`**: Describes the model's output shape and dtype.
- **`mlm:hyperparameters`**: Additional hyperparameters relevant to the model.
| Field Name | Description |
|----------------------------------|-------------------------------------------------------------------------------------------------------------------------|
| **`mlm:name`** | A required name for the model. |
| **`mlm:framework`** | The framework used to train the model. |
| **`mlm:framework_version`** | The version of the framework. Useful in case a container runtime asset is not specified or if the consumer of the MLM wants to run the model outside of a container. |
| **`mlm:memory_size`** | The in-memory size of the model. |
| **`mlm:total_parameters`** | Total number of model parameters. |
| **`mlm:pretrained`** | Indicates if the model is derived from a pretrained model. |
| **`mlm:pretrained_source`** | Source of the pretrained model by name or URL if it is less well known. |
| **`mlm:batch_size_suggestion`** | Suggested batch size for the given accelerator. |
| **`mlm:accelerator_constrained`**| Indicates if the model requires a specific accelerator. |
| **`mlm:accelerator_summary`** | Description of the accelerator. This might contain details on the exact accelerator version (TPUv4 vs TPUv5) and their configuration. |
| **`mlm:accelerator_count`** | Minimum number of accelerator instances required. |
| **`mlm:input`** | Describes the model's input shape, dtype, and normalization and resize transformations. |
| **`mlm:output`** | Describes the model's output shape and dtype. |
| **`mlm:hyperparameters`** | Additional hyperparameters relevant to the model. |

Comment on lines +67 to +69
- **`mlm:accelerator_constrained`**: Indicates if the model requires a specific accelerator.
- **`mlm:accelerator_summary`**: Description of the accelerator. This might contain details on the exact accelerator version (TPUv4 vs TPUv5) and their configuration.
- **`mlm:accelerator_count`**: Minimum number of accelerator instances required.
Collaborator

Missing `mlm:accelerator`?

| `ml-model:inference-runtime` | `mlm:inference-runtime` | Direct conversion; same role and function. |
| `ml-model:training-runtime` | `mlm:training-runtime` | Direct conversion; same role and function. |
| `ml-model:checkpoint` | `mlm:checkpoint` | Direct conversion; same role and function. |
| N/A | `mlm:model` | New required role for model assets in MLM. This represents the asset that is loaded for inference. |
Collaborator
I think it would be better to describe this as the "source" of the model weights and definition (typically an ONNX, a checkpoint or similar). This would allow the flexibility for re-training/fine-tuning rather than only inference. However, the note about loading this asset can be left as well.

| N/A | `mlm:inference` | Recommended for inference pipelines. |
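The asset-role mapping above can be illustrated with a minimal model asset. The href, media type, and `find_model_asset` helper below are hypothetical placeholders, not part of any STAC library.

```python
# Hypothetical model asset carrying the new required `mlm:model` role,
# which marks the asset holding the model weights/definition.
model_asset = {
    "href": "https://example.com/models/classifier.onnx",  # placeholder URL
    "type": "application/octet-stream",
    "title": "ONNX model weights and definition",
    "roles": ["mlm:model"],
}


def find_model_asset(assets: dict):
    """Return the key of the first asset with the mlm:model role, if any."""
    for key, asset in assets.items():
        if "mlm:model" in asset.get("roles", []):
            return key
    return None


print(find_model_asset({"model": model_asset}))  # → model
```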


The MLM is focused on search, discovery, and reproducibility of inference. Nevertheless, the MLM provides a recommended asset role `mlm:training-runtime` and asset `mlm:training`, which can point to a container URL that has the training runtime requirements. The ML Model extension specifies a field for `ml-model:training-runtime`, but like `mlm:training` it only contains the default STAC Asset fields and the additional fields specified by the Container Asset. Training requirements typically differ from inference requirements, so we recommend that fields and assets for reproducing model training or fine-tuning be contained in a separate STAC extension.
Collaborator
Same as previous. Adjust "inference" for "runtime" or similar.
The intent would better be described as reproducibility of experiments, whether those are for training, fine-tuning or inference.

I agree with the start of the last sentence about differing requirements (hence the separate runtimes), but I strongly disagree about using another extension. This will only cause a lot of metadata duplication (already seeing a similar trend happening with OGC's TrainingDML-AI sub "model-extension"...). There are already examples of using MLM for both training and inference simultaneously. I would rather improve upon MLM to add any missing gaps to describe the harder "training" scenario.

Collaborator (Author)

Good feedback! With regard to training, Francis and I talked and we agree mlm:hyperparameters should be specified as a field that is for including training hyperparameters. Inference scalars or other "variables" that are not tensors/arrays (like temperature setting, text input) can be represented in the InputStructure. In the future we might want to consider inference parameters or inference variables as a separate field.

I'll tweak language throughout to remove the focus on inference at expense of training. Will note that required fields are marked required either for inference or search and discovery.

@fmigneault (Collaborator)

@rbavery @gadomski
I believe migration would be the right way to handle the transfer to preserve stars, issues, etc. I think GitHub even defines some redirects for people using the old links (but don't quote me on that 😅). This would also allow CRIM to re-fork the stac-extensions repo to indicate appropriate redirects and maintain a mirror of the old schemas (https://crim-ca.github.io/mlm-extension/), since those are employed by existing STAC deployments.

However, once this is done (and that I get the approvals from CRIM higher ups), I will need from an admin of stac-extensions to provide the necessary permissions to adjust the settings. The migration page displays a warning that all configured protection rules will be removed during the transfer. I would like to re-enable those, and make sure @rbavery, @languith and @fmigneault (at least) have the necessary access to manage the repository.

@gadomski (Member)

> I will need from an admin of stac-extensions to provide the necessary permissions to adjust the settings.

No problem at all ... when migrated, I'll be able to add you as a maintainer of the repo so you can PR (or just YOLO change) anything as needed. You will also be able to add others as needed.
