New Machine Learning Model Extension Version 2.0.alpha schema and (de)serialization, validation package #2
Conversation
@rbavery
Hey @fmigneault, happy new year! Here's an update on progress with the stac_model validator and on updates to the extension. I've made some substantial changes and am looking to wrap this PR up early next week. I've made extensive edits to the schema to support more tasks and to use common metadata objects where possible (Statistics, Raster, Asset Object). The hackmd for this README is at https://hackmd.io/@cHP95b4sTDWQdP7uy1Vv7A/rkneCaru6. If you have any time to comment on it in the coming weeks, I'd much appreciate it! You can also refer to the Changelog, where I've tried to keep track of the more substantial updates. I'll need to make some further changes to make sure this complies with the STAC Extension guidelines before this is ready to merge.
Thanks @rbavery for this first draft!
I didn't dig too much into the code (yet) since I felt there were already enough items to discuss from the spec itself.
I'm guessing most of the generic files under stac_model, such as editorconfig, changes, etc., would be moved to the root? Is this only temporary while working on it from another source repo copy?
README.md (Outdated)

- Examples:
  - [Example with a UNet trained with thelper](examples/item.json)
  - [Example with a ??? trained with torchgeo](examples/item.json) TODO update example
The previously mentioned UNet is what the contents of https://github.com/crim-ca/dlm-extension/blob/main/examples/model-arch-summary.txt refers to.
This is based on https://github.com/nyoki-mtl/pytorch-segmentation/blob/master/src/models/decoder.py, which uses a generic PyTorch model.
If you can provide a more relevant model with torchgeo, then I'm fine with updating this reference.
Thanks! I'd like to use torchgeo for the example since I think it is the most popular EO-specific framework for training deep learning models. They have models pretrained on EO datasets (Sentinel-2, Landsat), and they intend to host others.
I've been working with their 13-band pretrained Sentinel-2 ResNet model and was thinking of proposing this as the new example: https://torchgeo.readthedocs.io/en/stable/tutorials/trainers.html
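For reference, here's a minimal sketch of loading those pretrained weights (assuming a recent torchgeo release, roughly 0.4+, where the multispectral weights are exposed through `torchgeo.models.ResNet18_Weights`):

```python
import torch
from torchgeo.models import ResNet18_Weights, resnet18

# Load the ResNet18 pretrained with MoCo on all 13 Sentinel-2 bands.
weights = ResNet18_Weights.SENTINEL2_ALL_MOCO
model = resnet18(weights=weights)
model.eval()

# The backbone expects all 13 Sentinel-2 bands as input channels.
x = torch.randn(1, 13, 224, 224)
with torch.no_grad():
    out = model(x)
print(out.shape)
```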
Yes, I agree to use the more popular torchgeo. We just need to update the example with the corresponding output. The Sentinel-2 ResNet is good; I've been using it also.
The https://github.com/rbavery/dlm-extension/blob/validate/stac_model/example.json seems appropriate. Is there a persistent location where the .pt could be hosted?
Hugging Face is down right now, but I'll upload the TorchScript version of https://huggingface.co/torchgeo/resnet18_sentinel2_all_moco/ that I've been working with, which corresponds to the example.json.
Coming back to this: I can upload it after remaking a notebook demo with the new schema and a new version of stac_model.
Yes, I should have specified, but let's focus on discussing the spec! I can move this to the root once it is finalized.
@fmigneault here's my review from rbavery#2, it's looking really comprehensive. Looking forward to using this extension!
README.md (Outdated)

| `detection` | `detection` | Generic detection of the "presence" of objects or entities, with or without positions. |
| `object-detection` | *n/a* | Task corresponding to the identification of positions as bounding boxes of object detected in the scene. |
| `segmentation` | `segmentation` | Generic tasks that regroups all types of segmentations tasks consisting of applying labels to pixels. |
| `semantic-segmentation` | *n/a* | Specific segmentation task where all pixels are attributed labels, without consideration of similar instances. |
Suggested change:
| `semantic-segmentation` | *n/a* | Specific segmentation task where all pixels are attributed labels, without consideration of similar instances. |
| `semantic-segmentation` | *n/a* | Specific segmentation task where all pixels are attributed labels, without consideration for segments as unique objects. |
- `MXNet`
- `Keras`
- `Caffe`
- `Weka`
Should Weka be listed? I've never heard of it or seen it in the wild.
Here's a suggested reordering based on my subjective interpretation of current popularity + longevity. I also added rgee and spatialRF to showcase some R options; especially in academia, lots of folks use R, particularly random forest models for semantic segmentation.
I removed Caffe (no updates in 4 years) and MXNet (archived last year). I don't think anyone will publish models for these frameworks.
I removed ONNX since it isn't a training framework, and I think the purpose of this field is to describe the framework used to train the model; this might be different from the inference runtime and format.
Suggested change:
- `Weka`
- `PyTorch`
- `TensorFlow`
- `Scikit-learn`
- `Huggingface`
- `Keras`
- `rgee`
- `spatialRF`
- `JAX`
- `PyMC`
Weka is not as common as the others, but I have seen it once or twice. I'd prefer to have the option rather than not.
I think it is a good idea to include ONNX. Remember that MLM is not for training only; it can be used for the inference runtime.
I don't think less popular or older frameworks should be ignored. People might want to support older algorithms that work well.
Will add the R-based references; those are good examples.
Sure, I'm fine with including Weka and the old frameworks then.

> I think it is a good idea to include ONNX. Remember that MLM is not for training only; it can be used for the inference runtime.

I totally agree (in fact, I don't think MLM is a good fit for describing how to train a model), but in the table we call out that this framework field should denote the framework used for training. ONNX isn't used to train ML models; instead, that can be denoted in the asset details. If we include it, someone could get confused and not supply the actual framework used for training.
README.md (Outdated)

- `wrap-fill-outliers`
- `wrap-inverse-map`

See [OpenCV - Normalization Flags](https://docs.opencv.org/4.x/d2/de8/group__core__array.html#ga87eef7ee3970f86906d69a92cbf064bd)
I think this reference to normalization flags needs to be switched with the earlier reference to interpolation/resize methods.
Long term, I'd be interested in picking a different reference than OpenCV's C++ documentation, since the OpenCV lib is lower level than most folks encounter and the docs are a bit hard to follow (a Python programmer might get confused by the C data types, for example). But I think this is better than rolling our own. A rough Python equivalent of the min-max flag is sketched below.
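For example, a short sketch of what a min-max normalization flag corresponds to (assuming OpenCV's Python bindings; `NORM_MINMAX` scales values into `[alpha, beta]`):

```python
import cv2
import numpy as np

band = np.random.randint(0, 10000, size=(256, 256)).astype(np.float32)

# Scale values into [0, 1] using the array's own min/max (NORM_MINMAX).
normalized = cv2.normalize(
    band, None, alpha=0.0, beta=1.0,
    norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_32F,
)

# Same result computed explicitly with the usual min-max formula.
expected = (band - band.min()) / (band.max() - band.min())
assert np.allclose(normalized, expected, atol=1e-5)
```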
I agree.
Ideally, MLM would have its own listing and description (with OpenCV or other libs as references for corresponding implementations), but I didn't feel like replicating the formulas provided by OpenCV here. Some normalizations have very subtle differences, so it is not really trivial to describe them properly instead of simply showing the actual formula.
I'm open to other references, but unless we can get them really quickly and exhaustively, I'd leave them to a follow-up PR so as not to delay this one further.
Yes, let's do this later; no need to block this.
README.md (Outdated)

| Artifact Type | Description |
|--------------------|--------------------------------------------------------------------------------------------------------------------------|
| `torch.compile` | A model artifact obtained by [`torch.compile`](https://pytorch.org/tutorials/intermediate/torch_compile_tutorial.html). |
Suggested change:
| `torch.compile` | A model artifact obtained by [`torch.compile`](https://pytorch.org/tutorials/intermediate/torch_compile_tutorial.html). |
| `.pt2` | A model artifact obtained using [Pytorch's AOTInductor](https://pytorch.org/docs/main/torch.compiler_aot_inductor.html). |

I think this is a necessary edit since torch.compile is an API for compiling PyTorch nn.Modules; it's used to speed up PyTorch code in general, for training or inference. Many of the backend internals that make torch.compile work are used in AOTInductor (the tool that creates compiled model artifacts), but they aren't the same thing, and I think here we want to refer to artifacts produced by AOTInductor.
I've added some reference mentions of AOTInductor and torch.export. Let me know what you think of the edit.
I must say, though, I'm not 100% convinced about these definitions being included in the spec (or rather, how explicit to be about them). They are still fairly prone to change soon. Even the doc mentions that many terms are used interchangeably.
https://pytorch.org/docs/main/torch.compiler.html#torch-compiler

> In some cases, the terms torch.compile, TorchDynamo, torch.compiler might be used interchangeably in this documentation.
There is also a lot of ambiguity about torch.export using TorchDynamo under the hood, but then, torch.compile accepts backend='inductor' for TorchInductor (https://pytorch.org/docs/main/torch.compiler.html#torch-compiler), although they are also listed as distinct technologies under torch.compiler. And then, there is torch.export.save/torch.export.load for .pt2 (https://pytorch.org/docs/main/export.html#serialization) as a parallel to torch.save/torch.load for .pt.
I do not want to make the maintenance of the spec a burden for every new subtle change from PyTorch; it's a fast-evolving ecosystem. Since this is an open field, I fear it is getting slightly too specific to this framework.
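For concreteness, here's a minimal sketch of that torch.export path (assuming PyTorch 2.1+, where `torch.export.export`/`save`/`load` are available, though still marked as subject to change), shown as a parallel to `torch.save`/`torch.load`; the tiny model is a hypothetical stand-in:

```python
import torch


class TinyModel(torch.nn.Module):  # hypothetical stand-in model
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(x) * 2


model = TinyModel().eval()
example_inputs = (torch.randn(1, 3, 64, 64),)

# Capture the graph with torch.export and serialize it as a .pt2 artifact.
exported = torch.export.export(model, example_inputs)
torch.export.save(exported, "model.pt2")

# Later / elsewhere: reload the exported program and run it.
reloaded = torch.export.load("model.pt2")
output = reloaded.module()(*example_inputs)
```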
README.md
Outdated
| Artifact Type | Description | | ||
|--------------------|--------------------------------------------------------------------------------------------------------------------------| | ||
| `torch.compile` | A model artifact obtained by [`torch.compile`](https://pytorch.org/tutorials/intermediate/torch_compile_tutorial.html). | | ||
| `torch.jit.script` | A model artifact obtained by [`TorchScript`](https://pytorch.org/docs/stable/jit.html). | |
Suggested change:
| `torch.jit.script` | A model artifact obtained by [`TorchScript`](https://pytorch.org/docs/stable/jit.html). |
| `torchscript` | A model artifact obtained by [`TorchScript Scripting`](https://pytorch.org/docs/stable/jit.html) and/or [`TorchScript Tracing`](https://pytorch.org/docs/stable/generated/torch.jit.trace.html). |

There are two types of graph capture in TorchScript, tracing and scripting. I think we can either enumerate both or leave only one option. Either traced or scripted models can be loaded the same way, so I favor a single field for both of them (see the sketch below). Somewhat confusingly, both can also be used together, though this is not common: https://ppwwyyxx.com/blog/2022/TorchScript-Tracing-vs-Scripting/
Since TorchScript is being phased out in favor of AOTInductor, I think we shouldn't make this too complex and should only provide them as one field.
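For illustration, a short sketch of the two capture modes on a generic toy module (not the actual example model): both produce an artifact that is loaded the same way with `torch.jit.load`, which is the argument for a single `torchscript` value.

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, kernel_size=3, padding=1),
    torch.nn.ReLU(),
).eval()
example_input = torch.randn(1, 3, 64, 64)

# Tracing: records the operations executed for the example input.
traced = torch.jit.trace(model, example_input)
traced.save("model_traced.pt")

# Scripting: compiles the module itself, preserving Python control flow.
scripted = torch.jit.script(model)
scripted.save("model_scripted.pt")

# Either artifact is consumed identically at inference time.
reloaded = torch.jit.load("model_traced.pt")
with torch.no_grad():
    out = reloaded(example_input)
```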
Similar to my other comment about the export, I'm not 100% convinced about this; I have somewhat mixed feelings about it.
I find that torchscript might just make it more ambiguous by not being explicit (about the actual function) whether it refers to torch.jit.trace or torch.jit.script, but then, it might also just be a technicality about PyTorch that we should omit for MLM, since this table is already extremely PyTorch-opinionated.
Also, I would like to point out that, even if things are being phased out, it does not mean we can ignore them. Some code/artifacts exist already, and users might want to represent them with the corresponding old technology.
@rbavery I also thought a bit more about the media-type for the artifact, and I think it is better to guide users in a single direction than toward multiple combinations.
Almost ready to merge!
Here's a Python package for validation, import, and export of the STAC model metadata standard. I'm thinking this should be named stac-model on PyPI?
This uses the old model metadata standard I have been using and ports my pydantic code. I'll update this to get it in line with the DLM spec, and then it'll be ready for a code review.
I've linked everything in the README assuming this will be transferred to the stac-extensions repo pretty soon, which I think will help with visibility and adoption.
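As an aside, a purely hypothetical sketch (not the actual stac_model API; the class and field names below are illustrative only) of how pydantic-based validation and (de)serialization of this kind of metadata can look, assuming pydantic v2:

```python
from pydantic import BaseModel, ValidationError


class MLModelMetadata(BaseModel):  # hypothetical, for illustration only
    name: str
    framework: str
    tasks: list[str]


payload = {
    "name": "resnet18_sentinel2_all_moco",
    "framework": "PyTorch",
    "tasks": ["classification"],
}

try:
    metadata = MLModelMetadata(**payload)      # import / validation
    print(metadata.model_dump_json(indent=2))  # export back to JSON
except ValidationError as err:
    print(err)
```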
cc @fmigneault-crim