Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Annotator Trainability #244

Open
wants to merge 28 commits into
base: main
Choose a base branch
from

Conversation

boyleconnor
Copy link
Contributor

My first attempt at creating endpoints / schemas for:

  1. creating trained model files
  2. loading those trained model files

as well as reporting on the status of these two actions.

@boyleconnor boyleconnor marked this pull request as draft June 23, 2021 06:04
@boyleconnor boyleconnor marked this pull request as ready for review June 23, 2021 07:03
@boyleconnor
Copy link
Contributor Author

boyleconnor commented Jun 23, 2021

@tschaffter I can't request a review. Also I'm not sure why the publish workflow is failing and whether I should worry about it. (I don't see why it's trying to publish to gh-pages on a pull request?)

Copy link
Member

@tschaffter tschaffter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More generic comment. Implement training only for the date and phi-annotator for now. Once the proposal is finalized extend to the other tools.

description: Whether the tool implements trainability
type: boolean
default: false
trainedAnnotatorStatus:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The term "annotator" refers to a specific type of tool. Prefer the term "tool" here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not efficient to query the entire Tool object just to know this status. I'd keep Tool for static information, and create a specific endpoint to get training status information with a light response.

loaded:
description: Whether the annotator is using a trained model
type: boolean
dataset:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does this value come from?

description: Information about whether the annotator is using a trained model, and which one it is
properties:
loaded:
description: Whether the annotator is using a trained model
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be more descriptive here. For example, whether the trained model is ready to be used for inference. Also the "loading" status may be more related to inference that "training status". Consider moving this property to an "inference status" schema? What about introducing a ToolStatus - or ToolInferenceStatus - schemas? This "loaded" property could become ToolStatus.initialized = true or ToolStatus.ready = true. Here initialized or ready would be that in general the tool is ready to do what it's meant to do, i.e. generating predictions.

openapi/commons/components/schemas/TrainingStatus.yaml Outdated Show resolved Hide resolved
@@ -0,0 +1,17 @@
get:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename toolTrainingStatus or simply trainingStatus since we won't train anything else than tool.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the endpoint of this path will be /tool/trainingStatus so no need to include tool again in this schema name.

Copy link
Contributor Author

@boyleconnor boyleconnor Jun 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about /tool/training/status ?

Copy link
Contributor Author

@boyleconnor boyleconnor Jun 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other one will be /tool/initialization/ and /tool/initialization/status?

openapi/commons/paths/annotatorTrainingStatus.yaml Outdated Show resolved Hide resolved
@@ -0,0 +1,18 @@
post:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Loading a trained model is a step of an initialization process. What about having a path initializeTraining and initializeInference?

@@ -0,0 +1,16 @@
post:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name it simply train.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The endpoint will be /tool/train

@tschaffter
Copy link
Member

@cascadianblue Please also describe

  • the entire workflow for training and apply (inference) the tool
  • the use case where the tool does not implement a training stage
  • the use case where the tool is started to be used directly for inference
    • when a trained model is available and should be loaded
    • when there is no trained model to load

@tschaffter
Copy link
Member

@cascadianblue Regarding the error in the workflow job CI / publish, it's OK to ignore it.

@boyleconnor
Copy link
Contributor Author

Note to self: add a walkthrough of which endpoints to call and in which order

@tschaffter
Copy link
Member

Thomas: the entire workflow for training and apply (inference) the tool

@cascadianblue Do we have this somewhere? I'll work on the training specification this week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants