natalian98/training-operator

Training Operator

Overview

This repository hosts the Kubernetes Training Operator for Kubeflow training jobs.

Description

The Kubeflow Training Operator provides Kubernetes custom resources for running distributed or non-distributed training jobs, such as TFJobs and PyTorchJobs. The Training Operator in this repository is a Python script that wraps the latest released Kubeflow Training Operator manifests. It provides lifecycle management and handles events (install, upgrade, integrate, remove). It is one of the Charmed Kubeflow operators.
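To make the idea of a training-job custom resource more concrete, here is a minimal sketch of a TFJob manifest built as a plain Python dict. It is not part of this repository: the helper `build_tfjob`, the job name, and the image are hypothetical, and the field layout assumes the upstream `kubeflow.org/v1` TFJob API.

```python
import json


def build_tfjob(name: str, image: str, workers: int = 2) -> dict:
    """Return a TFJob manifest as a plain dict (hypothetical helper, illustrative values)."""
    return {
        # Assumes the upstream Kubeflow v1 API group for training jobs.
        "apiVersion": "kubeflow.org/v1",
        "kind": "TFJob",
        "metadata": {"name": name},
        "spec": {
            "tfReplicaSpecs": {
                "Worker": {
                    "replicas": workers,
                    "restartPolicy": "OnFailure",
                    "template": {
                        "spec": {
                            # Upstream TFJob examples name the training container "tensorflow".
                            "containers": [
                                {"name": "tensorflow", "image": image}
                            ]
                        }
                    },
                }
            }
        },
    }


if __name__ == "__main__":
    # Placeholder name and image; replace with a real training image.
    manifest = build_tfjob("mnist-example", "example.org/mnist:latest")
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and submitted with `kubectl apply -f` on a cluster where the Training Operator's custom resource definitions are installed.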

Usage

While it is possible to deploy the Training Operator as a standalone operator, it works best when deployed alongside other components included in the Kubeflow bundle. For installation steps, please refer to the installation guide.
