Arun Gupta edited this page Jul 18, 2019 · 10 revisions

Welcome to the machine-learning-using-k8s wiki!

Webinar

KubeFlow demo

  • Explain kfapp/aws_config/cluster_config.sh for GPUs
  • Explain kfapp/aws_config/cluster_features.sh for private access, disabling the endpoint, and control-plane/data-plane logging
  • In kfapp/env.sh, explain KUBEFLOW_COMPONENTS and how to disable the ALB and ingress controllers

Jupyter notebook

Single node training

  • Forward local port 8500 to the first Running pod labeled app=mnist in the kubeflow namespace:

    kubectl port-forward -n kubeflow `kubectl get pods -n kubeflow --selector=app=mnist -o jsonpath='{.items[0].metadata.name}' --field-selector=status.phase=Running` 8500:8500
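
The one-liner above fails with a confusing error if no mnist pod is Running yet, which is easy to hit in a live demo. A minimal sketch of a more defensive variant follows. The function names `first_running_pod` and `forward_mnist` are hypothetical helpers introduced here for illustration and are not part of the repository; the namespace, label, and port are taken from the command above.

```shell
# Hypothetical helper (not from the repo): print the name of the first
# Running pod in a namespace that matches a label selector.
first_running_pod() {
  local namespace="$1"
  local selector="$2"
  kubectl get pods -n "$namespace" \
    --selector="$selector" \
    --field-selector=status.phase=Running \
    -o jsonpath='{.items[0].metadata.name}'
}

# Hypothetical helper (not from the repo): forward local port 8500 to the
# mnist pod, stopping early with a clear message if no pod is Running.
forward_mnist() {
  local pod
  pod="$(first_running_pod kubeflow app=mnist)" || return 1
  if [ -z "$pod" ]; then
    echo "no Running pod with label app=mnist in namespace kubeflow" >&2
    return 1
  fi
  kubectl port-forward -n kubeflow "$pod" 8500:8500
}
```

After sourcing these definitions, running `forward_mnist` should behave like the original command when a pod is available, and return a non-zero status otherwise.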
    

Distributed training

Optional

  • TensorBoard
  • Katib
  • Fairing
  • KFServing