Arun Gupta edited this page Jul 19, 2019 · 10 revisions

Welcome to the machine-learning-using-k8s wiki!

Webinar

KubeFlow demo

  • Explain kfapp/aws_config/cluster_config.sh for GPUs
  • Explain kfapp/aws_config/cluster_features.sh for private access, disable endpoint, control/data plane logging
  • In kfapp/env.sh, explain KUBEFLOW_COMPONENTS and disabling of ALB and ingress controllers

Jupyter notebook

  • Do port forward:
    kubectl port-forward svc/centraldashboard -n kubeflow 8080:80
    
  • Access localhost:8080 in a browser to show KubeFlow central dashboard
  • Click on Notebooks
  • Create new server
  • Specify the name
  • Increase the CPU allocation (for faster processing)
  • Spawn server
  • Wait for it to complete
  • Connect
  • Create a new notebook (top right)
  • Select Python3
  • Copy the code from https://github.com/aws-samples/machine-learning-using-k8s/blob/master/samples/mnist/training/tensorflow/mnist.py
  • Change args = parser.parse_args() to args = parser.parse_args(args=[])
  • Run
  • Delete the last two lines
  • Run
  • Show the output
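
The change to parse_args in the steps above is needed because a notebook kernel is started with its own command-line arguments, which argparse would otherwise try to parse. A minimal sketch of the effect, assuming a hypothetical --epochs flag (the real flags in mnist.py may differ):

```python
import argparse

parser = argparse.ArgumentParser()
# Hypothetical flag for illustration only; mnist.py defines its own arguments.
parser.add_argument("--epochs", type=int, default=1)

# Simulate the kind of arguments a notebook kernel receives (assumed values).
# argparse does not recognize them, prints a usage error, and exits.
failed = False
try:
    parser.parse_args(["-f", "kernel.json"])
except SystemExit:
    failed = True

# Passing an empty list makes argparse ignore sys.argv and use the defaults.
args = parser.parse_args(args=[])
print("kernel-style arguments rejected:", failed)
print("epochs:", args.epochs)
```

With args=[] every argument takes its default value, so to change a setting in the notebook, pass it explicitly in the list, for example parser.parse_args(args=["--epochs", "2"]).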

Single node training

  • Do port forward:
    kubectl port-forward -n kubeflow `kubectl get pods -n kubeflow --selector=app=mnist -o jsonpath='{.items[0].metadata.name}' --field-selector=status.phase=Running` 8500:8500
    
  • Run inference:
    python samples/mnist/inference/tensorflow/inference_client.py --endpoint http://localhost:8500/v1/models/mnist:predict
    
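
The endpoint above follows the TensorFlow Serving REST convention, where a predict call is a POST with a JSON body containing an "instances" list. The following is only a sketch of how such a request could be built with the Python standard library; it is not the code in inference_client.py, and the 28x28 input shape and the helper name build_predict_request are assumptions made for illustration.

```python
import json
import urllib.request


def build_predict_request(endpoint, instances):
    """Build a POST request carrying a TensorFlow Serving style JSON body."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


endpoint = "http://localhost:8500/v1/models/mnist:predict"

# Placeholder input: one all-zero 28x28 image. The shape the served model
# actually expects is an assumption here and may differ.
blank_image = [[0.0] * 28 for _ in range(28)]
request = build_predict_request(endpoint, [blank_image])
print(request.get_method(), request.full_url)

# Sending requires the port forward above to be running:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

The request is built but not sent, so the sketch runs without a cluster; uncomment the last lines once the port forward is active.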

Distributed training

Optional

  • TensorBoard
  • Katib
  • Fairing
  • KFServing