Home

Welcome to the machine-learning-using-k8s wiki!

Webinar

KubeFlow demo

Explain kfapp/aws_config/cluster_config.sh for GPUs
Explain kfapp/aws_config/cluster_features.sh for private access, disable endpoint, control/data plane logging
In kfapp/env.sh, explain KUBEFLOW_COMPONENTS and disabling of ALB and ingress controllers

Jupyter notebook

Set up port forwarding to the central dashboard:

kubectl port-forward svc/centraldashboard -n kubeflow 8080:80

Open http://localhost:8080 in a browser to show the KubeFlow central dashboard
Click on Notebooks
Create new server
Specify the name
Increase the CPU allocation (for faster processing)
Spawn server
Wait for it to complete
Connect
Create a new notebook (top right)
Python3
Copy the code from https://github.com/aws-samples/machine-learning-using-k8s/blob/master/samples/mnist/training/tensorflow/mnist.py
Change args = parser.parse_args() to args = parser.parse_args(args=[])
Run
Delete the last two lines
Run
Show the output
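
The `parse_args(args=[])` change in the steps above matters because, inside a notebook, `argparse` would otherwise read the kernel's own command line from `sys.argv` and typically exit on the unrecognized arguments. A minimal sketch of the effect follows. The `--epochs` and `--batch-size` flags are hypothetical stand-ins for illustration, not necessarily the ones mnist.py defines.

```python
import argparse


def build_parser():
    # Hypothetical flags for illustration only; mnist.py defines its own.
    parser = argparse.ArgumentParser(description="notebook-safe argument parsing")
    parser.add_argument("--epochs", type=int, default=1)
    parser.add_argument("--batch-size", type=int, default=64)
    return parser


parser = build_parser()

# Passing an empty list makes argparse ignore sys.argv entirely,
# so every option falls back to its default value.
args = parser.parse_args(args=[])
print(args.epochs, args.batch_size)

# Overrides can still be supplied explicitly as a list of strings.
custom = parser.parse_args(args=["--epochs", "3"])
print(custom.epochs)
```

To try different settings from a notebook cell, pass them in the list rather than editing the defaults in the script.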

Single node training

Set up port forwarding to the running mnist pod:

kubectl port-forward -n kubeflow "$(kubectl get pods -n kubeflow --selector=app=mnist --field-selector=status.phase=Running -o jsonpath='{.items[0].metadata.name}')" 8500:8500

Run inference:

python samples/mnist/inference/tensorflow/inference_client.py --endpoint http://localhost:8500/v1/models/mnist:predict
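
The `:predict` endpoint above follows TensorFlow Serving's REST convention, where the request body is JSON with an `instances` list. As a rough sketch of what such a request looks like, the snippet below builds one without sending it. The helper name `build_predict_request` and the 28x28 all-zero input are assumptions for illustration; the actual inference_client.py and the model's expected input shape may differ.

```python
import json
import urllib.request


def build_predict_request(endpoint, instances):
    """Build (without sending) a POST request in TensorFlow Serving's row format."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed input: a single blank 28x28 grayscale image (hypothetical shape).
blank_image = [[0.0] * 28 for _ in range(28)]
request = build_predict_request(
    "http://localhost:8500/v1/models/mnist:predict", [blank_image]
)

# To send it while the port forward is active, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes it easy to inspect the payload before the port forward is running.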

Distributed training

Optional

TensorBoard
Katib
Fairing
KFServing
