This is a simple tutorial that walks through running a Kaggle experiment on Kubeflow on MicroK8s on Ubuntu. The instructions are highly reproducible, and you'll be able to reuse them for any competition, with the ability to run your experiments locally or in the cloud.
The demo starts with instructions for running everything within GCP, including a shell in which to run all the commands below. All you need is a browser.
At the end of the exercise, you'll end up with a deployment that looks something like this:
NB: This tutorial is inspired by the Kaggle on Kubeflow blog post on kubeflow.org.
This setup relies on scripts that you'll need to run from a shell.
An easy, compute-independent way to get a shell is to launch one in the browser. Google's Cloud Console comes with this mechanism, so we'll start there.
Assuming you've created your Google Cloud account and a project, go to the console and launch the shell:
This will bring up a section at the bottom of the screen, which is your shell:
From your shell, please run the following:
$> git clone https://github.com/canonical-labs/kaggle-kubeflow-tutorial.git
$> cd kaggle-kubeflow-tutorial
These instructions assume you have access to an Ubuntu operating system. If you already have an Ubuntu environment with sufficient memory (at least 16GB), processing power (at least 4 cores), and disk space (at least 100GB free), and you are comfortable installing this software locally, then skip this step and go to step 2.
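If you're not sure whether your existing machine qualifies, a quick check (the thresholds mirror the requirements above):

nproc                # CPU cores; want at least 4
free -g | grep Mem   # total memory in GB; want at least 16
df -h /              # free disk space; want at least 100GB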
Another benefit of a virtual machine is that it gives you a known starting point. There are multiple ways to create a VM: on your laptop if it has enough horsepower, in a public cloud, or in your datacenter.
In this example, we'll use Google's cloud to create the VM. A full set of options and scripts is provided in the gcp directory, including the gcp/README.
The command summary to create a VM in GCP:
$> export GCP_PROJECT=<the Project Id of the project you created>
$> gcp/network_create.sh
$> gcp/compute_create.sh
## NB: copy_scripts.sh will error if the VM isn't ready. Please re-run until it is successful.
$> gcp/copy_scripts.sh
$> gcp/compute_ssh.sh
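You don't need to read the scripts to use them, but for orientation: gcp/compute_create.sh boils down to a gcloud call roughly like the one below. The instance name, zone, and sizes here are illustrative, not the script's exact values.

# Sketch only -- see gcp/compute_create.sh for the real flags.
# n1-standard-4 (4 vCPUs, 15GB RAM) is close to the sizing suggested earlier.
gcloud compute instances create kubeflow-vm \
  --project "$GCP_PROJECT" \
  --zone us-central1-a \
  --machine-type n1-standard-4 \
  --boot-disk-size 100GB \
  --image-family ubuntu-1804-lts \
  --image-project ubuntu-os-cloud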
NB: Experimental versions of Kubeflow don't require a GitHub token. Once that's tested, these instructions will be updated and this step will be removed.
You'll need to generate a GitHub token. You can do this on GitHub:
- You only need to grant access to public repositories (i.e. the public_repo checkbox). A quick way to sanity-check the token is sketched below.
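If you want to confirm the token works before using it, you can query the GitHub API with it. This check is a convenience, not part of the tutorial's scripts; run it anywhere GITHUB_TOKEN is exported (e.g. on the VM after the export in the next step):

# Should print your GitHub profile as JSON; a 401 response means the token is bad.
curl -s -H "Authorization: token $GITHUB_TOKEN" https://api.github.com/user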
The following commands are idempotent: they only install things that are missing. At the end, you'll have a Kubeflow deployment that you can log into. The last command prints the port number of the JupyterHub notebook; combine that with the IP address of your Ubuntu machine (e.g. the external IP address of a GCP VM instance).
:~$ export GITHUB_TOKEN=<your token>
:~$ ./scripts_download.sh
:~$ ./scripts_run.sh
You should see the Jupyter port number at the end:
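If you miss that output, you can recover the port from Kubernetes directly. The JupyterHub service name varies between Kubeflow versions, so the grep below is deliberately loose (this assumes kubectl was configured by the setup scripts; on a bare MicroK8s install, use microk8s.kubectl):

# Find the JupyterHub service and read its NodePort from the PORT(S) column.
:~$ kubectl get svc --all-namespaces | grep -i hub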
NB: If you run into errors, run the cleanup script and try again:
:~$ ./cleanup_k8s.sh
Optionally, you can also expose the Kubernetes dashboard:
:~$ ./expose-dashboard.sh
Running this script on the server will do the following (a rough kubectl equivalent follows the list):
- Run kubectl proxy
- Add an RBAC service account and cluster role binding for access
- Print that account's token (to be used to log in to the dashboard UI)
- Print the URL to use.
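For reference, the steps above map onto standard kubectl commands roughly like this. The account and binding names are illustrative, not the script's actual names:

# Illustrative sketch of what expose-dashboard.sh does; names are made up.
kubectl create serviceaccount dash-admin
kubectl create clusterrolebinding dash-admin-binding \
  --clusterrole=cluster-admin --serviceaccount=default:dash-admin
# Print the account's token (paste this into the dashboard login screen).
# On Kubernetes of this era, the token lives in an auto-created secret.
kubectl get secret $(kubectl get sa dash-admin -o jsonpath='{.secrets[0].name}') \
  -o jsonpath='{.data.token}' | base64 --decode; echo
# Proxy the API server so the dashboard URL is reachable.
kubectl proxy &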
At this point, there is nothing more that you need to do directly on the server. Time to exit the VM.
:~$ exit
NB: This step can take several minutes. In a public cloud, expect around 10 minutes; locally, the time will depend on your network speed and latency.
- If running a VM, go back to your laptop and run:
gcp/compute_list.sh
- Combine the EXTERNAL_IP address with the PORT number printed by the ./scripts_run.sh command, and open the result in your browser:
http://<EXTERNAL_IP>:<PORT>
This next part takes around 10 minutes when running in a public cloud, and possibly much longer locally. The reason is that the Kaggle image is around 21GB; in other words, you are downloading 21GB of data.
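If the spawn times out while the image downloads, one workaround is to pre-pull the image on the VM beforehand so the kubelet finds it in the local cache. The exact command depends on your MicroK8s version's container runtime; microk8s.docker was the Docker-era command:

# Pre-pull the ~21GB notebook image (Docker-era MicroK8s; newer releases use containerd).
:~$ microk8s.docker pull gcr.io/kubeflow-images-public/kaggle-notebook:v20180713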
This is what the screen should look like after you enter these values:
- Kaggle image: gcr.io/kubeflow-images-public/kaggle-notebook:v20180713
- CPU: 4.0
- MEM: 10Gi
The screen:
- You need to join Kaggle and download an API token (an alternative way to install the token is sketched just below).
- You'll need to join the Titanic competition.
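This tutorial passes the Kaggle credentials as environment variables (below). The Kaggle CLI can also read the downloaded token file directly; if you prefer that route, run this once you have a terminal (next section):

# Alternative to the KAGGLE_USERNAME/KAGGLE_KEY exports used below.
jovyan@jupyter-tutorial:~$ mkdir -p ~/.kaggle
# ... copy the kaggle.json from your Kaggle account page into ~/.kaggle, then:
jovyan@jupyter-tutorial:~$ chmod 600 ~/.kaggle/kaggle.json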
Jupyter lets you create a terminal, i.e. a shell, in which you can run commands.
The result is a new Terminal where you can enter the commands in the next section.
Now that you have a terminal on your Jupyter account, you'll need to install the kaggle CLI:
jovyan@jupyter-tutorial:~$ export PYTHONUSERBASE=/home/jovyan/.local
jovyan@jupyter-tutorial:~$ pip install --user kaggle
jovyan@jupyter-tutorial:~$ export PATH=/home/jovyan/.local/bin:$PATH
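A quick check that the CLI landed on your PATH:

jovyan@jupyter-tutorial:~$ which kaggle   # should print /home/jovyan/.local/bin/kaggle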
The Kaggle kernel holds the notebook; after you pull it below, you'll start it from the web browser.
jovyan@jupyter-tutorial:~$ export KAGGLE_USERNAME=<your account>
jovyan@jupyter-tutorial:~$ export KAGGLE_KEY=<your key>
jovyan@jupyter-tutorial:~$ mkdir ~/input; cd ~/input; kaggle competitions download -c titanic
jovyan@jupyter-tutorial:~$ cd ~/work; kaggle kernels pull arthurtok/introduction-to-ensembling-stacking-in-python
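Depending on your Kaggle CLI version, the competition files may arrive as a zip archive rather than as loose CSVs. The notebook expects the files under ~/input, so unpack and verify (train.csv, test.csv, and gender_submission.csv are the Titanic competition's data files):

jovyan@jupyter-tutorial:~$ cd ~/input
jovyan@jupyter-tutorial:~$ unzip -o titanic.zip   # only needed if the download arrived zipped
jovyan@jupyter-tutorial:~$ ls                     # expect train.csv, test.csv, gender_submission.csv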
Open the notebook in the browser, and run all cells.
Here are a few examples of the visualizations that are present in the notebook:
Once you are done exploring, please delete your VM; otherwise you'll continue to incur the costs of a running VM.
gcp/compute_delete.sh
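Like the create script, gcp/compute_delete.sh is a thin wrapper around gcloud; the underlying call is essentially the following (instance name and zone illustrative):

# Roughly what gcp/compute_delete.sh does; --quiet skips the confirmation prompt.
gcloud compute instances delete kubeflow-vm --zone us-central1-a --quiet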
Here is a summary of all the steps, starting from the initial shell:
# From initial shell:
# =======================
git clone https://github.com/canonical-labs/kaggle-kubeflow-tutorial.git
cd kaggle-kubeflow-tutorial
export GCP_PROJECT=<the Project Id of the project you created>
gcp/network_create.sh
gcp/compute_create.sh
gcp/copy_scripts.sh
gcp/compute_ssh.sh
# From the VM:
# ============
export GITHUB_TOKEN=<your token>
./scripts_download.sh
./scripts_run.sh # note the jupyterhub port number
# Next Line is Optional
./expose-dashboard.sh
exit
# From the initial shell:
# =======================
gcp/compute_list.sh # note the external IP address
# From the browser
# ================
http://<EXTERNAL_IP>:<PORT>
# enter any username / password
# <click> Start My Server
# Image: gcr.io/kubeflow-images-public/kaggle-notebook:v20180713 | CPU: 4.0 | MEM: 10Gi
# <click> Spawn
# From jupyter terminal:
# ======================
# <click> New --> Terminal
export PYTHONUSERBASE=/home/jovyan/.local
pip install --user kaggle
export PATH=/home/jovyan/.local/bin:$PATH
export KAGGLE_USERNAME=<your account>
export KAGGLE_KEY=<your key>
mkdir ~/input; cd ~/input; kaggle competitions download -c titanic
cd ~/work; kaggle kernels pull arthurtok/introduction-to-ensembling-stacking-in-python
# <click> Jupyter logo
# <click> work folder
# <click> the notebook introduction-to..
# <click> Cell -> Run All
# View the results and code
# From the initial shell:
# =======================
gcp/compute_delete.sh