Installing Kubeflow Pipelines with Tekton for Development

Note: You can get an all-in-one installation of Kubeflow on IBM Cloud or Minikube, including Kubeflow Pipelines with the Tekton backend, by following the instructions here. To install in development mode, or on top of an existing Kubeflow deployment, follow the instructions below.

Prerequisites

  1. Install Tekton.
    • Minimum version: 0.41.0
  2. Patch the Tekton feature-flags ConfigMap to enable custom tasks for KFP.
    kubectl patch cm feature-flags -n tekton-pipelines \
        -p '{"data":{"enable-custom-tasks": "true"}}'
  3. Clone this repository
    git clone https://github.com/kubeflow/kfp-tekton.git
    cd kfp-tekton
    
  4. Install kubectl client version v1.22.0 or later, which is required to support the new kustomize plugins.
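The kubectl version prerequisite can be checked before going further. The following is a minimal sketch, not part of the project's tooling: `version_ge` and `MIN_KUBECTL` are hypothetical names introduced here, and the comparison assumes a `sort` that supports `-V` (as in GNU coreutils).

```shell
#!/bin/sh
# Succeeds when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

MIN_KUBECTL="1.22.0"  # minimum client version from the prerequisites above

if command -v kubectl >/dev/null 2>&1; then
  # Extract e.g. "1.27.3" from a gitVersion field such as "v1.27.3".
  client="$(kubectl version --client -o json 2>/dev/null \
    | sed -n 's/.*"gitVersion": *"v\([0-9][0-9.]*\).*/\1/p' | head -n 1)"
  if [ -n "$client" ] && version_ge "$client" "$MIN_KUBECTL"; then
    echo "kubectl $client satisfies the minimum of $MIN_KUBECTL"
  else
    echo "kubectl ${client:-unknown} is older than $MIN_KUBECTL or could not be parsed" >&2
  fi
else
  echo "kubectl not found on PATH" >&2
fi
```

Version sort is used instead of a plain string comparison so that, for example, 1.9.0 is correctly treated as older than 1.22.0.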

Install Tekton KFP with pre-built images

  1. If your cluster has an older KFP installation from a previous Kubeflow deployment, remove it along with its webhooks.

    kubectl delete -k manifests/kustomize/env/platform-agnostic
    kubectl delete MutatingWebhookConfiguration cache-webhook-kubeflow
  2. Once the previous KFP deployment is removed, run the commands below to deploy this modified version of Kubeflow Pipelines with the Tekton backend.

    kubectl create ns kubeflow # Skip this if the kubeflow namespace already exists
    kubectl apply -k manifests/kustomize/env/platform-agnostic

    Check the new KFP deployment. It should take about 5 to 10 minutes to come up. Note that cache-deployer-deployment will not run unless KFP is deployed on top of the entire Kubeflow stack with Cert Manager.

    kubectl get pods -n kubeflow

    Once all the pods except Cache deployer are running, run the commands below to expose your KFP UI to a public endpoint.

    kubectl patch svc ml-pipeline-ui -n kubeflow -p '{"spec": {"type": "LoadBalancer"}}'
    kubectl get service ml-pipeline-ui -n kubeflow -o jsonpath='{.status.loadBalancer.ingress[0].ip}'

    Now that you have deployed Kubeflow Pipelines with Tekton, install the KFP-Tekton SDK and follow the KFP Tekton User Guide to start building your own pipelines.
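Because the cache deployer is expected to stay down in a standalone install, scanning the pod list by eye is error-prone. The sketch below filters `kubectl get pods` output for pods that are still not ready while ignoring the cache deployer. `pending_pods` is a hypothetical helper, and the sample pod names are made up for illustration only.

```shell
#!/bin/sh
# Reads `kubectl get pods --no-headers` output on stdin and prints the names
# of pods that are neither Running nor Completed, skipping the cache deployer.
pending_pods() {
  awk '$1 !~ /^cache-deployer/ && $3 != "Running" && $3 != "Completed" { print $1 }'
}

# Illustrative input only; real pod names and counts will differ.
sample='ml-pipeline-6d7f8c9b5-abcde 1/1 Running 0 5m
ml-pipeline-ui-5c4d7b8f9-fghij 0/1 ContainerCreating 0 5m
cache-deployer-deployment-7b9c-klmno 0/1 CrashLoopBackOff 3 5m'
printf '%s\n' "$sample" | pending_pods

# Against a live cluster (not run here):
#   kubectl get pods -n kubeflow --no-headers | pending_pods
```

An empty result means every pod other than the cache deployer is up. If the LoadBalancer IP query above returns nothing (for example on a cluster without an external load balancer), one common alternative is `kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8080:80`, assuming the UI service listens on port 80.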

Development: Building from source code

Prerequisites

  1. NodeJS 14 or above
  2. Golang 1.19 or above
  3. Python 3.7 or above

Frontend

The development instructions are under the frontend directory. Below are the commands for building the frontend docker image.

cd frontend
npm run docker

Backend

The KFP backend with Tekton uses modified versions of the Kubeflow Pipelines API server, persistence agent, and metadata writer.

  1. To build these images, clone this repository under the GOPATH and rename it to pipelines.

    cd $GOPATH/src/go/github.com/kubeflow
    git clone https://github.com/kubeflow/kfp-tekton.git
    mv kfp-tekton pipelines
    cd pipelines
  2. For local binary builds, use the go build commands below.

    go build -o apiserver ./backend/src/apiserver
    go build -o agent ./backend/src/agent/persistence
    go build -o workflow ./backend/src/crd/controller/scheduledworkflow/*.go
    go build -o cache ./backend/src/cache/*.go

    Note: The metadata writer is written in Python, so it needs no local build step; its code is interpreted at runtime.

  3. For Docker builds, use the docker build commands below.

    DOCKER_REGISTRY="<fill in your docker registry here>"
    docker build -t ${DOCKER_REGISTRY}/api-server -f backend/Dockerfile .
    docker build -t ${DOCKER_REGISTRY}/persistenceagent -f backend/Dockerfile.persistenceagent .
    docker build -t ${DOCKER_REGISTRY}/metadata-writer -f backend/metadata_writer/Dockerfile .
    docker build -t ${DOCKER_REGISTRY}/artifact-manager -f backend/artifact_manager/Dockerfile .
    docker build -t ${DOCKER_REGISTRY}/scheduledworkflow -f backend/Dockerfile.scheduledworkflow .
    docker build -t ${DOCKER_REGISTRY}/cache-server -f backend/Dockerfile.cacheserver .
  4. Push the images to your registry and modify the Kustomization to use your own images.

    Modify the newName under the images section in manifests/kustomize/base/kustomization.yaml.

    Now you can follow the Install Tekton KFP with pre-built images instructions to install your own KFP backend.
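The build and push steps above can be scripted as a loop. This is a sketch under stated assumptions: the registry default `registry.example.com/kfp`, the `dev` tag, the `DRY_RUN` switch, and the helper names `run` and `build_and_push_all` are all hypothetical. Only the image names and Dockerfile paths come from the commands above.

```shell
#!/bin/sh
DOCKER_REGISTRY="${DOCKER_REGISTRY:-registry.example.com/kfp}"  # hypothetical registry
IMAGE_TAG="${IMAGE_TAG:-dev}"                                   # hypothetical tag
DRY_RUN="${DRY_RUN:-1}"  # 1 prints the commands; 0 actually runs them

# name=dockerfile pairs taken from the docker build commands above.
IMAGES="
api-server=backend/Dockerfile
persistenceagent=backend/Dockerfile.persistenceagent
metadata-writer=backend/metadata_writer/Dockerfile
artifact-manager=backend/artifact_manager/Dockerfile
scheduledworkflow=backend/Dockerfile.scheduledworkflow
cache-server=backend/Dockerfile.cacheserver
"

# Either echoes a command (dry run) or executes it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Builds and pushes every image, stopping at the first failure.
build_and_push_all() {
  for entry in $IMAGES; do
    name="${entry%%=*}"
    dockerfile="${entry#*=}"
    ref="${DOCKER_REGISTRY}/${name}:${IMAGE_TAG}"
    run docker build -t "$ref" -f "$dockerfile" . || return 1
    run docker push "$ref" || return 1
  done
}

build_and_push_all
```

Run it from the repository root. The dry-run default lets you review the exact image references before anything is built or pushed; set `DRY_RUN=0` once they match the newName values you plan to put in the Kustomization.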

Minikube

Minikube can use your locally built Docker images, so you don't need to push them to a remote registry.

For example, to build the API server image:

docker build -t ml-pipeline-api-server -f backend/Dockerfile .
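For Minikube to see a locally built image, the build usually has to run against the Docker daemon inside the Minikube node rather than the host's. The sketch below uses `minikube docker-env` for that; `use_minikube_docker` is a hypothetical wrapper, and the approach assumes a Minikube setup that exposes a Docker daemon.

```shell
#!/bin/sh
# Points this shell's docker CLI at the Docker daemon inside Minikube, so
# images built afterwards are visible to the cluster without a push.
use_minikube_docker() {
  if ! command -v minikube >/dev/null 2>&1; then
    echo "minikube not found on PATH" >&2
    return 1
  fi
  # Fail early if Minikube is not running instead of evaluating empty output.
  env_cmds="$(minikube docker-env)" || return 1
  eval "$env_cmds"
}

if use_minikube_docker; then
  docker build -t ml-pipeline-api-server -f backend/Dockerfile .
fi
```

The environment change only applies to the current shell session. Kubernetes will also try to pull the image from a registry when a pod's imagePullPolicy is Always, so the locally built image is only used when that policy is IfNotPresent or Never.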

Python based visualizations

Python based visualizations are a new method to visualize results within the Kubeflow Pipelines UI. For more information about Python based visualizations, please visit the documentation page. To create predefined visualizations, please check the developer guide.

Unit test

API server

Run the unit tests for the API server:

cd backend/src/ && go test ./...

Frontend

TODO: add instruction

DSL

pip install ./dsl/ --upgrade && python ./dsl/tests/main.py
pip install ./dsl-compiler/ --upgrade && python ./dsl-compiler/tests/main.py

Integration test & E2E test

E2E tests are run on the IBM Cloud Tekton Toolchain using this Tekton pipeline.

Troubleshooting

Q: How do I access the database directly?

You can inspect the MySQL database directly by running:

kubectl run -it --rm --image=gcr.io/ml-pipeline/mysql:5.6 --restart=Never mysql-client -- mysql -h mysql
mysql> use mlpipeline;
mysql> select * from jobs;

Q: How do I inspect the object store directly?

Minio provides its own UI for inspecting the object store. Forward its port and log in with the credentials below:

kubectl port-forward -n ${NAMESPACE} $(kubectl get pods -l app=minio -o jsonpath='{.items[0].metadata.name}' -n ${NAMESPACE}) 9000:9000
Access Key: minio
Secret Key: minio123

Q: I see an error about exceeding the GitHub rate limit when deploying the system. What can I do?

See the Ksonnet troubleshooting page.

Q: How do I check my API server log?

API server logs are located in the /tmp directory of the pod. To open a shell in the pod, run:

kubectl exec -it -n ${NAMESPACE} $(kubectl get pods -l app=ml-pipeline -o jsonpath='{.items[0].metadata.name}' -n ${NAMESPACE}) -- /bin/sh

or

kubectl logs -n ${NAMESPACE} $(kubectl get pods -l app=ml-pipeline -o jsonpath='{.items[0].metadata.name}' -n ${NAMESPACE})
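The troubleshooting commands above all repeat the same pod lookup by label. A small helper can factor that out. This is a sketch only: `first_pod` is a hypothetical name, and the `kubeflow` default for `NAMESPACE` is an assumption based on the install steps earlier in this guide.

```shell
#!/bin/sh
NAMESPACE="${NAMESPACE:-kubeflow}"  # assumed default; override for other namespaces

# Prints the name of the first pod carrying the label app=<value>.
first_pod() {
  kubectl get pods -n "$NAMESPACE" -l "app=$1" \
    -o jsonpath='{.items[0].metadata.name}'
}

# Usage against a live cluster (not run here):
#   kubectl logs -n "$NAMESPACE" "$(first_pod ml-pipeline)" --tail=100
#   kubectl exec -it -n "$NAMESPACE" "$(first_pod ml-pipeline)" -- /bin/sh
#   kubectl port-forward -n "$NAMESPACE" "$(first_pod minio)" 9000:9000
```

The lookup takes whichever pod is listed first, so with several replicas you may want to name the pod explicitly instead.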

Q: How do I check my cluster status if I am using Minikube?

Minikube provides a dashboard for inspecting the deployment:

minikube dashboard