This repository contains Ansible roles and playbooks for OpenShift, automating the interactions with the OpenShift operators under the responsibility of the Red Hat PSAP (Performance & Latency Sensitive Application Platform) team.
To date, this includes:
- the NVIDIA GPU Operator (most of the repository relates to the deployment, testing and interactions with this operator)
- the Special Resource Operator (deployment and testing currently under development)
- the Node Feature Discovery
- the Node Tuning Operator
See the documentation pages.
Requirements:

- `ansible` (>= 2.9.5), `yq`, `jq`

```
pip3 install yq ansible==2.9.*
dnf install jq
```

- OpenShift Client (`oc`)

```
wget --quiet https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/latest/openshift-client-linux.tar.gz
tar xf openshift-client-linux.tar.gz oc
```

- An OpenShift cluster accessible with `$KUBECONFIG` properly set

```
oc version # fails if the cluster is not reachable
```
The original purpose of this repository was to perform nightly testing of the OpenShift operators under the team's responsibility.
This CI testing is performed by the OpenShift Prow instance. It is controlled by the configuration files located in these directories:
- https://github.com/openshift/release/tree/master/ci-operator/config/openshift-psap/ci-artifacts
- https://github.com/openshift/release/tree/master/ci-operator/jobs/openshift-psap/ci-artifacts
The main configuration is written in the `config` directory, and the `jobs` are then generated with `make ci-operator-config jobs`. Secondary configuration options can then be modified in the `jobs` directory.
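For reference, regenerating the job files after a configuration change follows the usual `openshift/release` workflow. The sketch below is an assumption about a typical local run, with the paths taken from the directories listed above:

```
# in a local clone of github.com/openshift/release
$EDITOR ci-operator/config/openshift-psap/ci-artifacts/    # edit the main configuration
make ci-operator-config jobs                               # regenerate the job files
git diff ci-operator/jobs/openshift-psap/ci-artifacts/     # review the generated jobs
```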
The Prow CI jobs run in an OpenShift Pod. The `ContainerFile` is used to build their base image, and the `run` file (`build/root/usr/local/bin/run`) is used as the entrypoint.
From this entrypoint, we trigger the different high-level tasks of the operator end-to-end testing, e.g.:
```
run gpu-operator test-master-branch
run gpu-operator test-operatorhub
run gpu-operator validate-deployment
run gpu-operator undeploy-operatorhub
run cluster upgrade
```
These high-level tasks rely on the toolbox scripts to automate the deployment of the required dependencies (e.g., the NFD Operator), the deployment of the operator from its published manifest or from its development repository, and its non-regression testing.
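As an illustration, a task like `run gpu-operator test-operatorhub` can be pictured as a chain of toolbox calls similar to the sketch below. The script names come from the toolbox listing further down; the actual sequence is defined by the entrypoint and the Ansible playbooks, so this is only an approximation:

```
# hypothetical sequence, for illustration only
toolbox/nfd-operator/deploy_from_operatorhub.sh     # dependency: NFD Operator
toolbox/gpu-operator/deploy_from_operatorhub.sh     # deploy the GPU Operator from OperatorHub
toolbox/gpu-operator/wait_deployment.sh             # wait for the deployment to settle
toolbox/gpu-operator/run_gpu_burn.sh                # non-regression/stress testing
toolbox/gpu-operator/undeploy_from_operatorhub.sh   # clean up
```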
The artifacts generated during the nightly CI testing are reused to plot a "testing dashboard" that gives an overview of the last days of testing. The generation of this page is performed by the ci-dashboard repository.
Currently, only the GPU Operator results are exposed in this dashboard.
The PSAP Operators Toolbox is a set of tools, originally written for CI automation, but which proved useful in a broader scope. It automates different operations on OpenShift clusters and operators revolving around PSAP activities: entitlement, scale-up of GPU nodes, deployment of the NFD, SRO and NVIDIA GPU Operators, as well as their configuration and troubleshooting.
The functionalities of the toolbox commands are described in the documentation page.
```
$ tree toolbox | grep -v _common
toolbox
├── cluster
│   ├── capture_environment.sh
│   ├── set_scale.sh
│   └── upgrade_to_image.sh
├── entitlement
│   ├── deploy.sh
│   ├── inspect.sh
│   ├── test_cluster.sh
│   ├── test_in_cluster.sh
│   ├── test_in_podman.sh
│   ├── undeploy.sh
│   └── wait.sh
├── gpu-operator
│   ├── capture_deployment_state.sh
│   ├── cleanup_resources.sh
│   ├── deploy_from_commit.sh
│   ├── deploy_from_helm.sh
│   ├── deploy_from_operatorhub.sh
│   ├── diagnose.sh
│   ├── list_version_from_helm.sh
│   ├── list_version_from_operator_hub.sh
│   ├── must-gather.sh
│   ├── run_gpu_burn.sh
│   ├── set_repo-config.sh
│   ├── undeploy_from_commit.sh
│   ├── undeploy_from_helm.sh
│   ├── undeploy_from_operatorhub.sh
│   └── wait_deployment.sh
├── local-ci
│   ├── cleanup.sh
│   └── deploy.sh
├── nfd
│   ├── has_gpu_nodes.sh
│   ├── has_nfd_labels.sh
│   └── wait_nfd_labels.sh
├── nfd-operator
│   ├── deploy_from_operatorhub.sh
│   ├── deploy_from_commit.sh
│   ├── undeploy_from_operatorhub.sh
│   └── wait_gpu_nodes.sh
├── nto
│   └── run_e2e_test.sh
├── repo
│   ├── validate_role_files.py
│   └── validate_role_vars_used.py
└── special-resource-operator
    ├── capture_deployment_state.sh
    ├── deploy_from_commit.sh
    ├── run_e2e_test.sh
    └── undeploy_from_commit.sh
```
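The scripts above can also be run stand-alone against a cluster reachable through `$KUBECONFIG`. The sequence below is an assumed example; check each script's source for its exact arguments (the `set_scale.sh` parameters shown here are illustrative guesses):

```
# assumed stand-alone usage, arguments are illustrative
toolbox/cluster/capture_environment.sh        # record the cluster state
toolbox/entitlement/deploy.sh                 # deploy the cluster entitlement
toolbox/entitlement/wait.sh                   # wait for the entitlement to become usable
toolbox/cluster/set_scale.sh g4dn.xlarge 1    # request a GPU node (instance type/count assumed)
toolbox/nfd/wait_nfd_labels.sh                # wait for the NFD labels to appear
```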