RHOAIENG-16076: tests(gha): run Makefile tests on opendatahub-io/notebooks Github Actions #775

Merged

Conversation

@jiridanek jiridanek commented Nov 22, 2024

https://issues.redhat.com/browse/RHOAIENG-16076

This builds on the preparatory work merged in

Description

Deploys a CRI-O-backed Kubernetes cluster with kubeadm to run our Makefile tests.

I considered using KinD, Microshift, Minikube, OpenShift Local, or some other Kubernetes distribution, but decided against them, because

I need CRI-O because I want the cluster to use the podman-built images directly, without having to copy them. Copying images is difficult because the images can be huge.

Known issues that I do intend to fix

Resolving raw.githubusercontent.com (raw.githubusercontent.com)... failed: Name or service not known.
wget: unable to resolve host address ‘raw.githubusercontent.com’
  • This needs a retry, because it happens intermittently. Also, we should not fetch test data from the main branch of the repo; we should use what is in the current checkout.

This went away after I redid my Kubernetes network. I had these problems with flannel, and I have since switched to a plain bridge with firewall masquerade. I think that when I first noticed this problem (or maybe at some later point), my pods were completely unable to connect to the outside. They were also unable to connect to other pods, such as the coredns pod, but I was dealing with other issues, so I mistakenly took the networking problems for flakiness.
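The retry suggested in the bullet above could look roughly like the following. This is only a sketch in Python, not what the PR implements: the helper name `retry`, its parameters, and the exponential backoff policy are assumptions made for illustration. It relies on the fact that a failed name lookup in Python raises `socket.gaierror`, which is a subclass of `OSError`.

```python
import socket
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(
    operation: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation() until it succeeds or the attempts are used up.

    Hypothetical helper: waits base_delay, 2*base_delay, 4*base_delay, ...
    between attempts and re-raises the last error when all attempts fail.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts, surface the original error
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


def resolve_host(host: str, port: int = 443):
    """Resolve a host name, retrying on transient DNS failures."""
    return retry(lambda: socket.getaddrinfo(host, port))
```

For example, `resolve_host("raw.githubusercontent.com")` would retry the lookup a few times before giving up. As the paragraph above notes, a retry like this would only have hidden the real cause here, which was broken pod networking.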


WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7f95ec277fd0>: Failed to establish a new connection: [Errno -2] Name or service not known')': /simple/papermill/
ERROR: Could not find a version that satisfies the requirement papermill (from versions: none)
Notice:  A new release of pip is available: 24.2 -> 24.3.1
Notice:  To update, run: pip install --upgrade pip
ERROR: No matching distribution found for papermill
command terminated with exit code 1
  • Could it be that a freshly started pod has a spotty network connection and I need to wait?

Looks like the same problem as above. My pods were completely offline. The bridge solution needs no waiting!

Known issues (that I don't intend to fix for this PR)

  • The GHA for a pull request skips intermediate images that have their own tests. Those tests are not run; only the leaf images in the dependency chain have their tests run. (The push GHA tests everything.)
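To make the distinction between leaf and intermediate images concrete, here is a small Python sketch. It is not how the workflow computes its build matrix; the function names and the image names in the usage note are hypothetical. It treats the dependency chain as a mapping from each image to its base image and reports which images nothing else builds on (the leaves), and which tested images would be skipped.

```python
from typing import Dict, Optional, Set


def leaf_images(base_of: Dict[str, Optional[str]]) -> Set[str]:
    """Return images that no other image uses as its base.

    base_of maps an image name to the name of its base image,
    or to None when the image has no base inside the chain.
    """
    used_as_base = {base for base in base_of.values() if base is not None}
    return set(base_of) - used_as_base


def skipped_tested_images(
    base_of: Dict[str, Optional[str]], has_tests: Set[str]
) -> Set[str]:
    """Return intermediate images whose own tests a leaf-only run skips."""
    intermediates = set(base_of) - leaf_images(base_of)
    return intermediates & has_tests
```

With a hypothetical chain `{"base": None, "minimal": "base", "datascience": "minimal", "pytorch": "datascience"}`, only `pytorch` is a leaf, so the tests of `minimal` and `datascience` would not run in a pull request build if those images have tests.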

How Has This Been Tested?

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

@jiridanek jiridanek added the tide/merge-method-squash label Nov 22, 2024
@openshift-ci openshift-ci bot requested review from caponetto and jstourac November 22, 2024 17:08
@@ -46,6 +46,8 @@ jobs:
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}

# region Free up disk space
Member Author

This is the VS Code-compatible syntax for collapsible regions. It lets you collapse parts of the file in IntelliJ and VS Code.
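As an illustration of the marker convention, the Python sketch below scans a file's text for `# region <name>` comments and their closing markers and reports the line range of each region. The closing marker `# endregion` is an assumption here (only the opening marker is visible in the diff above), and the function name `find_regions` is hypothetical; a check like this could be used to catch an unclosed region.

```python
import re
from typing import List, Tuple

# "# region <name>" opens a region; "# endregion" (assumed) closes it.
_START = re.compile(r"^\s*#\s*region\b\s*(.*)$")
_END = re.compile(r"^\s*#\s*endregion\b")


def find_regions(text: str) -> List[Tuple[str, int, int]]:
    """Return (name, first_line, last_line) for each region, 1-indexed.

    Raises ValueError when the markers are not balanced.
    """
    open_regions: List[Tuple[str, int]] = []
    regions: List[Tuple[str, int, int]] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        start = _START.match(line)
        if start:
            open_regions.append((start.group(1).strip(), lineno))
        elif _END.match(line):
            if not open_regions:
                raise ValueError(f"line {lineno}: endregion without region")
            name, first = open_regions.pop()
            regions.append((name, first, lineno))
    if open_regions:
        name, first = open_regions[-1]
        raise ValueError(f"line {first}: region {name!r} is never closed")
    return sorted(regions, key=lambda region: region[1])
```

Nested regions are handled with a stack, so an inner region is reported alongside the outer one that contains it.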


@jiridanek jiridanek marked this pull request as ready for review November 28, 2024 07:04
…stry) and the symbolic links apparently needed to deploy rocm stuff
@jiridanek
Member Author

/test ci/prow/notebook-rocm-jupyter-tf-ubi9-python-3-11-pr-image-mirror
/test ci/prow/rocm-notebooks-e2e-tests

This comment was marked as outdated.

@jiridanek
Member Author

/test ci/prow/notebook-rocm-jupyter-tf-ubi9-python-3-11-pr-image-mirror

This comment was marked as outdated.

@jiridanek

This comment was marked as off-topic.

@jiridanek
Member Author

/test rocm-notebooks-e2e-tests

@jiridanek
Member Author

/test pull-ci-opendatahub-io-notebooks-main-notebook-rocm-jupyter-pyt-ubi9-python-3-11-pr-image-mirror

@jiridanek
Member Author

/test rocm-notebooks-e2e-tests

@atheo89
Member

atheo89 commented Nov 28, 2024

Great work Jiri!
/lgtm

Contributor

openshift-ci bot commented Nov 28, 2024

@jiridanek: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/runtimes-ubi9-e2e-tests | e85aa9e | link | true | /test runtimes-ubi9-e2e-tests |
| ci/prow/notebooks-ubi9-e2e-tests | e85aa9e | link | true | /test notebooks-ubi9-e2e-tests |
| ci/prow/codeserver-notebook-e2e-tests | e85aa9e | link | true | /test codeserver-notebook-e2e-tests |
| ci/prow/notebook-cuda-jupyter-tf-ubi9-python-3-11-pr-image-mirror | e85aa9e | link | true | /test notebook-cuda-jupyter-tf-ubi9-python-3-11-pr-image-mirror |
| ci/prow/notebook-jupyter-pytorch-ubi9-python-3-11-pr-image-mirror | e85aa9e | link | true | /test notebook-jupyter-pytorch-ubi9-python-3-11-pr-image-mirror |
| ci/prow/runtime-intel-tf-ubi9-python-3-11-pr-image-mirror | e85aa9e | link | true | /test runtime-intel-tf-ubi9-python-3-11-pr-image-mirror |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@jiridanek
Member Author

/approve

Contributor

openshift-ci bot commented Nov 28, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jiridanek

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jiridanek
Member Author

/override ci/prow/images

Contributor

openshift-ci bot commented Nov 28, 2024

@jiridanek: Overrode contexts on behalf of jiridanek: ci/prow/images

In response to this:

/override ci/prow/images


This comment was marked as outdated.

@jiridanek
Member Author

/override build (rocm-jupyter-pytorch-ubi9-python-3.9) / build

This comment was marked as outdated.

@jiridanek
Member Author

/override "build (rocm-jupyter-pytorch-ubi9-python-3.9) / build"

Contributor

openshift-ci bot commented Nov 28, 2024

@jiridanek: Overrode contexts on behalf of jiridanek: build (rocm-jupyter-pytorch-ubi9-python-3.9) / build

In response to this:

/override "build (rocm-jupyter-pytorch-ubi9-python-3.9) / build"


@jiridanek

This comment was marked as outdated.

This comment was marked as outdated.

@jiridanek
Member Author

/override "build (rocm-jupyter-tensorflow-ubi9-python-3.11) / build"

Contributor

openshift-ci bot commented Nov 28, 2024

@jiridanek: Overrode contexts on behalf of jiridanek: build (rocm-jupyter-tensorflow-ubi9-python-3.11) / build

In response to this:

/override "build (rocm-jupyter-tensorflow-ubi9-python-3.11) / build"


@jiridanek
Member Author

/override ci/prow/notebook-rocm-jupyter-pyt-ubi9-python-3-11-pr-image-mirror

Contributor

openshift-ci bot commented Nov 28, 2024

@jiridanek: Overrode contexts on behalf of jiridanek: ci/prow/notebook-rocm-jupyter-pyt-ubi9-python-3-11-pr-image-mirror

In response to this:

/override ci/prow/notebook-rocm-jupyter-pyt-ubi9-python-3-11-pr-image-mirror


@jiridanek
Member Author

/override ci/prow/notebook-rocm-jupyter-pyt-ubi9-python-3-9-pr-image-mirror
/override ci/prow/notebook-rocm-jupyter-tf-ubi9-python-3-11-pr-image-mirror
/override ci/prow/notebook-rocm-jupyter-tf-ubi9-python-3-9-pr-image-mirror

Contributor

openshift-ci bot commented Nov 28, 2024

@jiridanek: Overrode contexts on behalf of jiridanek: ci/prow/notebook-rocm-jupyter-pyt-ubi9-python-3-9-pr-image-mirror, ci/prow/notebook-rocm-jupyter-tf-ubi9-python-3-11-pr-image-mirror, ci/prow/notebook-rocm-jupyter-tf-ubi9-python-3-9-pr-image-mirror

In response to this:

/override ci/prow/notebook-rocm-jupyter-pyt-ubi9-python-3-9-pr-image-mirror
/override ci/prow/notebook-rocm-jupyter-tf-ubi9-python-3-11-pr-image-mirror
/override ci/prow/notebook-rocm-jupyter-tf-ubi9-python-3-9-pr-image-mirror


@jiridanek
Member Author

/override ci/prow/rocm-notebooks-e2e-tests

Contributor

openshift-ci bot commented Nov 28, 2024

@jiridanek: Overrode contexts on behalf of jiridanek: ci/prow/rocm-notebooks-e2e-tests

In response to this:

/override ci/prow/rocm-notebooks-e2e-tests


@openshift-merge-bot openshift-merge-bot bot merged commit b3d8af0 into opendatahub-io:main Nov 28, 2024
15 of 17 checks passed
@jiridanek jiridanek deleted the jd_rebased_run_tests_gha branch November 28, 2024 16:29
commonLabels:
app: rocm-jupyter-pytorch-ubi9-python-3-11
app: jupyter-rocm-pytorch-ubi9-python-3-11
Member

Hey, I didn't check: what does this affect? Is it safe? 🤔

Member Author

The tests do not run in GHA without this change, and I did not manage to make them run in openshift-ci, so, yeah, I think it makes things better overall.

jiridanek added a commit to jiridanek/notebooks that referenced this pull request Dec 18, 2024
…ebooks Github Actions (opendatahub-io#775)

* RHOAIENG-16076: tests(gha): run Makefile tests in GitHub Actions

* fixup, looks like I lost the second changed line from opendatahub-io#761 (comment) when merging the work

* fixup, linter wants space in the comments; IntelliJ is ok with it, so let's do that

* fixup, add reference to OpenShift CI for the source of the make invocations

* fixup, the ifNotPresent pull policy (for PR checks without image registry) and the symbolic links apparently needed to deploy rocm stuff

(cherry picked from commit b3d8af0)
jiridanek added a commit to jiridanek/notebooks that referenced this pull request Dec 19, 2024
…books Github Actions (opendatahub-io#775)

* RHOAIENG-16076: tests(gha): run Makefile tests in GitHub Actions

* fixup, looks like I lost the second changed line from opendatahub-io#761 (comment) when merging the work

* fixup, linter wants space in the comments; IntelliJ is ok with it, so let's do that

* fixup, add reference to OpenShift CI for the source of the make invocations

* fixup, the ifNotPresent pull policy (for PR checks without image registry) and the symbolic links apparently needed to deploy rocm stuff

(cherry picked from commit b3d8af0)
Labels
  • approved
  • lgtm
  • tide/merge-method-squash: Denotes a PR that should be squashed by tide when it merges.
  • trivy-scan: This label that allows trivy to create a security report on the pull requests
4 participants