-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Optimize CI with reusable image workflow #4782
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Note Gemini is unable to generate a summary for this pull request due to the file types involved not being currently supported. |
|
Welcome @Deepam02! It looks like this is your first PR to volcano-sh/volcano 🎉 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR optimizes the CI pipeline by introducing a centralized, reusable workflow for Docker image building, eliminating redundant image builds across parallel E2E test jobs. The optimization reduces CI execution time and resource consumption by building all Volcano images once and distributing them as artifacts.
Key Changes:
- Created
images.yamlreusable workflow that builds all four Volcano Docker images once and uploads them as artifacts with 5-day retention - Updated 10 E2E workflow files to depend on the centralized build job and load pre-built images instead of rebuilding
- Modified test execution commands to use the common
hack/run-e2e-kind.shscript with E2E_TYPE parameters
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
.github/workflows/images.yaml |
New reusable workflow that builds and saves all Volcano Docker images as artifacts |
.github/workflows/e2e_vcctl.yaml |
Added build-images dependency, downloads artifacts, and updated test execution to use run-e2e-kind.sh |
.github/workflows/e2e_spark.yaml |
Added build-images dependency, downloads artifacts, modified image loading for minikube environment |
.github/workflows/e2e_sequence.yaml |
Added build-images dependency, downloads artifacts, and updated test execution to use run-e2e-kind.sh |
.github/workflows/e2e_scheduling_basic.yaml |
Added build-images dependency, downloads artifacts, and updated test execution to use run-e2e-kind.sh |
.github/workflows/e2e_scheduling_actions.yaml |
Added build-images dependency, downloads artifacts, and updated test execution to use run-e2e-kind.sh |
.github/workflows/e2e_parallel_jobs.yaml |
Added build-images dependency, downloads artifacts, and updated test execution to use run-e2e-kind.sh |
.github/workflows/e2e_hypernode.yaml |
Added build-images dependency, downloads artifacts, and updated test execution to use run-e2e-kind.sh |
.github/workflows/e2e_dra.yml |
Added build-images dependency, downloads artifacts, and updated test execution to use run-e2e-kind.sh with DRA feature gate |
.github/workflows/e2e_cronjob.yaml |
Added build-images dependency, downloads artifacts, and updated test execution to use run-e2e-kind.sh |
.github/workflows/e2e_admission.yaml |
Added build-images dependency and downloads artifacts for both admission policy and webhook test jobs |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
.github/workflows/e2e_spark.yaml
Outdated
| # Use git SHA as TAG to match pre-built images | ||
| export TAG=$(git rev-parse --verify HEAD) | ||
| make TAG=${TAG} update-development-yaml |
Copilot
AI
Dec 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The TAG is being set here using git rev-parse --verify HEAD, but the workflow still has an earlier step (lines 39-48) that sets TAG from .release-version or defaults to 'latest'. While that earlier step's output is now unused (overridden here), consider removing that obsolete step in a future cleanup to avoid confusion about where TAG comes from.
|
/cc @hajnalmt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the contribution @Deepam02 !
Please find my comments below and let's see this in action if you managed to address them😀
Also please squash your commits, there is a merge commit at end of your commit history. Rebase your local branch on the master and do not merge it directly.
Thank you once more!
| with: | ||
| name: volcano-images | ||
| path: images | ||
| retention-days: 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering if this is maybe not enough, but let's stay with 5, you are probably right.
Signed-off-by: Deepam Goyal <[email protected]>
87843e1 to
4ceb1a3
Compare
|
Hi @hajnalmt |
hajnalmt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/ok-to-test
Thank you for the change!
I had some minor comments still. Please resolve them too, but we can start the testing I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/hold
@Deepam02
As the CI started to run I realized that this current structure is not enough as now every workflow triggers it's own image building.
What we want to achieve is that every workflow shall use the output artifact of the images workflow. This means that we need a separate e2e_test workflow that calls the current e2e workflows one by one (paralelly) after the images workflow.
This also means the modification of the current e2e workflows to be a reusable ones.
so we need to modify the on part to be:
on:
workflow_call:
instead of:
on:
push:
branches:
- master
tags:
pull_request:
And this new e2e_test workflow shall have this trigger instead.
I hope it's not a big a change. I completely missed this in the original specification.
Signed-off-by: Deepam Goyal <[email protected]>
Signed-off-by: Deepam Goyal <[email protected]>
|
I've restructured the workflows as suggested:
CI should be green now. Please review when you have a chance. |
|
/unhold Thank you! Let's see :) |
hajnalmt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/test all
|
@hajnalmt: No presubmit jobs available for volcano-sh/volcano@master DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/retest |
|
/ok-to-test cancel |
|
/cc @JesseStutler Can you trigger testing somehow on this one? We probably need to remove the ok-to-test label and readd it since the workflow structure changed. |
|
/rerun-all |
|
| with: | ||
| go-version: 1.24.x | ||
|
|
||
| - name: Install musl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are currently not using musl at all.
The Dockefiles are not processing the CC variable as arguments so it will use gcc.
scheduler Dockerfile
It's not even passed to the images target:
images target
It was just blindly copied each time a new CI job was introduced I guess.
Probably when Monokaix introduced the custom plugins:
https://github.com/volcano-sh/volcano/tree/master/example/custom-plugin
It was added everywhere for some reason, but It's hard to understand the reason behind this now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, so should we keep the musl if we don't know the reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are not using it, we should delete the step I think. It just occupies the space and CI time for no reason.
| - master | ||
| tags: | ||
| pull_request: | ||
| workflow_call: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I'm not so familiar with workflow_call, so all the CIs can stil be triggered throught the e2e.yaml, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes hopefully, if we do a retrigger with retest it will only retrigger the failed tests too.
But some testing probably needs to be done on this to be sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is there because the master branch triggers them. It should disappear I think after we manage to merge this PR.
|
Seems that all the CIs have triggered but some CIs failed such as |
|
What's more, you still have built all the images in each workflow: |
hajnalmt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently the images are rebuilt because we checkout the code in the e2e_* again. Basically we are checking everything out twice, so all the timestamps differ from the ones in the build_images step. You shall move everything to the e2e.yaml which can be done at the invoking job.
| - master | ||
| tags: | ||
| pull_request: | ||
| workflow_call: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is there because the master branch triggers them. It should disappear I think after we manage to merge this PR.
| tar -xf musl-1.2.1.tar.gz && cd musl-1.2.1 | ||
| ./configure | ||
| make && sudo make install | ||
| - name: Checkout code |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checkout code and Install dependencies can be moved to e2e.yaml too! We don't need them in every e2e test now.
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
|
||
| images: | ||
| for name in controller-manager scheduler webhook-manager agent; do\ | ||
| docker buildx build -t "${IMAGE_PREFIX}/vc-$$name:$(TAG)" . -f ./installer/dockerfile/$$name/Dockerfile --output=type=${BUILDX_OUTPUT_TYPE} --platform ${DOCKER_PLATFORMS} --build-arg APK_MIRROR=${APK_MIRROR} --build-arg OPEN_EULER_IMAGE_TAG=${OPEN_EULER_IMAGE_TAG}; \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively to not rebuild the images, you can add a check here, to not rebuild if the tag already exists.
You can introduce a new variable to force the image build (like FORCE_REBUILD) which is true by default and start the target with false make images FORCE_REBUILD=false in the e2e tests, so you are backwards compatible with the old make target.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this file name be more explicit docker_images.yaml for example ?



What this PR does / why we need it:
This PR optimizes the CI pipeline by extracting Docker image building into a centralized reusable workflow. Previously, each E2E test job built images independently, resulting in redundant builds and wasted resources.
Changes:
images.yamlas a reusable workflow that builds all Volcano Docker images oncedocker save/docker loadcommands for artifact handlingBenefits:
Which issue(s) this PR fixes:
Fixes #4766
Special notes for your reviewer:
The implementation follows GitHub's reusable workflow pattern. All E2E workflows now have a
needs: build-imagesdependency and download artifacts before running tests. The existing test infrastructure remains unchanged - only the image provisioning mechanism has been optimized.