Refactor image builds to put all of the hardware and platform selection logic into one place. #720

cgruver · 2025-02-03T13:04:31Z

This is a proposed refactor of the image build process that I believe will make it easier to maintain as we add more hardware and platform types.

I moved all of the selection logic into a case statement in the main function.

Each image has its own selection in the case.
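
As a rough illustration of the shape described above, here is a minimal sketch of a single `case` statement that maps each image name to its backend flag. The image names come from this PR; the function names `backend_flags` and `main`, and the specific flag chosen for each image, are assumptions made for the example and may not match what the actual script does.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: one case statement holds all per-image selection logic.
# The flag assigned to each image is an assumption for illustration only.

# Print the backend flag assumed for the given image (nothing for ramalama).
backend_flags() {
  local image="$1"
  case "$image" in
    ramalama)     : ;;                        # assumed: no extra backend flag
    rocm)         echo "-DGGML_HIP=ON" ;;
    cuda)         echo "-DGGML_CUDA=ON" ;;
    vulkan|asahi) echo "-DGGML_VULKAN=ON" ;;
    intel-gpu)    echo "-DGGML_SYCL=ON" ;;
    *)
      echo "unknown image: $image" >&2
      return 1
      ;;
  esac
}

main() {
  local image="${1:-ramalama}"
  local flags
  flags="$(backend_flags "$image")" || return 1
  echo "image=$image flags=${flags:-<none>}"
}

main "$@"
```

Run with `cuda` as its only argument, the sketch prints `image=cuda flags=-DGGML_CUDA=ON`. Adding a new hardware type would then mean adding one branch to the `case`.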

Summary by Sourcery

Consolidate hardware and platform selection logic for image builds into a single script.

Enhancements:

Standardize the installation directory to /usr for CUDA and Intel GPU builds.
Update the base image for Asahi builds to quay.io/fedora/fedora:41.

Build:

Introduce a unified build_llama_and_whisper.sh script to manage builds across different hardware and platforms (ramalama, ROCm, CUDA, Vulkan, Asahi, and Intel GPU).
Remove per-platform build scripts.

… one place. Signed-off-by: Charro Gruver <[email protected]>

sourcery-ai · 2025-02-03T13:04:36Z

Reviewer's Guide by Sourcery

This pull request refactors the image build process to consolidate hardware and platform selection logic into a single script, enhancing maintainability and consistency.

Flow diagram of the new build process

graph TD
    Start[Start Build] --> Script[build_llama_and_whisper.sh]
    Script --> SelectHW{Select Hardware Type}
    SelectHW --> InstallDeps[Install Common Dependencies]
    InstallDeps --> HWSpecific{Hardware-Specific Setup}
    HWSpecific --> ConfigCMake[Configure CMake Flags]
    ConfigCMake --> BuildLlama[Build llama.cpp]
    BuildLlama --> BuildWhisper[Build whisper.cpp]
    BuildWhisper --> Cleanup[Cleanup Build Files]
    Cleanup --> End[End Build]

    style SelectHW fill:#f96,stroke:#333,stroke-width:2px
    style HWSpecific fill:#f96,stroke:#333,stroke-width:2px

File-Level Changes

| Change | Details | Files |
| --- | --- | --- |
| Consolidated hardware and platform selection logic into a single `build_llama_and_whisper.sh` script. | Removed per-image build scripts. Centralized build logic in `container_build.sh`. Introduced a unified `build_llama_and_whisper.sh` script to manage builds across different hardware. The `container_build.sh` script now skips the `build_llama_and_whisper.sh` file when iterating through the container images. | `container_build.sh` `container-images/intel-gpu/Containerfile` `container-images/rocm/Containerfile` `container-images/cuda/Containerfile` `container-images/asahi/Containerfile` `container-images/ramalama/Containerfile` `container-images/vulkan/Containerfile` `container-images/build_llama_and_whisper.sh` |
| Standardized the base image and development package installations. | Standardized the base image to `quay.io/fedora/fedora:41` for Intel GPU and Asahi builds. Installed common development packages consistently across all images. Updated the installation path for libraries and binaries to `/usr` for CUDA and Intel GPU builds. | `container-images/intel-gpu/Containerfile` `container-images/rocm/Containerfile` `container-images/cuda/Containerfile` `container-images/asahi/Containerfile` `container-images/ramalama/Containerfile` `container-images/vulkan/Containerfile` `container-images/build_llama_and_whisper.sh` |
| Removed the `scripts` directory from the container image testing process. | Removed the `scripts` directory from the container image testing process. | `container-images/intel-gpu/Containerfile` `container-images/rocm/Containerfile` `container-images/cuda/Containerfile` `container-images/asahi/Containerfile` `container-images/ramalama/Containerfile` `container-images/vulkan/Containerfile` |
| Added an entrypoint script for the intel-gpu container. | Added an entrypoint script for the intel-gpu container. The entrypoint script sets up the environment for the intel-gpu container. | `container-images/intel-gpu/entrypoint.sh` `container-images/intel-gpu/Containerfile` |

cgruver · 2025-02-03T13:05:19Z

Note: Please review, but don't merge until we are sure I did not break any of the container images.

:-)

sourcery-ai

Hey @cgruver - I've reviewed your changes and they look great!

Here's what I looked at during the review

🟢 General issues: all looks good
🟢 Security: all looks good
🟢 Testing: all looks good
🟢 Complexity: all looks good
🟢 Documentation: all looks good

sourcery-ai

Hey @cgruver - I've reviewed your changes - here's some feedback:

Overall Comments:

Consider refactoring the global CMAKE_FLAGS variables into function parameters or a configuration object to improve maintainability and testability.
The common CMake flags could be extracted into helper functions to reduce duplication across the different hardware targets.
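
One possible reading of these two suggestions, sketched under assumptions: a helper receives the *name* of the caller's array and appends the shared flags to it through a bash nameref (`local -n`, which needs bash 4.3 or newer). The helper name `add_common_flags` and the array name `cmake_flags` are hypothetical; the flag values are borrowed from the build script in this PR.

```bash
#!/usr/bin/env bash
# Sketch only: requires bash 4.3+ for namerefs (local -n).

# Append the flags shared by every target to the array whose name is passed in.
# add_common_flags is a hypothetical helper, not part of the PR.
add_common_flags() {
  local -n flags_ref="$1"
  flags_ref+=("-DGGML_CCACHE=OFF" "-DGGML_NATIVE=OFF")
}

main() {
  local cmake_flags=()
  add_common_flags cmake_flags        # pass the array by name, not by value
  cmake_flags+=("-DLLAMA_CURL=ON")    # target-specific flag added by the caller
  printf '%s\n' "${cmake_flags[@]}"
}

main
```

The caller's array must not itself be named `flags_ref`, otherwise the nameref would refer to itself.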

Here's what I looked at during the review

🟢 General issues: all looks good
🟢 Security: all looks good
🟢 Testing: all looks good
🟢 Complexity: all looks good
🟢 Documentation: all looks good

cgruver · 2025-02-03T15:54:00Z

The rocm image ballooned to 15.7GB... I need to make sure I didn't mess something up in that build...
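
One way to narrow down where the extra size comes from is to list the largest files inside the built image. The sketch below is an assumption about how such a check might look, not something taken from this PR: `largest_files` is a hypothetical helper, and it relies on GNU `find` for `-printf`.

```bash
#!/usr/bin/env bash
# Hypothetical diagnostic: list the N largest regular files under a directory.
# Assumes GNU find (for -printf); sizes are reported in bytes.
largest_files() {
  local dir="$1" count="${2:-10}"
  find "$dir" -xdev -type f -printf '%s\t%p\n' | sort -rn | head -n "$count"
}

# Default to /usr/bin, where the build installs its executables.
largest_files "${1:-/usr/bin}" 5
```

Run inside the container, this prints one `size<TAB>path` line per file, largest first.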

ericcurtin · 2025-02-03T19:39:06Z

container-images/build_llama_and_whisper.sh

+
+# Bash does not easily pass arrays as a single arg to a function.  So, make this a global var in the script.
+CMAKE_FLAGS=""
+LLAMA_CPP_CMAKE_FLAGS=""


If we need global variables, could we put local variables in main instead? Trying to keep everything scoped.

The styling changes aren't a blocker for merge, but why change it?

We don't put "function" in front of the functions in any of the other shell scripts in the project, we don't use camelCase in any of the other shell scripts either.

Now this is the odd shell script in terms of consistency 😄

Yeah, I don't want to be the oddball...

ericcurtin · 2025-02-03T19:42:56Z

container-images/build_llama_and_whisper.sh

+  local vulkan_rpms=("vulkan-headers" "vulkan-loader-devel" "vulkan-tools" "spirv-tools" "glslc" "glslang")
+  local intel_rpms=("intel-oneapi-mkl-sycl-devel" "intel-oneapi-dnnl-devel" "intel-oneapi-compiler-dpcpp-cpp" "intel-level-zero" "oneapi-level-zero" "oneapi-level-zero-devel" "intel-compute-runtime")
+
+  LLAMA_CPP_CMAKE_FLAGS+=("-DGGML_CCACHE=OFF" "-DGGML_NATIVE=OFF" "-DBUILD_SHARED_LIBS=NO" "-DLLAMA_CURL=ON")


This isn't right, we want BUILD_SHARED_LIBS for llama.cpp but not whisper.cpp

I think -DBUILD_SHARED_LIBS breaks for the intel-gpu build. I'll double check that.

I bet that's why the rocm image got really big... -DBUILD_SHARED_LIBS=NO. That resulted in some huge executables in /usr/bin...

Yeah, I think you got it, -DBUILD_SHARED_LIBS=NO ends up with the libraries being duplicated in every single binary produced.

The reason we do it for whisper.cpp is it uses some of the same libraries, with the same names, but with different versions sometimes, so one install will replace the library of the other, breaking whichever was installed first.

whisper.cpp is a small project so we statically link that one to fix the issue.
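
The split discussed in this thread could be sketched as two flag arrays built from a shared base: shared libraries for llama.cpp, static linking for whisper.cpp. This is a minimal sketch under assumptions: `print_cmake` is a hypothetical stand-in that prints the command instead of running it, the array names are invented for the example, and applying the same `GGML_*` flags to both projects is assumed rather than confirmed by the PR.

```bash
#!/usr/bin/env bash
# Sketch: keep llama.cpp on shared libraries and link whisper.cpp statically.

# Hypothetical stand-in for the real build step: print the cmake invocation.
print_cmake() {
  local project="$1"
  shift
  echo "cmake -B build -S $project $*"
}

main() {
  # Flags assumed to be common to both projects.
  local common_flags=("-DGGML_CCACHE=OFF" "-DGGML_NATIVE=OFF")
  # llama.cpp: shared libraries, so the libraries are not copied into every binary.
  local llama_flags=("${common_flags[@]}" "-DBUILD_SHARED_LIBS=ON" "-DLLAMA_CURL=ON")
  # whisper.cpp: static, so its libraries cannot overwrite llama.cpp's same-named ones.
  local whisper_flags=("${common_flags[@]}" "-DBUILD_SHARED_LIBS=NO")

  print_cmake llama.cpp "${llama_flags[@]}"
  print_cmake whisper.cpp "${whisper_flags[@]}"
}

main
```

Keeping the `BUILD_SHARED_LIBS` setting out of the common array is the point of the sketch: it is the one flag on which the two projects differ.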

ericcurtin · 2025-02-03T19:44:50Z

container-images/build_llama_and_whisper.sh

+  CMAKE_FLAGS+=(${LLAMA_CPP_CMAKE_FLAGS[@]})
+  cloneAndBuild https://github.com/ggerganov/llama.cpp ${llama_cpp_sha} ${install_prefix}
+  dnf -y clean all
+  rm -rf /var/cache/*dnf* /opt/rocm-*/lib/*/library/*gfx9*


This was where the bulk of the trimming was done for rocm in the past. Not sure why it's ballooned again.

I'd revert this shell script back to its original state and introduce just enough changes to get "intel-gpu" working with this script. There's no need to rewrite the whole script; it's asking for breakages.

It's not making a reviewer's job easy, rewriting the whole thing 😄

Understood. It was a wild hair 🤓

My intent is to see if we can prevent the build script from becoming a Rube Goldberg machine. The more logic we have to drop into places based on different builds, the more unmanageable it may get. IMO, it's already a bit challenging to read. Natural entropy of code.

Not sure my attempt is much easier to read either though...

ericcurtin · 2025-02-03T19:47:02Z

container-images/build_llama_and_whisper.sh

@@ -0,0 +1,113 @@
+#!/usr/bin/env bash
+
+# Bash does not easily pass arrays as a single arg to a function.  So, make this a global var in the script.


We don't need to pass local variables as parameters in bash. They propagate to called functions (bash uses dynamic scoping); they just don't have global scope, if that makes sense.
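
A small sketch of the behaviour described in this comment: an array declared `local` in `main` is readable and writable in the functions `main` calls, yet it is gone once `main` returns. The function and variable names here are hypothetical and chosen only for the example.

```bash
#!/usr/bin/env bash
# Sketch: a local array declared in main is visible in functions called from
# main, but does not exist at global scope.

add_flag() {
  # Modifies the caller's local array; no argument passing of the array needed.
  cmake_flags+=("$1")
}

show_flags() {
  echo "flags: ${cmake_flags[*]}"
}

main() {
  local cmake_flags=("-DGGML_NATIVE=OFF")
  add_flag "-DLLAMA_CURL=ON"
  show_flags
}

main
echo "after main: ${cmake_flags[*]:-unset}"
```

This prints `flags: -DGGML_NATIVE=OFF -DLLAMA_CURL=ON` followed by `after main: unset`.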

ericcurtin · 2025-02-03T19:50:03Z

Besides the build_llama_and_whisper.sh file, the changes look fine.

cgruver · 2025-02-04T12:39:27Z

I'm going to dump this one. It was an experiment.

Refactor image builds to put all of the hw/platform switch logic into…

df456ae

… one place. Signed-off-by: Charro Gruver <[email protected]>

cgruver requested review from rhatdan, ericcurtin, bmahabirbu, maxamillion, dougsland, swarajpande5, jhjaggars, slp and engelmi as code owners February 3, 2025 13:04

sourcery-ai bot reviewed Feb 3, 2025

cgruver marked this pull request as draft February 3, 2025 13:18

fix cmake flags for differences between llama.cpp and whisper.cpp

38fc2b4

Signed-off-by: Charro Gruver <[email protected]>

cgruver marked this pull request as ready for review February 3, 2025 14:21

sourcery-ai bot reviewed Feb 3, 2025

modify the order of UBI installs to get the correct package sources

2570261

Signed-off-by: Charro Gruver <[email protected]>

ericcurtin reviewed Feb 3, 2025

cgruver marked this pull request as draft February 3, 2025 21:25

cgruver and others added 2 commits February 3, 2025 21:41

remove from llama.cpp build

52600c1

Signed-off-by: Charro Gruver <[email protected]>

Merge branch 'containers:main' into refactor-image-build

cbe8581

cgruver closed this Feb 4, 2025
