fix(docker): Checkov installation silently fails on docker build in arm64. Workaround till issue will be fixed in checkov itself #635
…on which bumps rustworkx >0.14.0 this adds rust, cargo and keeps gcc to allow source compile for aarch64.
…th checkov bin during build stage
@antm-pp Would it make sense 1) to leave corresponding comments in the Dockerfile for tracking and 2) to try and employ
@yermulnik I am open to direction. I'm not an expert in contributing; I just found a solution and wanted to post it. It looked like the cffi library was doing the same in the Dockerfile, so I adopted the same approach, without TARGETARCH. I can only comment for arm64 on Mac and don't have the facilities to test other variations, although I can see the pipeline builds multi-arch, so I could verify both paths if I update the PR. gcc is installed in the final image anyway; it's only the builder that installs then removes it. Likewise, the significant time impact during build only applies if your arch doesn't have a pre-compiled rustworkx, so I don't think we're causing any issues on amd64 with the current approach. As I said though, happy to take direction.
Oh, I misread the second part of the «The final image is inflated from ~980MB to ~1.68GB presumably by Checkov now being present» sentence and thought the size change was related to rust/cargo being added 🤦🏻 Still, it is odd that Checkov almost doubles the size of the final image 🤔
@antm-pp Please take a look into Dockerfile linter notice and see whether you can have container image built with Checkov if you follow what linter suggests. Thanks for the contribution.
I feel like a nerd now, but by the look of it the list of packages was alphabetically sorted before this change. @antm-pp Sorry I didn't pay attention to this before.
The PR looks good to me though. Hence approved, though awaiting Max's review.
Apologies, obviously the repo has pre-commit for itself; I didn't employ it locally first. Now passing! The instructions for contributing were quite hook heavy and didn't say much about the PR process. I noticed that in the GHA multi-arch build the compile of rust actually failed due to a cargo issue. This didn't happen to me locally. I'm running an explicit linux/arm64 build now to see if it fails for me like it did in the pipeline.
Dockerfile
Outdated
@@ -66,10 +66,10 @@ RUN if [ "$INSTALL_ALL" != "false" ]; then \
 RUN . /.env && \
 if [ "$CHECKOV_VERSION" != "false" ]; then \
 ( \
-apk add --no-cache gcc=~12 libffi-dev=~3 musl-dev=~1; \
+apk add --no-cache gcc=~12 libffi-dev=~3 musl-dev=~1 libgcc=~12 rust=~1 cargo=~1; \
A few questions: Did you try to remove each of these dependencies, build an image, and confirm that this is a minimal setup?
Please add a check for GCC in https://github.com/antonbabenko/pre-commit-terraform/blob/master/.github/.container-structure-test-config.yaml
Also, I think that we need to find a way to run these tests on arm64 too.
Yes, these are the minimal dependencies. gcc was already present in the failing build, as it is used in the same way for the cffi compile. I tried adding only rust, and it errored because cargo was still missing.
The guidance for rustworkx indicates the need for a compiler toolchain including rust and cargo (or to use rustup, a cross-platform installer).
I noticed that when the pre-existing purge of gcc occurred, it caused an exception when running checkov in the build container, so I added libgcc separately to minimise that dependency, and it executes OK (for the version check). Regarding the final image, I noted that gcc in full is already a dependency for the pre-commit hooks. I couldn't see where it's documented which hook that dependency is for. I could add a further comment to highlight that it's at least needed for checkov.
Happy to add a gcc check as requested, although I've not added the gcc dependency in the final image.
Just to note, the linux/arm64 builds have been failing for some time; an example from 2 months ago: https://github.com/antonbabenko/pre-commit-terraform/actions/runs/7183944518/job/19563861227#step:9:741
I'm just running through some tests. It looks like the linux/arm64 build is failing because it can't pull crates.io. I tried one recommended fix, setting the env var CARGO_NET_GIT_FETCH_WITH_CLI=true, which then created a dependency on git (which again is already in the final image but not the builder). I'm trying a run with that, so that darwin/arm64 and linux/arm64 can both compile.
Happy to add a gcc check as requested, although I've not added the gcc dependency in the final image.
Ah, really? I see removal of rust and cargo, but not libgcc. How does it work then 🤔
If there is no package at the end, then there is nothing to test.
Happy to add a gcc check as requested, although I've not added the gcc dependency in the final image.
Ah, really? I see removal of rust and cargo, but not libgcc. How does it work then 🤔
Apologies if I'm missing some bits of context: gcc != libgcc. The former is the compiler collection and the latter is runtime libs only: https://pkgs.alpinelinux.org/package/edge/main/x86/gcc vs https://pkgs.alpinelinux.org/package/edge/main/x86/libgcc
Btw, we can't just add tests for macOS, as Structure Test does not currently support macOS
#636
So I think the issue is the structure of the '||' command for the install. When the rust compiler call fails it generates a false, allowing the other part of the command to run (intended for managing checkov==latest vs checkov==). The second part has its own failure mode that doesn't actually generate an exit 1. So the error gets buried and the build passes.
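The failure mode described here can be reproduced without pip or Rust. The following is a minimal sketch, not the project's Dockerfile: `install_latest`, `install_pinned` and `run_step` are hypothetical stand-ins, and both installers are made to fail on purpose.

```shell
#!/bin/sh
set +e  # a shell-form RUN step runs under `sh -c`, without errexit

# Hypothetical stand-ins for the two `pip3 install` calls; both fail.
install_latest() { echo "latest: can't find Rust compiler"; return 1; }
install_pinned() { echo "pinned checkov==$1: no matching version"; return 1; }

run_step() {
  # `A && B || C` runs C when A is false, but also when B fails.
  [ "$1" = "latest" ] && install_latest || install_pinned "$1"
  # A following command replaces the chain's non-zero status.
  echo "cleanup"
}

if run_step latest; then
  echo "step status: 0"
else
  echo "step status: non-zero"
fi
```

Both installers run and both fail, yet the step reports success because only the status of the last command is returned.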
So all these combinations should be rewritten to vanilla if-then-else statements? @yermulnik
pre-commit-terraform/Dockerfile
Lines 70 to 71 in c29bdb1
[ "$CHECKOV_VERSION" = "latest" ] && pip3 install --no-cache-dir checkov \
|| pip3 install --no-cache-dir checkov==${CHECKOV_VERSION}; \
Oh, yep, that sort of "ternary" in Bash is not vanilla if/else and has these kinds of uncomfortable hiccups 😿
And, yes, these need to be rewritten 😢 Either using vanilla if/else, or like below:
[ "$CHECKOV_VERSION" = "latest" ] && pip3 install --no-cache-dir checkov; \
[ "$CHECKOV_VERSION" != "latest" ] && pip3 install --no-cache-dir checkov==${CHECKOV_VERSION}; \
ps: I probably can try and do that, though I will need help building it and testing resulting images (@MaxymVlasov, that would be super great if you already had that automation so that I can push changes and you test build/run 🤪).
or like below 🤔
[ "$CHECKOV_VERSION" = "latest" ] && CHECKOV_VERSION="" || CHECKOV_VERSION="==${CHECKOV_VERSION}"; \
pip3 install --no-cache-dir checkov${CHECKOV_VERSION}; \
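For comparison, here is a sketch of the plain if/else variant with the failure passed on explicitly. It is only an illustration under stated assumptions: `fake_pip_install` and `install_checkov` are hypothetical names, the fake installer always fails so the error path is visible, and the version `1.2.3` in the usage note is made up.

```shell
#!/bin/sh
set +e  # failures below are expected and handled explicitly

# Hypothetical stand-in for `pip3 install --no-cache-dir "$1"`; always fails.
fake_pip_install() { echo "pip3 install --no-cache-dir $1"; return 1; }

install_checkov() {
  version=$1
  # Only the requirement string depends on the version; no `&& ... ||` chain.
  if [ "$version" = "latest" ]; then
    requirement="checkov"
  else
    requirement="checkov==${version}"
  fi
  # Exactly one install runs, and its failure is returned to the caller.
  fake_pip_install "$requirement" || return 1
  echo "cleanup"
}

install_checkov latest || echo "install failed, step stops here"
```

Calling `install_checkov 1.2.3` takes the other branch and requests `checkov==1.2.3`; in both cases the cleanup line is never reached once the install fails.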
@yermulnik done in this PR. Also, I can confirm that without @antm-pp's rust and cargo changes it fails on arm64, while with them the checks pass.
And now we have prevention of silent fails for checkov - #635 (review)
@antm-pp When you have a chance, it would be great if you could contribute to that documentation from the contributor's point of view 😺
@MaxymVlasov re PR process: Still not sure what the right way is, only which bits I was told were wrong 🤣
@antm-pp That's the default GitHub workflow. We can't provide write access to everyone who has a GH account. That's technically impossible (or at least it was). Branch creation requires write access, AFAIK. If I've missed some new GH feature, please point me to the docs; I wasn't able to quickly find such info. Also, there is a chance that such a feature is available only for Enterprise customers or in private beta.
@MaxymVlasov no problem. My lack of knowledge then; this is literally my first public contribution. I've only worked in private repos before and wasn't aware of the limitation. I wrongly assumed that it was a setting rather than a system limitation. Thanks for the correction.
Yep, forking and PR'ing back is the common approach that lets arbitrary people contribute without write permission in the target repo (which allows way too much). As Max already mentioned, this is the common approach with public repos. This is the contribution how-to by GH: https://docs.github.com/en/get-started/exploring-projects-on-github/contributing-to-a-project
You've hit a drawback of the pre-commit framework as a whole: one can skip it =( So this is more a matter of trusting contributors not to bypass this soft requirement =) ps: hope you're not suffering from all this stuff given it's your first public contribution (which is much better than dozens of others I've seen across different repos lately). Thanks for your effort and time. We do appreciate this 👍🏻
So I am seeing some quite strange behaviour. When I docker buildx (I'm running colima, which runs ubuntu in qemu) with the default (no target args, no --platform) or with --platform linux/aarch64, my original build (just adding rust, cargo) runs fine and completes locally. However, with --platform linux/arm64 the build fails, saying that during the cargo compile it can't pull the crates.io index (it reports this as a network/proxy issue).
In all 3 cases the container reports uname==aarch64, $TARGETOS==linux, $TARGETARCH==arm64. My limited understanding is that aarch64 is an alias of arm64 as a docker platform type. The only guess I can make is that colima or the docker process on my machine somehow creates the container in a more 'native' environment for aarch64, but thinks it needs a custom virtual host with some odd networking for building arm64. Certainly the inside of the container seems to be consistent for both.
Whatever that issue is locally, it was also present in the GHA workflow build of linux/arm64. Adding the git dependency and the env var to use CLI git fetch for cargo seems to give a consistent success for me locally, so I will commit that to this PR to validate it in the GHA workflow.
I've added a simple gcc check. There isn't much detail in Alpine's gcc --version, but I've regexed on 'gcc (Alpine 12.'. That verifies the major version we've pinned, and is locked to the version output string (rather than only matching gcc, which would also match a 'gcc not found' type stdout). I did look to test locally, but I don't have a personal image repo that I can send an image of that size to and pull back with the container test action. I verified the regex with a grep -E inside a build container though.
Also added some notes around the package dependencies (I sort of assumed what the cffi ones are from my read of that compilation). This allows them to be alphabetical, but still makes clear which dependency is for which compilation.
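The regex check can be tried outside a container with grep -E. This is a sketch only: the function name `is_alpine_gcc12` and the sample banner strings are assumptions made up for illustration, not captured output from the image.

```shell
#!/bin/sh
# Succeeds only when the text contains the Alpine gcc 12 banner prefix.
is_alpine_gcc12() {
  printf '%s\n' "$1" | grep -Eq 'gcc \(Alpine 12\.'
}

# Hypothetical sample strings: a matching banner, a shell error that still
# contains the word gcc, and a banner for a different major version.
for sample in \
  "gcc (Alpine 12.2.1) 12.2.1" \
  "sh: gcc: not found" \
  "gcc (Alpine 13.2.1) 13.2.1"
do
  if is_alpine_gcc12 "$sample"; then
    echo "match:    $sample"
  else
    echo "no match: $sample"
  fi
done
```

Matching on the parenthesised prefix rather than on the bare word gcc is what keeps the error message and the other major version from passing.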
After hours of re-building containers taking 15 min at a time, I think the whole aarch64 vs arm64 thing is nonsense (as it should be; they're the same thing). I think crates.io had an outage, and pulling from their GitHub via the env var just changed the source. Although they've reported no issues today, they had an extended outage with symptoms like the ones I saw here (index unavailable) on Feb 14th: https://status.crates.io/
I'm now able to build this current PR successfully with no additions. Could one of you rerun the existing docker build job action? Run 2, as I think it'll compile fine and demonstrate that the same issue has cleared inside the GitHub runner with no further code change.
Checkov silently fails on arm64. Workaround till issue will be fixed in checkov itself.
then pip3 install --no-cache-dir checkov || exit 1; \
else pip3 install --no-cache-dir checkov==${CHECKOV_VERSION} || exit 1; \
exit 1 prevents silent fail of error: can't find Rust compiler
When the rust compiler call fails it generates a false
#635 (comment)
btw, we can't just add comments inside the code about it. Probably, the best place is on L73
Out of interest, why does pip3 in the # Install pre-commit code block (lines 20-23) have no || exit 1 bits? Is it intentional?
Because pre-commit doesn't use rust?
I'd prefer to somehow make the structure tests work for arm64, rather than add exit 1 in every possible place
Because pre-commit doesn't use rust?
That's kind of weird: if pip3 install ... fails (as in returns a non-zero code), then exit the shell with a non-successful code (exit 1). And this is implemented for the checkov install only, which means it's okay not to exit in the same case with, e.g., the pre-commit install by pip. Does pip behave differently if it fails to install pre-commit? 🤔 Doesn't Docker's RUN imply something like set -e? Should we try and add set -e to each RUN to explicitly let Docker RUN exit when any downstream command fails within an if/else statement? 🤔 I'm a bit lost to be honest 😲 We definitely should use one consistent solution for all similar expressions
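On the set -e question: with the default shell, a shell-form RUN hands the command string to `/bin/sh -c` and adds no errexit of its own, so only the last command's status decides whether the step fails. A small sketch of the three cases discussed in this thread, using child shells so each case is isolated (the variable names are only illustrative):

```shell
#!/bin/sh
set +e  # the child shells below are expected to fail in one case

# 1. No errexit: the failure of `false` is discarded by the later command.
plain_out=$(sh -c 'false; echo "reached"'); plain_status=$?

# 2. errexit: the shell stops at the first failing simple command.
errexit_out=$(sh -c 'set -e; false; echo "reached"'); errexit_status=$?

# 3. errexit does not fire for a failure on the left-hand side of `||`.
chain_out=$(sh -c 'set -e; false || echo "fallback"; echo "reached"'); chain_status=$?

echo "plain:   status=$plain_status output=$plain_out"
echo "errexit: status=$errexit_status output=$errexit_out"
echo "chain:   status=$chain_status output=$(echo $chain_out)"
```

The third case is why set -e alone would not have caught the original `&& ... ||` line: the fallback on the right of `||` still runs, and the step only fails if that fallback fails as the last command of the chain.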
I'll rewrite it to bash scripts.
What do you think of a single script that takes args? Like install_deps.sh pre-commit, install_deps.sh checkov, install_deps.sh pre-commit checkov ..., or install_deps.sh ALL? Just to keep code in the same place and have re-usable snippets (like shell functions) 🤔
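To make the single-script idea concrete, here is a rough sketch of what such a dispatcher could look like. Everything in it is hypothetical: the function names, the tool list, and the installers, which only print a line instead of installing anything.

```shell
#!/bin/sh
# Hypothetical per-tool installers; real ones would download or pip-install.
install_pre_commit() { echo "installing pre-commit"; }
install_checkov()    { echo "installing checkov"; }

install_deps() {
  if [ "$#" -eq 0 ]; then
    echo "usage: install_deps ALL | <tool>..." >&2
    return 2
  fi
  # ALL expands to the full (hypothetical) tool list.
  if [ "$1" = "ALL" ]; then
    set -- pre-commit checkov
  fi
  for tool in "$@"; do
    case "$tool" in
      pre-commit) install_pre_commit || return 1 ;;
      checkov)    install_checkov || return 1 ;;
      *) echo "unknown tool: $tool" >&2; return 2 ;;
    esac
  done
}

install_deps pre-commit checkov
```

Each installer's failure is returned immediately, so one failing tool stops the run instead of being masked by the tools after it.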
The problem there is that they are different
Currently, I'm just moving it out and implementing the logic as
COPY tools/install/ /install/
WORKDIR /bin_dir
RUN /install/pre-commit.sh
RUN /install/foo.sh
We can discuss it in next PR :)
The problem there is that they are different
Some are slightly different, whilst others use the same solutions (like extracting download URLs from the GH releases API response). I also find it odd to have a bunch of almost-the-same ten-line shell scripts instead of one that can handle installation of all the deps one-by-one or all-at-once.
On the other hand, such an approach with a single script would negatively impact build caching, and each RUN layer would get rebuilt if the file is updated with a change to the installation steps of a specific dep =(
From this point of view I'd rather stay with the current approach =)
We can discuss it in next PR :)
Makes sense 🤝 (apologies that I already outlined my thoughts; this helps me imprint them in memory 😺)
@MaxymVlasov do not install distributions with pip separately, because it will not take everything installed in the previous session into account when you run a new install command. Enumerate everything and let the dependency resolver know all your requirements together. Ideally, use pip-compile to produce and commit constraint files (lockfiles) and invoke it via pip install -r direct-deps.txt -c constraint.txt.
If you do run separate pip installs, inject a pip check invocation at the end to verify integrity.
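A sketch of how that advice might look as one install step. The file names direct-deps.txt and constraint.txt come from the comment above; the `run` wrapper, the `DRY_RUN` switch and the function name are assumptions added so the snippet can be read and run without touching an environment.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints the commands; DRY_RUN=0 runs them.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

install_python_tools() {
  # One resolver run sees every direct dependency plus the pinned constraints.
  run pip3 install --no-cache-dir -r direct-deps.txt -c constraint.txt || return 1
  # `pip check` reports installed packages with missing or conflicting requirements.
  run pip3 check || return 1
}

install_python_tools
```

With DRY_RUN=0 the function stops at the first failing command, so a broken install or a failed integrity check would fail the build step.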
Probs https://github.com/antonbabenko/pre-commit-terraform/pull/635/files#diff-dd2c0eb6ea5cfc6c4bd4eac30934e2d5746747af48fef6da689e85b752f39557R20-R23 needs || exit 1 bits too? (see comment below: #635 (comment))
ps: approved just in case || exit 1 isn't needed there.
## [1.88.1](v1.88.0...v1.88.1) (2024-03-11)

### Bug Fixes

* **docker:** Checkov installation silently fails on `docker build` in arm64. Workaround till issue will be fixed in `checkov` itself ([#635](#635)) ([f255b05](f255b05))

This PR is included in version 1.88.1 🎉
Description of your changes
Pending a new version of checkov (which has been requested), which bumps rustworkx >0.14.0.
This temporarily adds rust and cargo during the checkov install to allow rustworkx to compile, similar to how gcc etc. are already added to compile cffi for a similar reason (lack of pre-compiled musl aarch64 builds).
Testing removal of all items after compile highlighted a checkov exception due to a missing gcc lib, and therefore the PR keeps gcc installed. It's possible that only the specific lib is needed and the footprint could be much smaller. In hindsight this was in the builder image, not in the final image, so I'm not sure of the reasoning behind trying to tidy up.
The impact of the change:
Installing packages takes about 55 seconds with or without rust/cargo so they're not adding significant time.
Compiling rustworkx takes about 330 seconds (only affecting aarch64 where not pre-compiled, and failing otherwise).
The final image is inflated from ~980MB to ~1.68GB presumably by Checkov now being present. Not sure how this compares to the image size on x86_64.
Fixes #634
Fixes #633
bridgecrewio/checkov#5608
Qiskit/rustworkx#992 (comment)
Qiskit/rustworkx#1008
How can we test changes
I have built this docker image locally on MacOS M2 Max (aarch64) and am successfully able to get Checkov installed. Nothing in these minor changes would be expected to impact the behaviour of a non-arm architecture build.