Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

impr(optimization/refactor): Optimized Debian dockerfiles, Enhanced readability + Enhanced code structure. (@Ilolm) #1948

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

ilolm
Copy link

@ilolm ilolm commented Nov 8, 2024

Description

1.Reduced layeres amount.
2.Enhanced readibility.
3.Enhanced dockerfile sctructure(Moved variable in one place instead of being scattered on the rest of the code). 3.Fixed some bugs such as lowercase 'as'.

Testing done

1. Manually built an edited Dockerfile image successfully.

docker image built --no-cache -t test_feature .

2. Ran the image successfully and checked the web ui.

docker container run --rm -p 8080:8080 -p 50000:50000 test_feature

Submitter checklist

  • Make sure you are opening from a topic/feature/bugfix branch (right side) and not your main branch!
  • Ensure that the pull request title represents the desired changelog entry
  • Please describe what you did
  • Link to relevant issues in GitHub or Jira
  • Link to relevant pull requests, esp. upstream and downstream changes
  • Ensure you have provided tests - that demonstrates feature works or fixes the issue

1.Reduced layeres amount.
2.Enhanced readibility.
3.Enhanced dockerfile sctructure(Moved variable in one place instead of being scattered on the rest of the code).
3.Fixed some bugs such as lowercase 'as'.
@ilolm ilolm requested a review from a team as a code owner November 8, 2024 16:19
@ilolm ilolm changed the title impr(optimization/refactor): Optimized, Enhanced readability + Enhanced code structure. (@Ilolm) impr(optimization/refactor): Optimized Debian dockerfiles, Enhanced readability + Enhanced code structure. (@Ilolm) Nov 8, 2024
@dduportal dduportal self-assigned this Nov 8, 2024
Copy link
Contributor

@dduportal dduportal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this proposal!

A first pass of superficial review:

  • This PR does a LOT of things. It's really hard, as a maintainer committed to have a safe and sustainable image, to review and evaluate all risks. It's not immediate which change solves which problem. Do you mind splitting in different PRs, each one solving a single problem (so easier to review, faster to merge when no discussion is needed)? Or at least open distinct issues explaining the problem and the solving strategy so we could relate and discuss (instead of having what is going to be a really big PR slow to review and merge with tons of discussions)?

  • Grouping ARG and ENV does not change the layering at all, because it's only a set of metadata (docker build does not create any layer with these instructions). Besides, it introduces big downsides in this repository:

    • It breaks our updateclihttps://www.updatecli.io/ manifests system which tracks values => can you change it to, at least, distinct instructions (unless it solves a particular issue)?
    • It breaks the Docker caching when we update only a couple of "ARG + RUN" such as the jenkins-plugin cli. The "scattering" is sometimes a strong functional feature in Dockerfile to benefit from layer caching, while grouping on a single location is a syntactic sugar for code style. => Is there an issue you want to solve other than code style?"
  • Moving the LABEL on top is breaking the values (as it uses ARG values which are defined later in the scopes).

  • Mixed feelings about the COPY instruction in its array form. It does not reduce layers in the context of docker buildx, but it simplifies the copy. It feels a source of confusion as it requires higher skills in Linux + Docker skills. I'm torn between the simplification vs. making it harder to read and contribute

Some changes make a lot of sense:

  • Cleaning up apt after the git-lfs install
  • Having the VOLUME definition as last element makes sense because it avoid mistakes when writing / changing the associated mountpoint
  • Grouping EXPOSE instructions
  • Fixing the lower case keywords (not bugs: it's working as expected, but since Docker now prints warning about this, let's fix it)
  • Moving USER to the end to avoid errors with permissions

Some changes does not make sense for me (as individual maintainer: noit a general rule):

  • Adding echo instructions in the RUN instructions is a Dockerfile code smell. it means you want to locate which instruction causes failure (when failing to build). It means you're trying to create a big layer with a LOT of moving pieces, which defeats the purpose of container images caching. Besides, the "echo" adds instruction making it heavier to read and understand.
  • Adding executable bit on copied shell script should be avoided in favor of calling the shell directly with the script path as argument (personal preference in the context of Dockerfile in which we control the shell so shebang should not matter)
  • The comments should be turned in to documentation: they are not following homogeneous convention and are repeating the code without providing any information

@dduportal
Copy link
Contributor

Additionally:

  • Is there a reason to have done these changes only on the debian images?
  • The linter (hadolint) fails (as per the CI checks) which is a prerequisite

@ilolm
Copy link
Author

ilolm commented Nov 9, 2024

Hi there!
Firstly I highly appreciate your review and notices made about the changes.


This PR does a LOT of things. It's really hard, as a maintainer committed to have a safe and sustainable image, to review and evaluate all risks. It's not immediate which change solves which problem. Do you mind splitting in different PRs, each one solving a single problem (so easier to review, faster to merge when no discussion is needed)? Or at least open distinct issues explaining the problem and the solving strategy so we could relate and discuss (instead of having what is going to be a really big PR slow to review and merge with tons of discussions)?

Okay, I will take this into account.


Grouping ARG and ENV does not change the layering at all, because it's only a set of metadata (docker build does not create any layer with these instructions). Besides, it introduces big downsides in this repository:

The main purpose of this was to made the code structure more readable by moving each piece of code into his own section. However if it breaks the code or that was made specifically for caching purposes, I will modify those things back in my repo.


Moving the LABEL on top is breaking the values (as it uses ARG values which are defined later in the scopes).

Yeah, that's definitely my bad, sorry about that. It's gonna be fixed as well.


Mixed feelings about the COPY instruction in its array form. It does not reduce layers in the context of docker buildx, but it simplifies the copy. It feels a source of confusion as it requires higher skills in Linux + Docker skills. I'm torn between the simplification vs. making it harder to read and contribute

As for me it looks really self-explanatory but if you want, I can change it back as well :)


I also agree with your personnel notices --> so, there are gonna be fixed as well.


Is there a reason to have done these changes only on the debian images?

No, I was going to make the same task with the alpine images later.


The linter (hadolint) fails (as per the CI checks) which is a prerequisite

Yeah, I see. I'm gonna fix it as well

@ilolm ilolm force-pushed the dockerfile_optimization branch from eb6c517 to 2c2ef2c Compare November 9, 2024 10:30
Changes made in both debian Dockerfiles:
1.Moved Variables back to their possitions for caching advantage.
2.Moved needed variables for lables on the top.
3.Split some RUN statements back for caching advantage.
4.Enhanced comments.
@ilolm ilolm force-pushed the dockerfile_optimization branch 2 times, most recently from cf133d0 to d4907ca Compare November 9, 2024 11:09
@ilolm
Copy link
Author

ilolm commented Nov 9, 2024

Hadolint fix is in progress

@ilolm ilolm force-pushed the dockerfile_optimization branch from d4907ca to 8be5443 Compare November 9, 2024 12:47
@ilolm
Copy link
Author

ilolm commented Nov 9, 2024

hanolint couldn't parse combined ENV/ARG statements, that's why it was failing

@ilolm
Copy link
Author

ilolm commented Nov 9, 2024

Summarizing changes:

  • Changed 'as' lowercase to 'AS' uppercase.
  • Combined COPY statements into an array.
  • Added 'Cleaning up' sections in RUN statements containing apt-get operations.
  • Moved LABEL statement on the top with some required variables above.
  • Moved VOLUME in the end.
  • Moved USER in the end before ENTRYPOINT.
  • Combined EXPOSE statements and moved in the end.
  • Modified comments.

@ilolm ilolm requested a review from dduportal November 9, 2024 13:44
@dduportal
Copy link
Contributor

Thanks @ilolm ! I'm a bit busy these time, so expect some delay in the review

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants