[ubuntu-precompiled] enable offline installation of driver packages #222
@@ -48,28 +48,21 @@ RUN if [ -n "${CVE_UPDATES}" ]; then \
rm -rf /var/lib/apt/lists/*; \
fi

# update pkg cache and install pkgs for userspace driver libs
RUN apt-get update && apt-get install -y --download-only --no-install-recommends nvidia-driver-${DRIVER_BRANCH}-server \
Where else was `apt-get remove` or `apt-get clean` run after this step, causing the packages to be removed? `apt-get update` itself doesn't remove downloaded packages.
When I downloaded packages using `apt-get install --download-only` and then ran `apt-get update`, I noticed that the downloaded packages in `/var/cache/apt/archives` were wiped out.
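One way to reproduce this observation is to count the cached `.deb` files before and after `apt-get update`. The sketch below is illustrative only: the helper names `count_debs` and `check_update_keeps_cache` are hypothetical and not part of this PR. If the count drops, it may be worth checking whether the base image ships an apt hook that cleans the cache; this thread does not confirm that as the cause.

```shell
# Count .deb files directly inside an apt archive directory.
# Defaults to the standard apt cache path when no argument is given.
count_debs() {
    dir="${1:-/var/cache/apt/archives}"
    find "$dir" -maxdepth 1 -type f -name '*.deb' | wc -l | tr -d ' '
}

# Report how many cached packages exist before and after `apt-get update`.
# Needs root and network access; intended to be run inside the build container.
check_update_keeps_cache() {
    before="$(count_debs "$@")"
    apt-get update >/dev/null
    after="$(count_debs "$@")"
    echo "cached .deb files: before=$before after=$after"
}
```

Running `check_update_keeps_cache` right after the `--download-only` step shows whether the update call alone empties the cache in a given image.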
nvidia-kernel-source-${DRIVER_BRANCH}-server \
xserver-xorg-video-nvidia-${DRIVER_BRANCH}-server
# Install necessary driver userspace packages
apt-get install -y --no-install-recommends \
These will just install utils like `nvidia-smi`, `nvidia-persistenced`, `nvidia-mps-control`, etc., but not other required libs such as the encoder/decoder/GL/CUDA libs. There are a ton of other user-space packages that we need. Where are those installed as dependencies?
If the encoder/decoder/GL/CUDA libs are required, I can install them. I wasn't aware that they were needed.
I validated the current container with MPS and tests like gpu-burn, and they worked fine.
I've added the other packages. Please check again
1148aa7 to 5617fe0
Signed-off-by: Tariq Ibrahim <[email protected]>
5617fe0 to b22109d
This PR introduces a completely new method of installing the nvidia precompiled driver packages.
Motivation
We recently discovered that our offline installs weren't exactly "offline": at run time, the driver container was still downloading packages from external sources. The root cause was the multiple `apt-get update` calls, which erased the packages previously downloaded by the `apt-get install -y --no-install-recommends --download-only <packages>` command.
Summary of changes:
i) Download the packages and their dependencies using the following commands
ii) Move the packages downloaded by apt-get to a permanent location and gzip them.
iii) Create a local apt package source pointing to the new directory with the downloaded packages.
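The three steps above could be sketched roughly as follows. This is a minimal illustration, not the PR's actual Dockerfile content: the directory `/usr/local/repos/driver-pkgs`, the function names, and the `[trusted=yes]` flat-repository layout are all assumptions, and the gzip step is interpreted here as producing a compressed `Packages.gz` index with `dpkg-scanpackages` (from the `dpkg-dev` package).

```shell
# Assumed locations; neither path is taken from the PR.
LOCAL_REPO_DIR="${LOCAL_REPO_DIR:-/usr/local/repos/driver-pkgs}"
APT_ARCHIVES="${APT_ARCHIVES:-/var/cache/apt/archives}"

# i) Download the given packages and their dependencies without installing them.
download_pkgs() {
    apt-get update
    apt-get install -y --no-install-recommends --download-only "$@"
}

# ii) Copy the downloaded .deb files to a permanent location and build a
#     gzipped package index there (requires dpkg-scanpackages).
stage_pkgs() {
    mkdir -p "$LOCAL_REPO_DIR"
    cp "$APT_ARCHIVES"/*.deb "$LOCAL_REPO_DIR"/
    (cd "$LOCAL_REPO_DIR" && dpkg-scanpackages . /dev/null | gzip -9c > Packages.gz)
}

# iii) Write an apt source entry pointing at the local directory.
#      $1 is the sources file to create, e.g. a file under /etc/apt/sources.list.d/.
write_local_source() {
    echo "deb [trusted=yes] file:$LOCAL_REPO_DIR ./" > "$1"
}
```

With a source entry like this in place and the remote sources disabled, a later `apt-get update` followed by `apt-get install` should resolve the driver packages from the local directory rather than the network.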
NOTE: In this PR, I also move away from installing the giant metapackage `nvidia-drivers-${DRIVER_VERSION}-server` and purging unneeded packages afterwards. Instead, we install only the packages we actually need, which avoids the bloat.