-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inline rootfs diff - DRAFT #24984
base: main
Are you sure you want to change the base?
Inline rootfs diff - DRAFT #24984
Conversation
The Go standard library supports copy_file_range for I/O between files. For most file systems, this is nice speed up, because it avoids copying data. For certain filesystems, like BRTFS and XFS, CoW causes such writes to be metadata-only, speeding them up by 10x or so. In principle, uncompressed tar files can be read and written using copy_file_range, as the file format does not checksum the data (see golang/go#70807). Currently, podman makes this impossible, as the archive package liberally inserts buffers and pipes. This commit marks a couple of places that should be revisited.
* tarAppender did not append to tar files * createTarFile did not create tar files * addTarFile did not add tar files
This function takes a *tar.Writer as argument, and does not close the *tar.Writer. This has the following advantages: * obviates the io.Pipe intermediary, reducing the amount of copying. * error handling is more straightforward. * provides a mechanism to package multiple `srcRoot`s into a single tar stream.
This is the unpacking counterpart to `TarWithOptionsTo`. It lets us unpack a single tar stream in different sections.
This yields a 2x speed up on creating and restoring snapshots, as it avoids creating and extracting the temporary rootfs-diff.tar file. Tested with the following recipe $ cat files.sh #!/bin/sh dd if=/dev/urandom of=test bs=1M count=100 for i in $(seq 1 10); do cp test test$i & done wait rm /usr/bin/znew $ sudo ~/vc/containers/podman/podman run -dt -p 8080:80/tcp docker.io/library/httpd; \ sudo ~/vc/containers/podman/podman container cp files.sh $(sudo ~/vc/containers/podman/podman ps -l --format "{{.ID}}"):files.sh ;\ sudo ~/vc/containers/podman/podman container exec -it -l /bin/sh -x files.sh ;\ (cd ../containers/podman/ && go build -tags "selinux seccomp" ./cmd/podman) && sudo ~/vc/containers/podman/podman container checkpoint -e foo.tar -l && \ sudo ~/vc/containers/podman/podman container restore -i foo.tar --name restored For productionizing, this would need * supporting (de)compression, which I left out for simplicity * a decision on what to do with backward compatibility * similar treatment of the devshm and volumes tar files * a more principled approach to storing multiple sections in a single tar file? * upstreaming changes to the archive package in the storage repo (including tests) Addresses containers#24826
Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: hanwen-flow The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
So the biggest change seems to be in the code that is vendored-in. As you mention in the commits these changes need to be made in the corresponding upstream projects. Is that your plan? |
I wanted to check the plan with you before I go ahead. What do you think of the plan? I also wanted to show an alternative which doesn't require format changes but optimizes only the restore process. That has fewer support/API implications, and might be less controversial for that reason |
This is a draft PR as discussed in #24826, so it's missing some of the regalia (tests, signed off lines etc.) of a production PR