support for zopfli gzip archives #80
Comments
I think switching to zlib-ng (which is much faster, thus allowing you to use a more aggressive profile and still get better performance) or an alternate format such as zstd would be a far superior option at this point. Those libraries would also provide some decompression speed benefits. Zopfli only helps with compression. |
That works for me |
Worth closing in favor of #82? |
The major advantage of |
I'm told that DNF has supported it for a few years (since roughly EL 8.2 & Fedora 30) which means the only gap is on the repo-metadata-generation side. Of course EL7 won't have that option, but looking forward it's not as much of a concern - and in any case EL7 prefers sqlite metadata if available which uses BZ2 compression by default. |
Now that createrepo_c supports |
Basically the only function provided by zopfli is:
Provided the function supports stream-oriented operations, it shouldn't be a problem to use it from createrepo_c. However, I did not study the implementation, and from the comment it's not clear how it reports memory allocation failures. Also, a glance at the queue of open issues at https://github.com/google/zopfli/issues does not suggest a safe project. Regarding the compression ratio, I haven't seen any results. Today Fedora 42 primary.xml:
primary.zst was taken from the repository, primary.gzip was compressed with the "-9" option, and primary.zopfli with the default number of iterations (--15 according to the help output). The zopfli compression took an enormous amount of time, so probably nobody will use more iterations. With these settings, zopfli saves 5 % compared to gzip. I don't think implementing a zopfli backend for gzip compression is worth it. Pat, do you have a better comparison? |
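For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of how a "saves N % compared to gzip" figure can be computed. It uses only the Python standard library, so it compares two gzip levels rather than zopfli itself. The helper names `relative_saving` and `gzip_size` are made up for this illustration, and the sample data is a synthetic stand-in, not real Fedora metadata.

```python
import gzip


def relative_saving(baseline_size: int, candidate_size: int) -> float:
    """Return how much smaller the candidate is than the baseline, in percent."""
    if baseline_size <= 0:
        raise ValueError("baseline_size must be positive")
    return 100.0 * (baseline_size - candidate_size) / baseline_size


def gzip_size(data: bytes, level: int = 9) -> int:
    """Size in bytes of data compressed as a single gzip member."""
    return len(gzip.compress(data, compresslevel=level))


if __name__ == "__main__":
    # Synthetic stand-in for primary.xml (assumption); real repodata will
    # compress differently, so treat the printed numbers as illustrative only.
    sample = b"".join(
        b'<package type="rpm"><name>pkg%d</name></package>\n' % i
        for i in range(20000)
    )
    fast = gzip_size(sample, level=1)
    best = gzip_size(sample, level=9)
    print(f"gzip -1: {fast} bytes, gzip -9: {best} bytes, "
          f"saving {relative_saving(fast, best):.1f} %")
```

To compare against zopfli, the size of a file produced by the zopfli tool could be passed as `candidate_size` with the `gzip -9` size as the baseline.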
You're seeing similar results to what I'm getting. For Scientific Linux's frozen content ( For things that are write once and update never, the savings can be worth the trade off. That being said, if |
10 per cent is an interesting number. Though I worry it wouldn't be applicable to createrepo_c with the current architecture. Currently createrepo_c sends the data for compression in chunks, probably small ones. Good compression needs to scan all the data at once. One would probably need to change createrepo_c's architecture for saving files. But then people with large repositories and not much memory could object that createrepo_c cannot run. So the new architecture would need to support both approaches. That's getting awfully complicated only to improve storage space for end-of-life systems. |
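The two approaches mentioned above can be sketched roughly as follows. This is not createrepo_c's code (which is written in C), only an illustration in Python using the standard `zlib` module. The function names `gzip_stream` and `gzip_whole` are hypothetical. The first compresses chunk by chunk with bounded memory; the second buffers the whole input first, which is what a compressor that must see all the data at once would force on the caller.

```python
import zlib
from typing import Iterable, Iterator


def gzip_stream(chunks: Iterable[bytes], level: int = 9) -> Iterator[bytes]:
    """Compress chunks incrementally; input is never held in memory as a whole."""
    # wbits=31 selects a 32 KiB window plus a gzip header and trailer.
    comp = zlib.compressobj(level, zlib.DEFLATED, 31)
    for chunk in chunks:
        out = comp.compress(chunk)
        if out:
            yield out
    # Emit whatever is still buffered, followed by the gzip trailer.
    yield comp.flush()


def gzip_whole(chunks: Iterable[bytes], level: int = 9) -> bytes:
    """Buffer everything first, as a whole-input compressor would require."""
    data = b"".join(chunks)  # peak memory grows with the full input size
    comp = zlib.compressobj(level, zlib.DEFLATED, 31)
    return comp.compress(data) + comp.flush()
```

Both functions produce a valid gzip stream. The difference that matters for the discussion is memory: `gzip_whole` needs the entire metadata file in RAM, which is the concern raised for large repositories.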
repodata is an ideal candidate for zopfli as it is compressed once and mirrored a lot. The compressed files are stream compatible with gzip, and end users should only notice that the files themselves are smaller.

Can an option be added to createrepo_c to utilize zopfli instead of gzip for .gz files?

https://github.com/google/zopfli
https://koji.fedoraproject.org/koji/packageinfo?packageID=16157