Add support for more digest functions #715

Open. Wants to merge 4 commits into base: master.

AlessandroPatti
Contributor

@AlessandroPatti AlessandroPatti commented Dec 10, 2023

Generalize the cache to support multiple digest functions. This PR updates the remote_execution proto definitions and introduces a new interface, hashing.Hasher, that can be used to compute, validate, and store blobs for a given DigestFunction.

All blobs are stored as <kind>/<digest function>/<path>, <kind> being one of cas, cas.v2, ac or raw, <digest function> being the name of the digest function (e.g. blake3) and <path> being the current blob, sharded by prefix (e.g. f1/f1...). The exception to this rule is sha256, for which we drop the <digest function> component to maintain full backwards compatibility.
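The layout rule above can be sketched as a small Go helper. This is only an illustration, not the PR's code: the function name blobPath is hypothetical, and the two-character shard directory is an assumption read off the "f1/f1..." example.

```go
package main

import (
	"fmt"
	"path"
)

// blobPath is a hypothetical helper that builds the storage key for a blob.
// kind is one of "cas", "cas.v2", "ac" or "raw"; digestFn is the digest
// function name (e.g. "blake3"); hash is the hex-encoded digest.
func blobPath(kind, digestFn, hash string) string {
	// Assumption: blobs are sharded by the first two hex characters.
	shard := hash
	if len(hash) >= 2 {
		shard = hash[:2]
	}
	// sha256 keeps the legacy layout without a <digest function> component.
	if digestFn == "sha256" {
		return path.Join(kind, shard, hash)
	}
	return path.Join(kind, digestFn, shard, hash)
}

func main() {
	// "f1a2b3" is a shortened, made-up hash used only for readability.
	fmt.Println(blobPath("cas", "sha256", "f1a2b3")) // cas/f1/f1a2b3
	fmt.Println(blobPath("cas", "blake3", "f1a2b3")) // cas/blake3/f1/f1a2b3
}
```

Because sha256 blobs keep their existing location, a cache directory written by an older version would still be readable under this scheme.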

This change does not add any new digest functions itself, but they can be added as needed by defining a new type that implements hashing.Hasher and registering it with hashing.register (I tested this with sha1 as well, but did not include it in this PR).
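To make the extension point concrete, here is a minimal sketch of what a pluggable hasher registry could look like. The interface shape, the method names, and the helpers stdHasher, register and validate are all assumptions for illustration; the actual hashing.Hasher in this PR may differ.

```go
package main

import (
	"crypto/sha1"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"hash"
)

// Hasher is a simplified, hypothetical stand-in for the PR's hashing.Hasher.
type Hasher interface {
	Name() string   // digest function name, e.g. "sha256"
	New() hash.Hash // fresh hash state
	Size() int      // digest size in bytes
}

// stdHasher adapts a standard-library hash constructor to Hasher.
type stdHasher struct {
	name  string
	newFn func() hash.Hash
	size  int
}

func (s stdHasher) Name() string   { return s.name }
func (s stdHasher) New() hash.Hash { return s.newFn() }
func (s stdHasher) Size() int      { return s.size }

// registry maps digest function names to their implementations.
var registry = map[string]Hasher{}

func register(h Hasher) { registry[h.Name()] = h }

func init() {
	register(stdHasher{"sha256", sha256.New, sha256.Size})
	register(stdHasher{"sha1", sha1.New, sha1.Size})
}

// validate reports whether data hashes to wantHex under the named function.
func validate(fn string, data []byte, wantHex string) (bool, error) {
	h, ok := registry[fn]
	if !ok {
		return false, fmt.Errorf("unsupported digest function %q", fn)
	}
	if len(wantHex) != hex.EncodedLen(h.Size()) {
		return false, nil // wrong length for this function
	}
	d := h.New()
	d.Write(data)
	return hex.EncodeToString(d.Sum(nil)) == wantHex, nil
}

func main() {
	sum := sha256.Sum256([]byte("hello"))
	ok, err := validate("sha256", []byte("hello"), hex.EncodeToString(sum[:]))
	fmt.Println(ok, err)
}
```

With a registry like this, supporting another function would amount to one more register call plus the hash implementation itself.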

To keep supporting other bazel-remote instances as a backend proxy with the new digest functions, we now additionally set and read the X-Digest-Function header on each request.
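A rough sketch of how such a header could be set on outgoing proxy requests and read on incoming ones, using only net/http. The helper names, the URL layout, the lowercase header value, and the fallback to sha256 when the header is absent are assumptions, not details confirmed by the PR.

```go
package main

import (
	"fmt"
	"net/http"
)

const digestFunctionHeader = "X-Digest-Function"

// digestFunctionFromRequest reads the digest function from a request.
// Assumption: a missing header means sha256, so older peers keep working.
func digestFunctionFromRequest(r *http.Request) string {
	if v := r.Header.Get(digestFunctionHeader); v != "" {
		return v
	}
	return "sha256"
}

// newProxyRequest builds a GET for a blob on an upstream instance and tags
// it with the digest function the blob was stored under.
func newProxyRequest(baseURL, kind, hash, digestFn string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, baseURL+"/"+kind+"/"+hash, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set(digestFunctionHeader, digestFn)
	return req, nil
}

func main() {
	// The host and hash below are placeholders for illustration.
	req, err := newProxyRequest("http://upstream.invalid", "cas", "f1a2b3", "blake3")
	if err != nil {
		panic(err)
	}
	fmt.Println(digestFunctionFromRequest(req))
}
```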

Fixes #710

@AlessandroPatti AlessandroPatti force-pushed the apatti/710/blake3 branch 9 times, most recently from 724429f to 7bae689 Compare December 10, 2023 14:17
@mostynb
Collaborator

mostynb commented Dec 11, 2023

I have a few concerns about landing this feature, but since you have a prototype is it something we can get some benchmark numbers for? How much does switching from sha256 to blake3 improve build times?

@AlessandroPatti
Contributor Author

AlessandroPatti commented Dec 13, 2023

What are your concerns? There's a benchmark that can be run with `bazel run cache/hashing:go_default_test -- --test.bench .`; I'll include my results here:

Linux x86_64
goos: linux
goarch: amd64
cpu: Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz
BenchmarkHashers/1B_BLAKE3-16         	1000000000	         0.0000057 ns/op
BenchmarkHashers/1B_SHA256-16         	1000000000	         0.0000018 ns/op
BenchmarkHashers/2B_BLAKE3-16         	1000000000	         0.0000048 ns/op
BenchmarkHashers/2B_SHA256-16         	1000000000	         0.0000015 ns/op
BenchmarkHashers/4B_BLAKE3-16         	1000000000	         0.0000044 ns/op
BenchmarkHashers/4B_SHA256-16         	1000000000	         0.0000011 ns/op
BenchmarkHashers/8B_BLAKE3-16         	1000000000	         0.0000030 ns/op
BenchmarkHashers/8B_SHA256-16         	1000000000	         0.0000014 ns/op
BenchmarkHashers/16B_BLAKE3-16        	1000000000	         0.0000037 ns/op
BenchmarkHashers/16B_SHA256-16        	1000000000	         0.0000009 ns/op
BenchmarkHashers/32B_BLAKE3-16        	1000000000	         0.0000062 ns/op
BenchmarkHashers/32B_SHA256-16        	1000000000	         0.0000009 ns/op
BenchmarkHashers/64B_BLAKE3-16        	1000000000	         0.0000036 ns/op
BenchmarkHashers/64B_SHA256-16        	1000000000	         0.0000011 ns/op
BenchmarkHashers/128B_BLAKE3-16       	1000000000	         0.0000091 ns/op
BenchmarkHashers/128B_SHA256-16       	1000000000	         0.0000018 ns/op
BenchmarkHashers/256B_BLAKE3-16       	1000000000	         0.0000031 ns/op
BenchmarkHashers/256B_SHA256-16       	1000000000	         0.0000009 ns/op
BenchmarkHashers/512B_BLAKE3-16       	1000000000	         0.0000052 ns/op
BenchmarkHashers/512B_SHA256-16       	1000000000	         0.0000013 ns/op
BenchmarkHashers/1KB_BLAKE3-16        	1000000000	         0.0000058 ns/op
BenchmarkHashers/1KB_SHA256-16        	1000000000	         0.0000019 ns/op
BenchmarkHashers/2KB_BLAKE3-16        	1000000000	         0.0000155 ns/op
BenchmarkHashers/2KB_SHA256-16        	1000000000	         0.0000030 ns/op
BenchmarkHashers/4KB_BLAKE3-16        	1000000000	         0.0000157 ns/op
BenchmarkHashers/4KB_SHA256-16        	1000000000	         0.0000047 ns/op
BenchmarkHashers/8KB_BLAKE3-16        	1000000000	         0.0000235 ns/op
BenchmarkHashers/8KB_SHA256-16        	1000000000	         0.0000073 ns/op
BenchmarkHashers/16KB_BLAKE3-16       	1000000000	         0.0000087 ns/op
BenchmarkHashers/16KB_SHA256-16       	1000000000	         0.0000148 ns/op
BenchmarkHashers/32KB_BLAKE3-16       	1000000000	         0.0000131 ns/op
BenchmarkHashers/32KB_SHA256-16       	1000000000	         0.0000261 ns/op
BenchmarkHashers/64KB_BLAKE3-16       	1000000000	         0.0000561 ns/op
BenchmarkHashers/64KB_SHA256-16       	1000000000	         0.0000554 ns/op
BenchmarkHashers/128KB_BLAKE3-16      	1000000000	         0.0001044 ns/op
BenchmarkHashers/128KB_SHA256-16      	1000000000	         0.0000985 ns/op
BenchmarkHashers/256KB_BLAKE3-16      	1000000000	         0.0002238 ns/op
BenchmarkHashers/256KB_SHA256-16      	1000000000	         0.0001969 ns/op
BenchmarkHashers/512KB_BLAKE3-16      	1000000000	         0.0002102 ns/op
BenchmarkHashers/512KB_SHA256-16      	1000000000	         0.0003972 ns/op
BenchmarkHashers/1MB_BLAKE3-16        	1000000000	         0.0003518 ns/op
BenchmarkHashers/1MB_SHA256-16        	1000000000	         0.0007869 ns/op
BenchmarkHashers/2MB_BLAKE3-16        	1000000000	         0.0006666 ns/op
BenchmarkHashers/2MB_SHA256-16        	1000000000	         0.001581 ns/op
BenchmarkHashers/4MB_BLAKE3-16        	1000000000	         0.001203 ns/op
BenchmarkHashers/4MB_SHA256-16        	1000000000	         0.003166 ns/op
BenchmarkHashers/8MB_BLAKE3-16        	1000000000	         0.002346 ns/op
BenchmarkHashers/8MB_SHA256-16        	1000000000	         0.006283 ns/op
BenchmarkHashers/16MB_BLAKE3-16       	1000000000	         0.004486 ns/op
BenchmarkHashers/16MB_SHA256-16       	1000000000	         0.01260 ns/op
BenchmarkHashers/32MB_BLAKE3-16       	1000000000	         0.009684 ns/op
BenchmarkHashers/32MB_SHA256-16       	1000000000	         0.02538 ns/op
BenchmarkHashers/64MB_BLAKE3-16       	1000000000	         0.02098 ns/op
BenchmarkHashers/64MB_SHA256-16       	1000000000	         0.05074 ns/op
BenchmarkHashers/128MB_BLAKE3-16      	1000000000	         0.04212 ns/op
BenchmarkHashers/128MB_SHA256-16      	1000000000	         0.1015 ns/op
BenchmarkHashers/256MB_BLAKE3-16      	1000000000	         0.08430 ns/op
BenchmarkHashers/256MB_SHA256-16      	1000000000	         0.2031 ns/op
BenchmarkHashers/512MB_BLAKE3-16      	1000000000	         0.1671 ns/op
BenchmarkHashers/512MB_SHA256-16      	1000000000	         0.4062 ns/op
BenchmarkHashers/1GB_BLAKE3-16        	1000000000	         0.3386 ns/op
BenchmarkHashers/1GB_SHA256-16        	1000000000	         0.8117 ns/op
Linux aarch64
goos: linux
goarch: arm64
BenchmarkHashers/1B_BLAKE3-16         	1000000000	         0.0000062 ns/op
BenchmarkHashers/1B_SHA256-16         	1000000000	         0.0000021 ns/op
BenchmarkHashers/2B_BLAKE3-16         	1000000000	         0.0000094 ns/op
BenchmarkHashers/2B_SHA256-16         	1000000000	         0.0000022 ns/op
BenchmarkHashers/4B_BLAKE3-16         	1000000000	         0.0000057 ns/op
BenchmarkHashers/4B_SHA256-16         	1000000000	         0.0000012 ns/op
BenchmarkHashers/8B_BLAKE3-16         	1000000000	         0.0000062 ns/op
BenchmarkHashers/8B_SHA256-16         	1000000000	         0.0000021 ns/op
BenchmarkHashers/16B_BLAKE3-16        	1000000000	         0.0000054 ns/op
BenchmarkHashers/16B_SHA256-16        	1000000000	         0.0000021 ns/op
BenchmarkHashers/32B_BLAKE3-16        	1000000000	         0.0000070 ns/op
BenchmarkHashers/32B_SHA256-16        	1000000000	         0.0000018 ns/op
BenchmarkHashers/64B_BLAKE3-16        	1000000000	         0.0000063 ns/op
BenchmarkHashers/64B_SHA256-16        	1000000000	         0.0000018 ns/op
BenchmarkHashers/128B_BLAKE3-16       	1000000000	         0.0000072 ns/op
BenchmarkHashers/128B_SHA256-16       	1000000000	         0.0000019 ns/op
BenchmarkHashers/256B_BLAKE3-16       	1000000000	         0.0000073 ns/op
BenchmarkHashers/256B_SHA256-16       	1000000000	         0.0000025 ns/op
BenchmarkHashers/512B_BLAKE3-16       	1000000000	         0.0000090 ns/op
BenchmarkHashers/512B_SHA256-16       	1000000000	         0.0000021 ns/op
BenchmarkHashers/1KB_BLAKE3-16        	1000000000	         0.0000100 ns/op
BenchmarkHashers/1KB_SHA256-16        	1000000000	         0.0000023 ns/op
BenchmarkHashers/2KB_BLAKE3-16        	1000000000	         0.0000157 ns/op
BenchmarkHashers/2KB_SHA256-16        	1000000000	         0.0000029 ns/op
BenchmarkHashers/4KB_BLAKE3-16        	1000000000	         0.0000239 ns/op
BenchmarkHashers/4KB_SHA256-16        	1000000000	         0.0000046 ns/op
BenchmarkHashers/8KB_BLAKE3-16        	1000000000	         0.0000366 ns/op
BenchmarkHashers/8KB_SHA256-16        	1000000000	         0.0000064 ns/op
BenchmarkHashers/16KB_BLAKE3-16       	1000000000	         0.0000664 ns/op
BenchmarkHashers/16KB_SHA256-16       	1000000000	         0.0000125 ns/op
BenchmarkHashers/32KB_BLAKE3-16       	1000000000	         0.0001240 ns/op
BenchmarkHashers/32KB_SHA256-16       	1000000000	         0.0000227 ns/op
BenchmarkHashers/64KB_BLAKE3-16       	1000000000	         0.0002486 ns/op
BenchmarkHashers/64KB_SHA256-16       	1000000000	         0.0000433 ns/op
BenchmarkHashers/128KB_BLAKE3-16      	1000000000	         0.0004805 ns/op
BenchmarkHashers/128KB_SHA256-16      	1000000000	         0.0000853 ns/op
BenchmarkHashers/256KB_BLAKE3-16      	1000000000	         0.0009378 ns/op
BenchmarkHashers/256KB_SHA256-16      	1000000000	         0.0001688 ns/op
BenchmarkHashers/512KB_BLAKE3-16      	1000000000	         0.001875 ns/op
BenchmarkHashers/512KB_SHA256-16      	1000000000	         0.0003356 ns/op
BenchmarkHashers/1MB_BLAKE3-16        	1000000000	         0.003737 ns/op
BenchmarkHashers/1MB_SHA256-16        	1000000000	         0.0006701 ns/op
BenchmarkHashers/2MB_BLAKE3-16        	1000000000	         0.007473 ns/op
BenchmarkHashers/2MB_SHA256-16        	1000000000	         0.001350 ns/op
BenchmarkHashers/4MB_BLAKE3-16        	1000000000	         0.01493 ns/op
BenchmarkHashers/4MB_SHA256-16        	1000000000	         0.002690 ns/op
BenchmarkHashers/8MB_BLAKE3-16        	1000000000	         0.02988 ns/op
BenchmarkHashers/8MB_SHA256-16        	1000000000	         0.005363 ns/op
BenchmarkHashers/16MB_BLAKE3-16       	1000000000	         0.05984 ns/op
BenchmarkHashers/16MB_SHA256-16       	1000000000	         0.01073 ns/op
BenchmarkHashers/32MB_BLAKE3-16       	1000000000	         0.1200 ns/op
BenchmarkHashers/32MB_SHA256-16       	1000000000	         0.02146 ns/op
BenchmarkHashers/64MB_BLAKE3-16       	1000000000	         0.2400 ns/op
BenchmarkHashers/64MB_SHA256-16       	1000000000	         0.04291 ns/op
BenchmarkHashers/128MB_BLAKE3-16      	1000000000	         0.4803 ns/op
BenchmarkHashers/128MB_SHA256-16      	1000000000	         0.08592 ns/op
BenchmarkHashers/256MB_BLAKE3-16      	1000000000	         0.9611 ns/op
BenchmarkHashers/256MB_SHA256-16      	1000000000	         0.1717 ns/op
BenchmarkHashers/512MB_BLAKE3-16      	       1	1915870280 ns/op
BenchmarkHashers/512MB_SHA256-16      	1000000000	         0.3437 ns/op
BenchmarkHashers/1GB_BLAKE3-16        	       1	3829768354 ns/op
BenchmarkHashers/1GB_SHA256-16        	1000000000	         0.6874 ns/op
Unsurprisingly, this is terrible on arm, but performs quite well on x86_64. Overall, I still think this is a net improvement:
  1. sha256 is still supported, so there's no regression.
  2. The server can support all hashing algorithms at the same time; it is the client that decides which one to use in its requests. If necessary, we can introduce a flag to allow banning some functions on the server side (e.g. if someone intentionally does not want to support blake3 because they are running on arm).
  3. This opens the door to more digest functions. SHA256TREE is very promising, although not yet available in bazel. Adding it to bazel-remote would be trivial once it is available; the complexity would just be in the implementation of the hashing algorithm itself.
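A note on reading the benchmark output above: a fixed 1000000000 iterations with sub-nanosecond ns/op for inputs up to 1GB is what Go typically reports when a benchmark body does its work once instead of looping b.N times. Under that reading each figure is effectively seconds per run (about 0.34 s for 1GB of BLAKE3 on x86_64), which matches the two arm rows that report 1 iteration and roughly 1.9 s and 3.8 s. This is an inference from the output, not something confirmed in the thread. The sketch below shows the conventional b.N loop with b.SetBytes so throughput is reported directly; it uses only SHA-256 from the standard library, and the helper names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"hash"
	"testing"
)

// benchHasher returns a benchmark that hashes size bytes per iteration.
func benchHasher(newHash func() hash.Hash, size int) func(b *testing.B) {
	return func(b *testing.B) {
		data := make([]byte, size)
		h := newHash()
		b.SetBytes(int64(size)) // record bytes per iteration for throughput
		b.ResetTimer()          // exclude buffer allocation from the timing
		for i := 0; i < b.N; i++ {
			h.Reset()
			h.Write(data)
			h.Sum(nil)
		}
	}
}

// runSHA256 benchmarks SHA-256 over a zero-filled buffer of the given size.
func runSHA256(size int) testing.BenchmarkResult {
	return testing.Benchmark(benchHasher(sha256.New, size))
}

func main() {
	for _, size := range []int{1 << 10, 1 << 20} {
		r := runSHA256(size)
		mbPerSec := float64(r.Bytes) * float64(r.N) / 1e6 / r.T.Seconds()
		fmt.Printf("SHA256 %8d B: %d iterations, %d ns/op, %.1f MB/s\n",
			size, r.N, r.NsPerOp(), mbPerSec)
	}
}
```

A BLAKE3 variant would need a third-party implementation passed in as newHash, which is left out here to keep the example self-contained.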

@mostynb
Collaborator

mostynb commented Dec 14, 2023

> What are your concerns?

I try to be conservative when it comes to changing the cache directory format, because it can cause trouble for people who, e.g., try out a new bazel-remote version and then switch back to a slightly older version. It's something we can do if there is a good reason; it just needs to be thought through pretty well first.

> There's a benchmark that can be run with ...

I was thinking of a more real-world benchmark, like comparing hot and cold cache builds of some appropriately sized project (i.e. not so large that it becomes a pain for us to run) using bazel with blake3 and sha256. What kind of projects does it help with? And by how much?

@AlessandroPatti
Contributor Author

AlessandroPatti commented Dec 23, 2023

@mostynb I've seen the biggest improvements for targets that produce large files, like deployables (fat jars or binaries, docker images). If you can suggest one or two OSS projects to test against, I'll happily run the tests. Alternatively, I can propose landing this change without adding support for blake3, keeping only the part that makes the cache more general with respect to the hashing function used. This will add support for the new DigestFunction fields in the bre protocol, which is a step forward but does not immediately change the folder structure, since sha256 will keep using the cache folder as before. We can then discuss separately whether blake3 is sufficiently better to consider adding support for it. WDYT?

@AlessandroPatti AlessandroPatti force-pushed the apatti/710/blake3 branch 3 times, most recently from 82005ff to ea588a8 Compare December 23, 2023 13:15
@AlessandroPatti AlessandroPatti force-pushed the apatti/710/blake3 branch 2 times, most recently from 302aa4e to 6717579 Compare January 2, 2024 20:50
@mostynb
Collaborator

mostynb commented Jan 3, 2024

> @mostynb I've seen most improvements for targets that produce large files, like deployables (fat jars or binaries, docker images). If you can suggest one or two OSS projects to test against I'll happily run the tests.

This might be difficult, but probably worth the effort. Last time I tried something similar I had trouble finding many open-source projects that used bazel and worked with the latest bazel version. How about bazel itself as the first test?

> Alternatively, I can propose to land this change without adding the support for blake3 and only keep the part that makes the cache more general towards the hashing function used. This will add support for the new DigestFunction fields in the bre protocol, which is a step forward but does not immediately change the folder structure since sha256 will keep using the cache folder as before. We can then discuss if blake3 is sufficiently better to consider adding the support separately. WDYT?

I think we should wait to see how blake3 performs first.

@jackwellsxyz

JC - were you able to run the tests against any OSS projects? Did BLAKE3 end up helping?

@AlessandroPatti
Contributor Author

@jackwellsxyz @mostynb sorry, I've been busy and have not gotten around to running any tests. I'll try to find some time in the next few days.

@AlessandroPatti
Contributor Author

AlessandroPatti commented Mar 5, 2024

Sorry, this took longer than expected. I tried testing it a bit with bazel itself, as @mostynb suggested, and did not notice a substantial difference (just 1-2% faster). I tested both fully uncached builds and fully cached builds. I also tested the case where the outputs are already built locally (i.e. they're in bazel-out/) but the server has to restart, since my understanding is that bazel will then have to compute the hashes of all the local outputs, but in that case too there was no significant difference.

@jackwellsxyz

jackwellsxyz commented May 3, 2024

I think there's still merit in adding this functionality. I'm working on a repo that enforces BLAKE3, so I have to set my bazelrc to explicitly use SHA256. The issue is that I can't use Bazel module lock files, since bazel "updates" all of the BLAKE3 bzlTransitiveDigest hashes to SHA256 ones during the pre-commit checks that use bazel to run things like detekt on files for check-in.

@hunshcn
Contributor

hunshcn commented May 12, 2024

Unfortunately, I only found this PR after I had finished adding sha512 support based on #175. https://github.com/hunshcn/bazel-remote/tree/feat/cas-md5-sha1-sha512

Is there an estimated landing time? I hope to add sha512 support on top of this, because the npm dependencies of rules_js only support sha512.

@AlessandroPatti AlessandroPatti changed the title Add support for blake3 and more digest functions Add support for more digest functions May 13, 2024
@AlessandroPatti
Contributor Author

@mostynb do you think we can land this change to support other digest functions?

@mostynb
Collaborator

mostynb commented May 13, 2024

I am reluctant to land this change in the near future, but having said that I will try to find some time next week to read through this again. The monthly REAPI working group meeting is tomorrow, that will take up a fair bit of my bazel-remote time budget this week.

@hunshcn
Contributor

hunshcn commented Jun 11, 2024

any update?

bazel 7.2.0 has been released.

bazelbuild/bazel#21996

The Remote Asset downloader supports digest functions now.

@hunshcn
Contributor

hunshcn commented Jun 18, 2024

@mostynb

@AlessandroPatti AlessandroPatti force-pushed the apatti/710/blake3 branch 2 times, most recently from 89511a3 to 8986fdd Compare July 2, 2024 18:22
@AlessandroPatti
Contributor Author

@mostynb did you get around to reviewing this? I would love to hear your feedback to get this into a mergeable state.
