-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for more digest functions #715
base: master
Are you sure you want to change the base?
Conversation
724429f
to
7bae689
Compare
I have a few concerns about landing this feature, but since you have a prototype is it something we can get some benchmark numbers for? How much does switching from sha256 to blake3 improve build times? |
What are your concerns? There's a benchmark that can be run with Linux x86_64
Linux aarch64
|
I try to be conservative when it comes to changing the cache directory format, because it can cause trouble for people eg trying out a new bazel-remote version and then switching back to a slightly older version. It's something we can do if there is a good need, it just needs to be thought through pretty well first.
I was more thinking about a more real-world benchmark, like comparing hot and cold cache builds of some appropriately sized project using bazel with blake3 and sha256 (ie not too large that it becomes a pain for us to run). What kind of projects does it help with? And by how much? |
@mostynb I've seen most improvements for targets that produce large files, like deployables (fat jars or binaries, docker images). If you can suggest one or two OSS projects to test against I'll happily run the tests. Alternatively, I can propose to land this change without adding the support for blake3 and only keep the part that makes the cache more general towards the hashing function used. This will add support for the new |
82005ff
to
ea588a8
Compare
302aa4e
to
6717579
Compare
This might be difficult, but probably worth the effort. Last time I tried something similar I had trouble finding many opensource projects that used bazel and worked with the latest bazel version. How about bazel itself as the first test?
I think we should wait to see how blake3 performs first. |
JC - were you able to run the tests against any OSS projects? Did BLAKE3 end up helping? |
@jackwellsxyz @mostynb sorry, I've been busy and I did not get around to run any test. I'll try to find some time in the next few days. |
Sorry, this took longer than expected. I tried testing it a bit with bazel itself as @mostynb suggested and did not notice substantial difference (just 1-2% faster). I tested both fully uncached builds and fully cached builds. I've also tested the case where output are already built locally (i.e. their in bazel-out/) but the server has to restart, as my understanding is that bazel will have to compute the hashes of all the local outputs, but also in that case no significant difference. |
I think there's still logic in adding this functionality imo. I'm working on a repo that enforces BLAKE3 and so I have to set my bazelrc to explicitly use SHA256. Issue there is that I can't use Bazel module lock files since bazel "updates" all of the BLAKE3 bzlTransitiveDigest hashes to SHA256 ones during the pre-commit checks that use bazel to run things like detekt on files for check-in. |
I regret that I found this pr only after I completed the support of sha512 based on #175. https://github.com/hunshcn/bazel-remote/tree/feat/cas-md5-sha1-sha512 Is there an estimated landing time? I hope to increase the support of sha512 based on this. Because the npm dependency of rules_js only supports sha512. |
6717579
to
9c40d67
Compare
9c40d67
to
3e022eb
Compare
@mostynb do you think we can land this change to support other digest functions? |
I am reluctant to land this change in the near future, but having said that I will try to find some time next week to read through this again. The monthly REAPI working group meeting is tomorrow, that will take up a fair bit of my bazel-remote time budget this week. |
any update? bazel 7.2.0 has been released. Remote Asset Downloader support digest func now. |
89511a3
to
8986fdd
Compare
8986fdd
to
8da4749
Compare
@mostynb did you get around to review this? Would love to hear your feedback to get this to a mergable state |
Generalize the cache to support multiple digest functions. This PR updates the remote_execution proto definitions and introduces a new interface
hashing.Hasher
that can be used to compute, validate as store blobs for a givenDigestFunction
.All blobs are stored as
<kind>/<digest function>/<path>
,<kind>
being one ofcas
,cas.v2
,ac
orraw
,<digest function>
being the name of the digest function (e.g.blake3
) and<path>
being the current blob, sharded by prefix (e.g.f1/f1...
). The exception to this rule issha256
, for which we drop the<digest function>
component to maintain full backwards compatibility.This change does not add any additional function, but they can be added as needed by adding a new type that implements
hashing.Hasher
and registering it withhashing.register
(tested withsha1
as well, but did not include it in this PR).In order to keep supporting other instances of bazel remote as backend proxy with the new digest functions, we additionally now set and get the
X-Digest-Function
header in each request.Fixes #710