Action should have automatic cache busting mechanism, or more docs about cache busting #32

Open
sureshjoshi opened this issue Jun 7, 2024 · 0 comments

In the example projects, we have this handy piece of info:

# Note that named_caches and lmdb_store falls back to partial restore keys which
# may give a useful partial result that will save time over completely clean state,
# but will cause the cache entry to grow without bound over time.
# See https://www.pantsbuild.org/2.21/docs/using-pants/using-pants-in-ci for tips on how to periodically clean it up.
# Alternatively you change gha-cache-key to ignore old caches.

And then we have the suggestion to use this action, and instructions about manual usage and a cache nuke function: https://www.pantsbuild.org/2.21/docs/using-pants/using-pants-in-ci#directories-to-cache

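The problem is that the partial restore key is so lenient, and the cache key strict enough, that using the nuke function from the docs won't work most of the time: when the cache key matches exactly, nothing is saved, so the nuked state never replaces the oversized entry.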

To reduce the monotonically increasing cache usage, a user needs to either explicitly and manually change the cache key, or run a nuke function in a workflow run that ALSO changes what gets saved to the cache (e.g. a lockfile change, a pants.toml change, etc).
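
To make the interaction concrete, here is a small model of how a combined restore/save cache step resolves keys. It is a simplification for illustration only: `cache_outcome` and the key names are hypothetical, and the real cache step picks the most recently created match for a restore prefix rather than the first one listed.

```shell
#!/usr/bin/env bash
# Simplified model of key resolution in a combined restore/save cache step.
# Usage: cache_outcome <key> <restore_prefix> [existing_key...]
cache_outcome() {
  local key=$1 restore_prefix=$2
  shift 2
  local existing

  # An exact match restores the entry and skips the save at the end of the job.
  for existing in "$@"; do
    if [[ "$existing" == "$key" ]]; then
      echo "exact-hit: restore, skip save"
      return 0
    fi
  done

  # A prefix match restores an older entry, then saves everything under the new key.
  for existing in "$@"; do
    if [[ "$existing" == "$restore_prefix"* ]]; then
      echo "partial-hit: restore '$existing', save as '$key'"
      return 0
    fi
  done

  echo "miss: start clean, save as '$key'"
}
```

In the exact-hit case a nuke during the job is thrown away, and in the partial-hit case whatever was restored is carried forward into the new entry unless something trims it first.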


I used https://github.com/sureshjoshi/pants-plugins as a cache testing example:

[image: cache-not-busting]

With the second-last entry, in spite of removing almost all dependencies in that commit, we’re still pulling 220MB of cache - and that never gets cleared out. We have to explicitly bust the cache with a new cache key, and run everything from scratch to get the benefit.


Here is another example where I nuke the cache, but since the cache key doesn't change, the run reports "Cache hit occurred ... not saving cache" and the nuked state is never saved:

[image]


I had the idea of using the gh CLI to prematurely delete/expire caches, but since this would happen after the cache is downloaded, it would require special treatment.
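
For reference, a rough sketch of what that gh-based cleanup might look like. The function names, the key prefix, and the "keep only the current key" policy are all assumptions for illustration; it also assumes a gh version that ships the `cache` subcommand and a token that is allowed to delete caches. It does not solve the ordering problem mentioned above.

```shell
#!/usr/bin/env bash
# Read cache keys on stdin (one per line) and print every key except the one to keep.
stale_cache_keys() {
  local keep=$1
  # -F fixed string, -x whole-line match, -v invert; "|| true" tolerates "nothing left".
  grep -Fxv -- "$keep" || true
}

# Hypothetical wiring: delete every cache under a key prefix except the current key.
delete_stale_caches() {
  local prefix=$1 current_key=$2
  gh cache list --key "$prefix" --json key --jq '.[].key' \
    | stale_cache_keys "$current_key" \
    | sort -u \
    | while IFS= read -r key; do
        gh cache delete "$key"
      done
}
```

Keeping the filtering in its own function means the selection logic can be checked locally without touching any real caches.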

I think the most reasonable, practical answer is to add some more documentation to this Action (and probably pantsbuild.org), as well as having some sort of automatic nuke-check on cache saving.

This might require using the separate restore/save cache actions, if the cache action itself has no hook that makes it easy to tell whether the key being saved is a new one.

Essentially:

  • Run the action as normal
  • During post-action hooks, check whether it's a new cache key (e.g. was pants.toml or named-caches-hash modified?)
    • If not, do nothing
    • If so, run nuke_if_too_big $named_cache_dir $named_cache_limit_mb
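
The steps above could be sketched roughly as follows. This is a sketch under assumptions, not how the action works today: `post_action_check` and `dir_size_mb` are hypothetical names, `cache_hit` is assumed to carry the cache step's exact-match result ("true" only on an exact key match), and `nuke_if_too_big` here is a simplified stand-in for the helper in the Pants CI docs.

```shell
#!/usr/bin/env bash
# Print the size of a directory in whole megabytes (0 if it does not exist).
dir_size_mb() {
  local path=$1
  if [[ -d "$path" ]]; then
    du -sm "$path" | cut -f1
  else
    echo 0
  fi
}

# Simplified stand-in for the docs helper: delete the directory if it exceeds the limit.
nuke_if_too_big() {
  local path=$1 limit_mb=$2
  local size_mb
  size_mb=$(dir_size_mb "$path")
  if (( size_mb > limit_mb )); then
    echo "${path} is too large (${size_mb}MB > ${limit_mb}MB), nuking it."
    rm -rf "$path"
  fi
}

# Run the nuke check only when a new cache entry is about to be saved.
post_action_check() {
  local cache_hit=$1 named_cache_dir=$2 named_cache_limit_mb=$3
  if [[ "$cache_hit" == "true" ]]; then
    # Exact key match: the save is skipped, so nuking would have no effect on the cache.
    echo "Cache key unchanged, skipping nuke check."
    return 0
  fi
  nuke_if_too_big "$named_cache_dir" "$named_cache_limit_mb"
}
```

Tying the check to a key change means the trimmed directory is exactly what gets written under the new key, instead of the restored contents carrying over again.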