-
Notifications
You must be signed in to change notification settings - Fork 620
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cache use discussion #5094
Comments
This is a great initiative for streamlining our caching strategy and also our understanding/documentation of it!
To be precise, the database is persisted on disk and users have the ability to use a PV for durability between pod restarts.
Just to get clarity here: Are you suggesting to use a pure in-memory cache? I don't think this is what we want for IRC in order to prevent the controller from hammering all registries upon restart. |
I think right now, upon restart all repos are scanned as the reconciler will fill the queue with all objects. |
I don't think that's true as the shouldScan method would determine if a repository should be scanned and that depends on the last scan time. So if the controller comes up for the first time (e.g. after a Pod restart), there are tags for the given repository in the database and the last scan time is not too far back, it will not be scanned. |
@makkes I'm just putting all the partial ideas that I've heard of together here as I too wasn't sure about the plan for IRC. I came across this in fluxcd/pkg#778 which states that tags will be stored in the cache and the cache data will be persisted on disk and reloaded on restart. As mentioned above, I don't see any change in the behavior from this compared to what we have today with badger database if a persistent volume is used. Just that the overhead of running badger is replaced by the cache which stores everything in memory. |
We discussed a little about this in the dev meeting today. We will discuss more in detail next week. We discussed about the image-reflector-controller use case. The cache idea for image tags to replace badger came about as a way to address the memory issue seen in low-memory environment, refer https://fluxcd.io/flux/cheatsheets/troubleshooting/#how-do-i-resolve-a-unable-to-open-the-badger-database-that-puts-image-reflector-controller-in-crashloopbackoff. But as mentioned in the last comment, there are trade-offs in moving everything to an in-memory cache. We don't know if the latest versions of badger have made some adjustments and we don't really have this issue anymore. This needs to be verified. Considering the potential issues with moving everything in memory, especially for large deployments with lot of images and tags, we'll avoid the idea of adding a cache to replace badger for now. I'll remove references of image-reflector-controller and image tags from the main cache use discussion comment above. |
I just added a section Cache capacity full behavior with some questions around it. |
In this week's dev meeting, we didn't get to go through all the questions but we discussed if there is still a need for the LRU cache code, now that we don't have a plan for cache in image-reflector-controller. I'd like to wait for any other opinion or concerns about this idea before adding it in the main comment above. So far, it sounds like a good idea to me, personally. |
This reasoning also sounds good to me 👍 |
Just to clary this, we're not going to store all the indexes in memory all the time, but only the number of items that the cluster admin sets in the |
This issue is for discussing various considerations regarding cache use across
Flux in order to have coherency and consistency in the design. The following
sections will provide some background and introduction to various topics with
open ended questions. These will be updated as we discuss, find conclusions and
come to agreements on the design. The goal is to have a well defined
specification at the end.
Background
At present, the source-controller contains a Helm index cache for storing the
parsed index of a Helm repository once downloaded from remote. This cached index
is loaded from the cache and used in further reconciliations by the HelmChart
reconciler for building a chart. This helps keep the memory usage of
source-controller low. Similarly, there's a need to have cache in various Flux
components to improve the performance, resource usage and interaction with
external systems.
The cache in source-controller at present is based on
https://github.com/patrickmn/go-cache. There are new generic cache
implementations available at https://github.com/fluxcd/pkg/tree/main/cache for
expiring and least recently used cache types.
Cache use cases
Authentication token caching
Flux integrates with various cloud providers and platforms that provided
short-lived authentication tokens. At present, every time a Flux component needs
to authenticate with a service, it obtains a new token, uses it and it gets
deleted. These tokens are short-lived but can be used multiple times within
their short expiry period. Requesting a new token at every reconciliation
interval could result in too many requests to the token provider service and
potentially getting rate limited. Flux components should be able to obtain
tokens, cache them for the time they are valid and reuse them if possible.
GitRepository, OCIRepository, HelmRepository(OCI) and Kustomization(for
obtaining secrets from Key Management Service(KMS)) would benefit from
this.
Questions:
If the KMS secrets are considerably long-lived, should we intentionally set
short expiry time for them to avoid storing sensitive data for long?
Image tag caching
image-reflector-controller downloads image tags and stores them in a badger
database. The image tags in the database remain in the database as long as the
controller runs. There is no provision to clear the unused image tags that are
present in the database. There is also an overhead in running the badger
database that results in relatively high memory usage by the controller. The
database can be replaced by a Least Recently Used(LRU) cache which can store
image tags, similar to the database, and also clear the least recently used
image tags from the cache when a certain cache size limit is reached.
Saving and reloading the cache data
In large scale deployments, being able to save the cache on disk in case of a
restart and reload the cache to resume the operations would be ideal.
In case of authentication tokens, if a Flux component restarts, it'll reconcile
every single object, irrespective of their reconciliation interval. This would
result in a surge in requests for new authentication tokens, which can overwhelm
the token provider service. Being able to persist the cached tokens and reuse
them would help avoid the surge in requests to token provider service.
In case of image tags, it wouldn't be much different from using a persistent
storage for badger database. Being able to save and reload the cache would make
sure that the data persistence feature continues to work like before using a
cache.
NOTE: This section exists just for clearing the intent. The details of the
implementation will be discussed separately in the future. This feature is
intended to be added later.
Cache key
Since Flux supports multi-tenancy, the cache usage needs to respect the
multi-tenancy boundaries. Tenants should not be able to access other tenant's
cached data. The cache data is accessed through the cache keys. A tenant should
not be able to configure Flux resource such that their resource uses the cached
data of another resource that belongs to another tenant.
In case of the existing Helm index cache, the cache key used is the
HelmRepository artifact path, which is
<kind>/<namespace>/<name>/<artifact-digest>
. For examplehelmrepository/default/podinfo/index-83a3c595163a6ff0333e0154c790383b5be441b9db632cb36da11db1c4ece111.yaml
.This makes sure that resources from another namespace can't use the key of
a resource in another namespace. Within the same namespace, two artifact with
same index but different resource name will not be able to share the same cached
item.
Questions:
input strings leaves a possibility of the key to be trimmed at the key data
type capacity. Creating a possibility of conflicting keys for different
resources. Is this a good reason to use resource's unique identifier(UID)
instead? The UIDs of the resources are generated by kube-apiserver and can't
be manipulated by user input. It may not be critical for helm index cache to
change the key now, but auth token cache is more sensitive.
Cloud Provider Workload identity
Flux components support workload identity for obtaining short-lived
authentication tokens using the ServiceAccount of the pod. A pod can have only
one identity that's shared by all the Flux resources managed by the pod. Using a
single common cache key, like the cloud provider name, would work.
There's a need to allow multi-tenant OIDC authentication by specifying the
ServiceAccount to use by the resource, refer #5022. This would allow every resource to
have a unique identity. Considering this, it would be better to have unique
cache key for these multi-tenant authentication tokens or resources.
Questions:
ServiceAccount, as that contains the identity of the resource, or it
could also be unique for every resource. If it's based on the ServiceAccount,
two resources referring the same ServiceAccount identity can share the cached
token. Otherwise, in case of unique key for every resource, the cache is not
shared with any other resource, but it is still reused by the same resource.
Which strategy would be better? Any downsides of one or the other?
using the UID of these objects would provide a unique key value which is a
fixed reference to the object. There's a caution expressed in
Implement new packages auth, azure and git for passwordless authentication scenarios pkg#789 (comment) regarding this
for cache invalidation when the object identity is removed or changed. The
first time there's a change, the old cached data will be used. This was
previously addressed in
Implement new packages auth, azure and git for passwordless authentication scenarios pkg#789 (comment) by considering
the cache as a best-effort solution. When the old cached data usage results in
a failure, a new token using the changed identity is obtained and cached for
the same resource. Is this sufficient? Or do we need to add a mechanism to
determine if the identity has changed. It may be difficult to know without
reading the content of the identity as the identity can change for the same
object name and metadata.
The above questions would help decide the designs for both OCI and Git client's
cache use with workload identity.
OAuth identity
For Git authentication on platforms like GitHub and GitLab specifically, OAuth
is used to obtain short-lived tokens. Unlike in the case of workload identity,
in this case, every resource refers to a Secret containing authentication
materials to perform a token exchange. The Secret reference can be comparable to
the ServiceAccount identity in case of workload identity mentioned above.
Similar set of questions and arguments apply to this too regarding the cache key
value to use.
NOTE: Previously, there had been discussions about using the resource URL or
the account ID as the cache key. These are not multi-tenant friendly. The above
considerations of using the resource or the ServiceAccount/Secret specific
unique key helps maintain multi-tenancy by not allowing resources to access
other's cached values by changing the account or repository name in the input.
Cache capacity full behavior
In case of an LRU cache, as new items get added, the old ones are moved out of
the cache. The user of the cache can continue write to the cache. But in case of
expiring cache, the caller needs to handle the scenario where the cache is full. In
this scenario, in most of the cases, the caller should ignore the failure and continue
functioning. The item that is usually stored in the cache won't be in cache, until
existing items expire and cache has capacity to have new items. If needed, the
cluster admins can increase the cache size through a flag.
Questions:
there is
cached_items
metric which shows the total items in the cache. The totalsize can also be exported, which can be used to determine the available capacity.
cache size from interfering with other tenant's usage?
Cache organization
The caches will be used for storing values of different types:
Due to the cache being generic typed cache, they have to be instantiated
separately for their use case. The Helm repository index, authentication token
and image tags caches have to be independent caches due to the data type of the
value they store.
In case of authentication token cache, they store string values. This cache use
is common amongst some resources, specifically in the case of source-controller,
the GitRepository, HelmRepository(OCI) and OCIRepository auth tokens. They can
potentially use a single instance of cache. But for separation of concerns and
independence, they will have separate cache instances. This will also prevent
any potential of interfering resources of other types through the shared cache.
Cache metrics
The new generic cache is self-instrumenting and automatically exports metrics if
a metrics registerer is configured. In order to have unique metrics for each of
the caches, unique prefixes are used. For example, for Git auth token cache, the
prefix
gotk_git_auth_
will be used, resulting in metrics likegotk_git_auth_cache_events_total
,gotk_git_auth_cached_items
, etc. Seefluxcd/source-controller#1644 for further details.
The following cache metrics prefixes will be used:
gotk_helm_index_
gotk_git_auth_
gotk_oci_auth_
gotk_helm_oci_auth_
gotk_kms_auth_
gotk_image_tags_
Questions:
gotk_kms_auth_
accurate for how it'll be used?Cache default configurations
The existing Helm index cache is an expiring cache with the following default
configurations:
For authentication token cache, the item TTL can be set automatically based on
the response from the token provider service.
The following default configurations will be used for auth token cache for
GitRepository, HelmRepository(OCI) and OCIRepository:
The following default configurations will be used for auth token cache for
Kustomization:
The image tags cache will use an LRU cache which only supports setting a max
size. The default size will be _ ?
Questions:
TTL?
Exposing cache configuration controls
For the existing Helm index cache, the following command-line flags are provided
for configuring it:
--helm-cache-max-size
- for setting the cache max size--helm-cache-ttl
- for setting the time to live of cached items--helm-cache-purge-interval
- for setting the cache purge/cleanup intervalBased on the previous section, only the max size and purge interval
configuration flags will be needed for auth token caches.
The following flags will be provided for auth token caches:
The following flags will be provided for image tags cache:
--image-tags-cache-max-size
Questions:
HelmRepository(OCI) and OCIRepository be provided independent of one
another? Or should there be a common flag for all the auth cache
configuration? Even through internally there are separate caches, they can be
presented as a single configuration to the users for simplicity.
The text was updated successfully, but these errors were encountered: