DOC-884 Update recommendations for Pod resource management #946
base: main
Conversation
The new advice is to rely on K8s resource requests and limits
By default, the Helm chart allocates 80% of the configured memory in `resources.memory.container` to Redpanda, with the remainder reserved for overhead such as the Seastar subsystem and other container processes.
Redpanda Data recommends this default setting.
=== Memory allocation and Seastar flags |
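For reference, a minimal values sketch of the default described here (the keys mirror the snippet under review; the 10Gi value is only illustrative):

```yaml
resources:
  memory:
    container:
      # The chart gives ~80% of this amount to Redpanda via --memory;
      # the remainder is left for Seastar overhead and other container processes.
      max: 10Gi
```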
OOC, do we talk about resource allocation on bare metal? Last I did a search, the only instances of `--memory` and the like showed up in our Kubernetes docs, which should ideally be abstracting those details away 😅
We have a Jira to document the relevant Seastar flags for bare-metal deployments. But since we don't have that yet, I wanted to minimally explain the two flags that affect the two memory management options in K8s.
Ah gotcha. In that case I'm going to defer to either @dotnwat or @travisdowns on the descriptions here to avoid saying anything incorrect. My mental model of `--reserve-memory` is that it's needlessly convoluted, which we probably don't want to put in our docs.
# If omitted, the `min` value is equal to the `max` value (requested resources defaults to limits)
# min: | ||
max: <number><unit> <2> | ||
requests: |
`resources.{requests,limits}` is mutually exclusive with `resources.{memory,cpu,container}`. If requests and limits are provided, `memory`, `cpu`, and `container` will be ignored.

Memory locking can be enabled with either `statefulset.additionalRedpandaCmdFlags` or some values in `config`. I know the key is `enable_memory_locking`, but I'm not sure which stanza it goes into.
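For illustration, a rough values sketch of the two points above; the exact `--lock-memory` flag spelling is an assumption here and should be verified:

```yaml
# requests/limits take precedence; resources.{memory,cpu,container} would be ignored
resources:
  requests:
    cpu: 2
    memory: 5Gi
  limits:
    cpu: 2
    memory: 5Gi
statefulset:
  additionalRedpandaCmdFlags:
    - "--lock-memory=true"   # assumed flag name for enabling memory locking
```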
Our existing doc already mentioned `resources.memory.enable_memory_locking`. Should we keep that as a recommended way of setting memory locking? Or is it better to use `statefulset.additionalRedpandaCmdFlags`?
Do we have any bare-metal docs that mention memory locking? If so, I'd vote to align with those, be it the CLI flag or the rpk config.

If not, I guess it's dealer's choice? I don't like `resources.memory.enable_memory_locking` because it's specific to Kubernetes. I like to keep interfaces as consistent as possible, so transitioning from bare metal to K8s or vice versa leads to users thinking: "`<blank>` is a CLI flag/rpk config. Okay, how do I set that in Kubernetes/bare metal?" rather than needing a bespoke answer for every piece of configuration.
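For comparison, the bare-metal side presumably looks something like this in `redpanda.yaml` (the `rpk` stanza placement is an assumption, given the open question above):

```yaml
# redpanda.yaml on bare metal (stanza placement to be verified)
rpk:
  enable_memory_locking: true
```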
@@ -285,6 +305,8 @@ If Redpanda runs in a shared environment, where multiple applications run on the

You can enable overprovisioning by either setting the CPU request to a fractional value or setting `overprovisioned` to `true`.

NOTE: Setting `resources.requests.cpu` to a fractional value, such as 200m, enables Kubernetes to schedule Pods alongside other workloads efficiently, ensuring fair resource distribution. However, this may impact Redpanda's performance under heavy loads. |
This is where things start to get a bit nuanced. Redpanda can't effectively make use of fractional CPU values, so the `--smp` flag is always rounded down, with a minimum value of one. So 200m (`--smp=1`) will only work with the `overprovisioned` flag, and 2500m (`--smp=2`) will only result in Redpanda utilizing 2 cores.
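To make the rounding concrete, an illustrative mapping (the 1500m row extrapolates the same round-down rule):

```yaml
# requests.cpu -> effective --smp passed to Redpanda
#   200m   -> --smp=1  (only works together with the overprovisioned flag)
#   1500m  -> --smp=1  (rounded down; the extra 500m goes unused by Redpanda)
#   2500m  -> --smp=2  (rounded down; the extra 500m goes unused by Redpanda)
```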
Is this true for `resources.cpu.cores`? Our docs currently say that fractional CPU values are allowed for that config.
I think given this:
resources.{requests,limits} is mutually exclusive with resources.{memory,cpu,container}. If requests and limits are provided, memory, cpu, and container, will be ignored.
I should revert changes to the shared environments section since users will still need to use `resources.cpu` settings. Perhaps we need a note like:
NOTE: When `resources.requests` or `resources.limits` are set,
the `resources.cpu` parameter (including cores) is ignored.
Ensure that you have not configured CPU requests and limits explicitly
to avoid unexpected behavior in shared environments.
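A values sketch of the precedence that note describes (assuming both requests and limits are set):

```yaml
resources:
  cpu:
    cores: 4        # ignored once requests and limits are both provided
  requests:
    cpu: 2          # these values take effect instead
    memory: 5Gi
  limits:
    cpu: 2
    memory: 5Gi
```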
@@ -336,7 +360,7 @@ helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --crea
+
```bash
helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
--set resources.cpu.cores=<number-of-cpu-cores> \
--set resources.requests.cpu=<number-of-cpu-cores> \ | |||
--set resources.cpu.overprovisioned=true |
Ah, so same deal with the `overprovisioned` flag. It's now a CLI flag (like it is in redpanda), or it can be set in `redpanda.yaml`, which causes rpk to set the flag on your behalf.
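Presumably that means either of the following, the first in the Helm values and the second in `redpanda.yaml`; both spellings are assumptions to double-check:

```yaml
# Helm values: pass the flag through to the redpanda process
statefulset:
  additionalRedpandaCmdFlags:
    - "--overprovisioned"
---
# redpanda.yaml: rpk sets the flag on Redpanda's behalf
rpk:
  overprovisioned: true
```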
Sorry if the switcheroo on the memory locking and overprovisioned configs caused the scope of this to increase a bit. I was hoping to streamline configuration management.
@@ -283,7 +303,9 @@ If you use PersistentVolumes, you can set the storage capacity for each volume.

If Redpanda runs in a shared environment, where multiple applications run on the same worker node, you can make Redpanda less aggressive in CPU usage by enabling overprovisioning. This adjustment ensures a fairer distribution of CPU time among all processes, improving overall system efficiency at the cost of Redpanda's performance.

You can enable overprovisioning by either setting the CPU request to a fractional value or setting `overprovisioned` to `true`.
You can enable overprovisioning by either setting the CPU request to a fractional value or setting `resources.cpu.overprovisioned` to `true`. |
nit: clarify that the fractional value must be < 1 for overprovisioned to be set
You can enable overprovisioning by either setting the CPU request to a fractional value or setting `resources.cpu.overprovisioned` to `true`.
You can enable overprovisioning by either setting the CPU request to a fractional value less than 1 or setting `resources.cpu.overprovisioned` to `true`.
And similar to the comment above, `overprovisioned` can be set with either a CLI flag or an rpk config as well. Let's align on either CLI flags or rpk config values for this too.
You can enable overprovisioning by either setting the CPU request to a fractional value or setting `overprovisioned` to `true`.
You can enable overprovisioning by either setting the CPU request to a fractional value or setting `resources.cpu.overprovisioned` to `true`.

NOTE: When `resources.requests` or `resources.limits` are set, the `resources.cpu` parameter (including cores) is ignored. Ensure that you have not configured CPU requests and limits explicitly to avoid unexpected behavior in shared environments. |
This condition is that both requests and limits are set:
NOTE: When `resources.requests` or `resources.limits` are set, the `resources.cpu` parameter (including cores) is ignored. Ensure that you have not configured CPU requests and limits explicitly to avoid unexpected behavior in shared environments.
NOTE: When `resources.requests` and `resources.limits` are set, the `resources.cpu` parameter (including cores) is ignored. Ensure that you have not configured CPU requests and limits explicitly to avoid unexpected behavior in shared environments.
I'm not sure what the final sentence here means?
<2> The amount of memory to give Redpanda, Seastar, and the other container processes. You should give Redpanda at least 2 Gi of memory per core. Given that the Helm chart allocates 80% of the container's memory to Redpanda, leaving the rest for the Seastar subsystem and other processes, set this value to at least 2.5 Gi per core to ensure Redpanda has a full 2 Gi. Redpanda supports the following memory resource units: B, K, M, G, Ki, Mi, and Gi. Memory units are converted to the nearest whole MiB. For a description of memory resource units, see the https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory[Kubernetes documentation^].
<1> Enabling memory locking prevents the operating system from paging out Redpanda's memory to disk. This can significantly improve performance by ensuring Redpanda has uninterrupted access to its allocated memory.

<2> Allocate at least 2.5 Gi of memory per core to ensure Redpanda has the 2 Gi per core it requires after accounting for the 90% allocation to the `--memory` flag. Redpanda supports the following memory resource units: B, K, M, G, Ki, Mi, and Gi. Memory units are converted to the nearest whole MiB. For a description of memory resource units, see the https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory[Kubernetes documentation^]. |
Memory units are converted to the nearest whole MiB.

nit: I'm not 100% certain how best to phrase this because I'd describe it in code 😓 but memory units are `floor`ed or truncated to MiB. Rounded down, maybe?
Allocate at least 2.5 Gi of memory per core to ensure Redpanda has the 2 Gi per core it requires after accounting for the 90% allocation to the `--memory` flag.
This math doesn't add up. Wouldn't that be ~2.2Gi per core? Or do we want there to be 2.5Gi per core intentionally including the overhead? (This may be a good question for core performance)
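Working the numbers: 2.5 Gi × 0.9 = 2.25 Gi, while hitting exactly 2 Gi under a 90% allocation would only require 2 Gi ÷ 0.9 ≈ 2.22 Gi per core. The 2.5 Gi figure matches the older 80% split exactly (2.5 Gi × 0.8 = 2 Gi), so the recommendation either intentionally keeps ~0.25 Gi of headroom per core or needs updating for the new percentage.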
Thanks @chrisseto! I've adapted all the examples to use the usual bare-metal flags, and I've also clarified all the points you mentioned. I took the description of
Description
Resolves https://github.com/redpanda-data/documentation-private/issues/
Review deadline: 17 Jan
The new advice is to rely on K8s resource requests and limits.
Related PR: redpanda-data/helm-charts#1622
Page previews
https://deploy-preview-946--redpanda-docs-preview.netlify.app/current/manage/kubernetes/k-manage-resources/#memory
Checks