
DOC-884 Update recommendations for Pod resource management #946

Open · wants to merge 4 commits into main

Conversation

@JakeSCahill (Contributor) commented Jan 15, 2025

Description

Resolves https://github.com/redpanda-data/documentation-private/issues/
Review deadline: 17 Jan

The new advice is to rely on K8s resource requests and limits.
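For illustration, a minimal sketch of that approach in Helm values (the `resources.requests`/`resources.limits` stanza is the one introduced by the related Helm chart PR linked below; the numbers are placeholders):

```yaml
# Sketch only: Kubernetes-native requests and limits expressed as Helm values.
resources:
  requests:
    cpu: 2          # what the scheduler reserves for the Pod
    memory: 5Gi
  limits:
    cpu: 2          # hard cap enforced by the kubelet
    memory: 5Gi
```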

Related PR: redpanda-data/helm-charts#1622

Page previews

https://deploy-preview-946--redpanda-docs-preview.netlify.app/current/manage/kubernetes/k-manage-resources/#memory

Checks

  • New feature
  • Content gap
  • Support Follow-up
  • Small fix (typos, links, copyedits, etc.)

The new advice is to rely on K8s resource requests and limits
@JakeSCahill requested a review from chrisseto on January 15, 2025 at 16:22
@JakeSCahill requested a review from a team as a code owner on January 15, 2025 at 16:22
netlify bot commented Jan 15, 2025

Deploy Preview for redpanda-docs-preview ready!

🔨 Latest commit: 2a50015
🔍 Latest deploy log: https://app.netlify.com/sites/redpanda-docs-preview/deploys/6787e0cf1823d8000856d5b9
😎 Deploy Preview: https://deploy-preview-946--redpanda-docs-preview.netlify.app

netlify bot commented Jan 15, 2025

Deploy Preview for redpanda-docs-preview ready!

🔨 Latest commit: 867c9aa
🔍 Latest deploy log: https://app.netlify.com/sites/redpanda-docs-preview/deploys/6793c960dffcc100086e3637
😎 Deploy Preview: https://deploy-preview-946--redpanda-docs-preview.netlify.app

@JakeSCahill changed the title from "Update recommendations for Pod resource management" to "DOC-884 Update recommendations for Pod resource management" on Jan 16, 2025
modules/manage/pages/kubernetes/k-manage-resources.adoc (outdated):

```asciidoc
By default, the Helm chart allocates 80% of the configured memory in `resources.memory.container` to Redpanda, with the remainder reserved for overhead such as the Seastar subsystem and other container processes.
Redpanda Data recommends this default setting.

=== Memory allocation and Seastar flags
```
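(For scale, under that 80/20 split a container memory setting of 5Gi would give Redpanda roughly 4Gi, leaving about 1Gi for the Seastar subsystem and other container processes.)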
Contributor:

Out of curiosity, do we talk about resource allocation on bare metal? Last time I searched, the only instances of `--memory` and the like showed up in our Kubernetes docs, which should ideally be abstracting those details away 😅

Contributor Author:

We have a Jira to document the relevant Seastar flags for bare-metal deployments. But since we don't have that yet, I wanted to minimally explain the two flags that affect the two memory management options in K8s.

Contributor:

Ah gotcha. In that case I'm going to defer to either @dotnwat or @travisdowns on the descriptions here to avoid saying anything incorrect. My mental model of `--reserve-memory` is that it's needlessly convoluted, which is probably not something we want to put in our docs.

```yaml
# If omitted, the `min` value is equal to the `max` value (requested resources defaults to limits)
# min:
max: <number><unit> <2>
requests:
```
Contributor:

`resources.{requests,limits}` is mutually exclusive with `resources.{memory,cpu,container}`. If `requests` and `limits` are provided, `memory`, `cpu`, and `container` will be ignored.
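As a sketch, the two mutually exclusive stanzas described here would look like this (values are illustrative placeholders):

```yaml
# Option 1: Kubernetes-native requests/limits; when both are set,
# the chart ignores the keys shown in Option 2.
resources:
  requests:
    cpu: 2
    memory: 5Gi
  limits:
    cpu: 2
    memory: 5Gi

# Option 2: chart-specific knobs (ignored when Option 1 is fully set).
# resources:
#   cpu:
#     cores: 2
#   memory:
#     container:
#       max: 5Gi
```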

Memory locking can be enabled with either `statefulset.additionalRedpandaCmdFlags` or some value in the config. I know the key is `enable_memory_locking`, but I'm not sure which stanza it goes into.
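A hedged sketch of both routes (the `--lock-memory` flag name and the exact stanza are assumptions, per the uncertainty above):

```yaml
# Route 1: pass the CLI flag straight through to Redpanda
# (flag name assumed to be Seastar's --lock-memory).
statefulset:
  additionalRedpandaCmdFlags:
    - --lock-memory

# Route 2: the chart-specific key mentioned in the existing docs.
# resources:
#   memory:
#     enable_memory_locking: true
```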

Contributor Author:

Our existing doc already mentioned `resources.memory.enable_memory_locking`. Should we keep that as a recommended way of setting memory locking? Or is it better to use `statefulset.additionalRedpandaCmdFlags`?

Contributor:

Do we have any bare-metal docs that mention memory locking? If so, I'd vote to align with those, be it the CLI flag or the rpk config.

If not, I guess it's dealer's choice? I don't like `resources.memory.enable_memory_locking` because it's specific to Kubernetes. I like to keep interfaces as consistent as possible, so that transitioning from bare metal to K8s or vice versa leads users to think: "<blank> is a CLI flag/rpk config. Okay, how do I set that in Kubernetes/bare metal?" rather than needing a bespoke answer for every piece of configuration.

```diff
@@ -285,6 +305,8 @@ If Redpanda runs in a shared environment, where multiple applications run on the
 You can enable overprovisioning by either setting the CPU request to a fractional value or setting `overprovisioned` to `true`.
 
+NOTE: Setting `resources.requests.cpu` to a fractional value, such as 200m, enables Kubernetes to schedule Pods alongside other workloads efficiently, ensuring fair resource distribution. However, this may impact Redpanda's performance under heavy loads.
```
Contributor:

This is where things start to get a bit nuanced. Redpanda can't effectively make use of fractional CPU values, so the `--smp` flag is always rounded down, with a minimum value of one. That means 200m (`--smp=1`) will only work with the overprovisioned flag, and 2500m (`--smp=2`) will result in Redpanda utilizing only 2 cores.
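To make the rounding concrete, here's a sketch of how fractional requests would map to `--smp` under that description (the floor-with-minimum-one behavior is taken from this comment, not verified against the chart code):

```yaml
# Assumed mapping: --smp = max(1, floor(requests.cpu))
resources:
  requests:
    cpu: 200m     # floor(0.2) -> --smp=1; only viable together with overprovisioned
#   cpu: 2500m    # floor(2.5) -> --smp=2; the extra 500m goes unused by Redpanda
```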

Contributor Author (@JakeSCahill, Jan 20, 2025):

Is this true for `resources.cpu.cores`? Our docs currently say that fractional CPU values are allowed for that config.

I think, given this:

> resources.{requests,limits} is mutually exclusive with resources.{memory,cpu,container}. If requests and limits are provided, memory, cpu, and container will be ignored.

I should revert the changes to the shared-environments section, since users will still need to use the `resources.cpu` settings. Perhaps we need a note like:

> NOTE: When `resources.requests` or `resources.limits` are set, the `resources.cpu` parameter (including `cores`) is ignored. Ensure that you have not configured CPU requests and limits explicitly to avoid unexpected behavior in shared environments.

````diff
@@ -336,7 +360,7 @@ helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --crea
 +
 ```bash
 helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
---set resources.cpu.cores=<number-of-cpu-cores> \
+--set resources.requests.cpu=<number-of-cpu-cores> \
 --set resources.cpu.overprovisioned=true
````
Contributor:

Ah, so same deal with the overprovisioned flag. It's now a CLI flag (like it is in Redpanda), or it can be set in the redpanda.yaml, which causes rpk to set the flag on your behalf.
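A sketch of those two routes (the flag and config key names are assumptions based on this comment):

```yaml
# Route 1: CLI flag passed through the chart (flag name assumed).
statefulset:
  additionalRedpandaCmdFlags:
    - --overprovisioned

# Route 2: set it in redpanda.yaml, which causes rpk to pass the flag
# on Redpanda's behalf (key name assumed; this is rpk config, not Helm values).
# rpk:
#   overprovisioned: true
```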

@JakeSCahill requested a review from chrisseto on January 20, 2025 at 15:54
@chrisseto (Contributor) left a comment:

Sorry if the switcharoo on the memory-locking and overprovisioned configs increased the scope here a bit. I was hoping to streamline configuration management.


```diff
@@ -283,7 +303,9 @@ If you use PersistentVolumes, you can set the storage capacity for each volume.
 If Redpanda runs in a shared environment, where multiple applications run on the same worker node, you can make Redpanda less aggressive in CPU usage by enabling overprovisioning. This adjustment ensures a fairer distribution of CPU time among all processes, improving overall system efficiency at the cost of Redpanda's performance.
 
-You can enable overprovisioning by either setting the CPU request to a fractional value or setting `overprovisioned` to `true`.
+You can enable overprovisioning by either setting the CPU request to a fractional value or setting `resources.cpu.overprovisioned` to `true`.
```
Contributor:

nit: clarify that the fractional value must be < 1 for overprovisioned to be set

Suggested change:

```diff
-You can enable overprovisioning by either setting the CPU request to a fractional value or setting `resources.cpu.overprovisioned` to `true`.
+You can enable overprovisioning by either setting the CPU request to a fractional value less than 1 or setting `resources.cpu.overprovisioned` to `true`.
```

And, similar to the comment above, overprovisioned can be set with either a CLI flag or an rpk config. Let's align on one of the two here as well.

```diff
-You can enable overprovisioning by either setting the CPU request to a fractional value or setting `overprovisioned` to `true`.
+You can enable overprovisioning by either setting the CPU request to a fractional value or setting `resources.cpu.overprovisioned` to `true`.
 
+NOTE: When `resources.requests` or `resources.limits` are set, the `resources.cpu` parameter (including cores) is ignored. Ensure that you have not configured CPU requests and limits explicitly to avoid unexpected behavior in shared environments.
```
Contributor:

The condition is that both `requests` and `limits` are set:

Suggested change:

```diff
-NOTE: When `resources.requests` or `resources.limits` are set, the `resources.cpu` parameter (including cores) is ignored. Ensure that you have not configured CPU requests and limits explicitly to avoid unexpected behavior in shared environments.
+NOTE: When `resources.requests` and `resources.limits` are set, the `resources.cpu` parameter (including cores) is ignored. Ensure that you have not configured CPU requests and limits explicitly to avoid unexpected behavior in shared environments.
```

I'm not sure what the final sentence here means.

```diff
-<2> The amount of memory to give Redpanda, Seastar, and the other container processes. You should give Redpanda at least 2 Gi of memory per core. Given that the Helm chart allocates 80% of the container's memory to Redpanda, leaving the rest for the Seastar subsystem and other processes, set this value to at least 2.5 Gi per core to ensure Redpanda has a full 2 Gi. Redpanda supports the following memory resource units: B, K, M, G, Ki, Mi, and Gi. Memory units are converted to the nearest whole MiB. For a description of memory resource units, see the https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory[Kubernetes documentation^].
+<1> Enabling memory locking prevents the operating system from paging out Redpanda's memory to disk. This can significantly improve performance by ensuring Redpanda has uninterrupted access to its allocated memory.
 
+<2> Allocate at least 2.5 Gi of memory per core to ensure Redpanda has the 2 Gi per core it requires after accounting for the 90% allocation to the `--memory` flag. Redpanda supports the following memory resource units: B, K, M, G, Ki, Mi, and Gi. Memory units are converted to the nearest whole MiB. For a description of memory resource units, see the https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory[Kubernetes documentation^].
```
Contributor:

> Memory units are converted to the nearest whole MiB.

nit: I'm not 100% certain how best to phrase this, because I'd describe it in code 😓 but memory units are floored, or truncated, to MiB. "Rounded down", maybe?

> Allocate at least 2.5 Gi of memory per core to ensure Redpanda has the 2 Gi per core it requires after accounting for the 90% allocation to the `--memory` flag.

This math doesn't add up. Wouldn't that be ~2.2 Gi per core? Or do we want there to be 2.5 Gi per core intentionally, including the overhead? (This may be a good question for core performance.)
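For reference, the arithmetic behind that question: 2.5 Gi × 0.9 = 2.25 Gi handed to `--memory`, whereas a 2 Gi-per-core target only requires 2 ÷ 0.9 ≈ 2.22 Gi of container memory.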

@JakeSCahill (Contributor Author):

Thanks @chrisseto! I've adapted all the examples to use the usual bare-metal flags with `additionalRedpandaCmdFlags` in place of K8s-specific stanzas.

I've also clarified all the points you mentioned.

I took the description of --reserve-memory from the code comments in the CRD.

@JakeSCahill requested a review from chrisseto on January 24, 2025 at 17:11