DOC-884 Update recommendations for Pod resource management #946

Open · wants to merge 4 commits into base: main
84 changes: 53 additions & 31 deletions modules/manage/pages/kubernetes/k-manage-resources.adoc
@@ -27,12 +27,32 @@ kubectl describe nodes
[[memory]]
== Configure memory resources

On a worker node, Kubernetes and Redpanda processes are running at the same time, including the Seastar subsystem that is built into the Redpanda binary. Each of these processes consumes memory. You can configure the memory resources that are allocated to these processes.
On a worker node, Kubernetes and Redpanda processes are running at the same time. Redpanda's memory usage is influenced by its architecture, which leverages the Seastar framework for efficient performance.

By default, the Helm chart allocates 80% of the configured memory in `resources.memory.container` to Redpanda, with the remaining reserved for overhead such as the Seastar subsystem and other container processes.
Redpanda Data recommends this default setting.
=== Memory allocation and Seastar flags
Contributor:
OOC do we talk about resource allocation on baremetal? Last I did a search the only instances of --memory and the like showed up in our Kubernetes docs which should ideally be abstracting those details away 😅

Contributor Author:
We have a Jira to document the relevant Seastar flags for bare-metal deployments. But since we don't have that yet, I wanted to minimally explain the two flags that affect the two memory management options in K8s.

Contributor:
Ah gotcha. In that case I'm going to defer to either @dotnwat or @travisdowns on the descriptions here to avoid saying anything incorrect. My mental model of --reserve-memory is that it's needlessly convoluted, which we probably don't want to put in our docs.


NOTE: Although you can also allocate the exact amount of memory for Redpanda and the Seastar subsystem manually, Redpanda Data does not recommend this approach because setting the wrong values can lead to performance issues, instability, or data loss. As a result, this approach is not documented here.
Redpanda uses the following Seastar flags to control memory allocation:

[cols="1m,2a"]
|===
|Seastar Flag|Description

|--memory
|Specifies the memory available to the Redpanda process. This value directly impacts Redpanda's ability to manage workloads efficiently.

|--reserve-memory
|Reserves a part of memory for system overheads such as non-heap memory, page tables, and other non-Redpanda operations. This flag is designed for Seastar running on a dedicated VM rather than inside a container.
|===

*Default (legacy) behavior*: By default, the Helm chart allocates 80% of the memory in `resources.memory.container` to `--memory` and reserves 20% for `--reserve-memory`. This is legacy behavior to maintain backward compatibility. Do not use this default in production.

*Production recommendation*: Use `resources.requests.memory` for production deployments. This configuration:

- Sets `--memory` to 90% of the requested memory.
- Fixes `--reserve-memory` at 0, as Kubernetes already manages container overhead using https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#requests-and-limits[resource requests and limits^]. This simplifies memory allocation and ensures predictable resource usage.
- Configures Kubernetes resource requests for memory, enabling Kubernetes to effectively schedule and enforce memory allocation for containers.
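
For example, here is a minimal sketch of the recommended configuration, assuming a hypothetical request of 10Gi per broker; the chart would then pass roughly 9Gi (90% of the request) to `--memory` and fix `--reserve-memory` at 0:

[,yaml]
----
# Hypothetical values excerpt (sketch only): with a 10Gi request,
# roughly 90% (about 9Gi) goes to --memory and --reserve-memory is fixed at 0.
resources:
  requests:
    memory: 10Gi
----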

CAUTION: Avoid manually setting Seastar flags unless absolutely necessary. Incorrect values can lead to performance issues, instability, or data loss. If you need to set these flags, use xref:reference:k-redpanda-helm-spec.adoc#statefulset-additionalredpandacmdflags[`statefulset.additionalRedpandaCmdFlags`].
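
If you do need to override these flags, the following sketch shows one way to pass them through `statefulset.additionalRedpandaCmdFlags`; the values shown are placeholders, not recommendations:

[,yaml]
----
# Sketch only: explicitly passing Seastar flags overrides the chart's defaults.
# The values below are placeholders, not recommendations.
statefulset:
  additionalRedpandaCmdFlags:
    - --memory=8G
    - --reserve-memory=0M
----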

[tabs]
======
@@ -52,10 +72,9 @@ spec:
resources:
memory:
enable_memory_locking: true <1>
container:
# If omitted, the `min` value is equal to the `max` value (requested resources defaults to limits)
# min:
max: <number><unit> <2>
requests:
Contributor:
resources.{requests,limits} is mutually exclusive with resources.{memory,cpu,container}. If requests and limits are provided, memory, cpu, and container will be ignored.

Memory locking can be enabled with either statefulset.additionalRedpandaCmdFlags or some values in config. I know the key is enable_memory_locking but I'm not sure which stanza it goes into.

Contributor Author:
Our existing doc already mentioned resources.memory.enable_memory_locking. Should we keep that as a recommended way of setting memory locking? Or is it better to use statefulset.additionalRedpandaCmdFlags?

Contributor:
Do we have any baremetal docs that mention memory locking? If so, I'd vote to align with those, be it the CLI flag or the rpk config.

If not, I guess it's dealer's choice? I don't like resources.memory.enable_memory_locking because it's specific to Kubernetes. I like to keep interfaces as consistent as possible so transitioning from baremetal to k8s or vice versa will lead to users thinking: "<blank> is a CLI flag/RPK config. Okay, how do I set that in Kubernetes/Baremetal?" rather than needing to have a bespoke answer for every piece of configuration.

# Allocates 90% to the --memory Seastar flag
memory: <number><unit> <2>
----

```bash
@@ -76,10 +95,9 @@ Helm::
resources:
memory:
enable_memory_locking: true <1>
container:
# If omitted, the `min` value is equal to the `max` value (requested resources defaults to limits)
# min:
max: <number><unit> <2>
requests:
# Allocates 90% to the --memory Seastar flag
memory: <number><unit> <2>
----
+
```bash
@@ -92,15 +110,16 @@ helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --crea
```bash
helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
--set resources.memory.enable_memory_locking=true \ <1>
--set resources.memory.container.max=<number><unit> <2>
--set resources.requests.memory=<number><unit> <2>
```

====
--
======

<1> For production, enable memory locking to prevent the operating system from paging out Redpanda's memory to disk, which can significantly impact performance.
<2> The amount of memory to give Redpanda, Seastar, and the other container processes. You should give Redpanda at least 2 Gi of memory per core. Given that the Helm chart allocates 80% of the container's memory to Redpanda, leaving the rest for the Seastar subsystem and other processes, set this value to at least 2.5 Gi per core to ensure Redpanda has a full 2 Gi. Redpanda supports the following memory resource units: B, K, M, G, Ki, Mi, and Gi. Memory units are converted to the nearest whole MiB. For a description of memory resource units, see the https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory[Kubernetes documentation^].
<1> Enabling memory locking prevents the operating system from paging out Redpanda's memory to disk. This can significantly improve performance by ensuring Redpanda has uninterrupted access to its allocated memory.

<2> Allocate at least 2.5 Gi of memory per core to ensure Redpanda has the 2 Gi per core it requires after accounting for the 90% allocation to the `--memory` flag. Redpanda supports the following memory resource units: B, K, M, G, Ki, Mi, and Gi. Memory units are converted to the nearest whole MiB. For a description of memory resource units, see the https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory[Kubernetes documentation^].
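
For a hypothetical 4-core broker, the arithmetic in callout <2> works out as follows: requesting 4 × 2.5Gi = 10Gi means roughly 0.9 × 10Gi = 9Gi is passed to `--memory`, or about 2.25Gi per core, which stays above the 2Gi per core that Redpanda needs.
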
Contributor:
> Memory units are converted to the nearest whole MiB.

nit: I'm not 100% certain how to best phrase this because I'd describe it in code 😓 but memory units are floored or truncated to MiB. Rounded down, maybe?

> Allocate at least 2.5 Gi of memory per core to ensure Redpanda has the 2 Gi per core it requires after accounting for the 90% allocation to the --memory flag.

This math doesn't add up. Wouldn't that be ~2.2Gi per core? Or do we want there to be 2.5Gi per core intentionally including the overhead? (This may be a good question for core performance)


[[qos]]
== Quality of service and resource guarantees
@@ -129,12 +148,12 @@ spec:
chartRef: {}
clusterSpec:
resources:
cpu:
cores: <number-of-cpu-cores>
memory:
container:
min: <redpanda-container-memory>
max: <redpanda-container-memory>
requests:
cpu: <number-of-cpu-cores>
memory: <redpanda-container-memory>
limits:
cpu: <number-of-cpu-cores> # Matches the request
memory: <redpanda-container-memory> # Matches the request
statefulset:
sideCars:
configWatcher:
@@ -188,12 +207,12 @@ Helm::
[,yaml]
----
resources:
cpu:
cores: <number-of-cpu-cores>
memory:
container:
min: <redpanda-container-memory>
max: <redpanda-container-memory>
requests:
cpu: <number-of-cpu-cores>
memory: <redpanda-container-memory>
limits:
cpu: <number-of-cpu-cores> # Matches the request
memory: <redpanda-container-memory> # Matches the request
statefulset:
sideCars:
configWatcher:
@@ -240,9 +259,10 @@ helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --crea
+
```bash
helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
--set resources.cpu.cores=<number-of-cpu-cores> \
--set resources.memory.container.min=<redpanda-container-memory> \
--set resources.memory.container.max=<redpanda-container-memory> \
--set resources.requests.cpu=<number-of-cpu-cores> \
--set resources.limits.cpu=<number-of-cpu-cores> \
--set resources.requests.memory=<redpanda-container-memory> \
--set resources.limits.memory=<redpanda-container-memory> \
--set statefulset.sideCars.configWatcher.resources.requests.cpu=<redpanda-sidecar-container-cpu> \
--set statefulset.sideCars.configWatcher.resources.requests.memory=<redpanda-sidecar-container-memory> \
--set statefulset.sideCars.configWatcher.resources.limits.cpu=<redpanda-sidecar-container-cpu> \
@@ -283,7 +303,9 @@ If you use PersistentVolumes, you can set the storage capacity for each volume.

If Redpanda runs in a shared environment, where multiple applications run on the same worker node, you can make Redpanda less aggressive in CPU usage by enabling overprovisioning. This adjustment ensures a fairer distribution of CPU time among all processes, improving overall system efficiency at the cost of Redpanda's performance.

You can enable overprovisioning by either setting the CPU request to a fractional value or setting `overprovisioned` to `true`.
You can enable overprovisioning by either setting the CPU request to a fractional value or setting `resources.cpu.overprovisioned` to `true`.
Contributor:
nit: clarify that fractional must be < 1 for overprovisioned to be set

Suggested change:
- You can enable overprovisioning by either setting the CPU request to a fractional value or setting `resources.cpu.overprovisioned` to `true`.
+ You can enable overprovisioning by either setting the CPU request to a fractional value less than 1 or setting `resources.cpu.overprovisioned` to `true`.

And similarly to the comment above, overprovisioned can be set with either a CLI flag or an RPK config as well. Let's align on either CLI flags or RPK config values for this as well.
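
To make those options concrete, here is a minimal sketch with placeholder values, assuming the `resources.cpu` block is used (that is, `resources.requests` and `resources.limits` are not set):

[,yaml]
----
# Option 1 (sketch): a fractional CPU request of less than one core
# implies overprovisioning.
resources:
  cpu:
    cores: 200m

# Option 2 (sketch): keep whole cores and set the flag explicitly.
# resources:
#   cpu:
#     cores: 2
#     overprovisioned: true
----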


NOTE: When `resources.requests` or `resources.limits` are set, the `resources.cpu` parameter (including cores) is ignored. Ensure that you have not configured CPU requests and limits explicitly to avoid unexpected behavior in shared environments.
Contributor:
This condition is that both requests and limits are set:

Suggested change:
- NOTE: When `resources.requests` or `resources.limits` are set, the `resources.cpu` parameter (including cores) is ignored. Ensure that you have not configured CPU requests and limits explicitly to avoid unexpected behavior in shared environments.
+ NOTE: When `resources.requests` and `resources.limits` are set, the `resources.cpu` parameter (including cores) is ignored. Ensure that you have not configured CPU requests and limits explicitly to avoid unexpected behavior in shared environments.

I'm not sure what the final sentence here means?
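
As a sketch of the precedence the note describes (hypothetical values): when both requests and limits are present, anything under `resources.cpu` is ignored:

[,yaml]
----
# Hypothetical values: requests and limits are both set,
# so the resources.cpu block (cores and overprovisioned) is ignored.
resources:
  cpu:
    cores: 4
    overprovisioned: true
  requests:
    cpu: 2
  limits:
    cpu: 2
----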


[tabs]
======