
Bug: Resources are not properly isolated in Standalone mode #19245

Open
Li0k opened this issue Nov 4, 2024 · 3 comments
Labels: type/bug (Something isn't working)
Milestone: release-2.2

Comments

Li0k (Contributor) commented Nov 4, 2024

When RisingWave starts in the default (distributed) mode, each component is deployed in a separate pod and uses its own memory.

For example:

  • The compute node (CN) calculates the memory available to Hummock with the function `storage_memory_config`.
  • The compactor by default uses all the memory the system provides and divides it between the worker and the cache.

However, this can lead to OOM in Standalone mode: Standalone mode does not isolate CPU and memory resources between components. Competition for CPU is acceptable, but competition for memory is not.
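As a rough illustration of the problem (not RisingWave's actual code), a standalone deployment would need to split a single memory budget up front instead of letting each embedded component size itself against the full system memory. The function name and the 70/30 ratio below are purely hypothetical:

```rust
/// Hypothetical sketch only: split one standalone memory budget between the
/// embedded compute node (what `storage_memory_config` would be given) and the
/// embedded compactor, rather than letting both see the full system memory.
fn split_standalone_memory(total_bytes: u64) -> (u64, u64) {
    let compute_share = total_bytes / 10 * 7; // assumed 70% to compute (storage + streaming)
    let compactor_share = total_bytes - compute_share; // remainder to the compactor
    (compute_share, compactor_share)
}

fn main() {
    let total = 16u64 << 30; // e.g. a 16 GiB standalone deployment
    let (compute, compactor) = split_standalone_memory(total);
    println!("compute node budget: {} MiB", compute >> 20);
    println!("compactor budget:    {} MiB", compactor >> 20);
}
```

Today both sides each size themselves against the full system memory, which is what lets the combined usage exceed the machine.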

Li0k added the type/feature and type/bug (Something isn't working) labels on Nov 4, 2024
github-actions bot added this to the release-2.2 milestone on Nov 4, 2024
Li0k removed the type/feature label on Nov 4, 2024
Li0k (Contributor, Author) commented Nov 4, 2024

cc @kwannoel @hzxa21

lmatz (Contributor) commented Nov 5, 2024

I wonder:

  1. Whether there is a minimum memory requirement for the compactor.
  2. Whether there is any relationship between the compactor's live CPU usage and its memory usage, e.g. if the compactor is using 2 CPUs at the moment, is there a cap on its memory usage?

I am also wondering whether we should limit the compactor's CPU usage a bit to ensure stability, at the cost of potentially leaving idle resources on the table, e.g. with 8 CPUs in total, the compactor could take at most half = 4 CPUs (a sketch of this idea follows below).
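A minimal sketch of the "half the cores" idea, assuming the compactor runs on its own Tokio runtime. The builder calls are standard Tokio; nothing here reflects RisingWave's actual configuration surface:

```rust
use tokio::runtime::{Builder, Runtime};

/// Sketch only: build a dedicated compactor runtime capped at half of the
/// available cores (e.g. 8 cores -> at most 4 worker threads), so compaction
/// cannot starve the other embedded components in standalone mode.
fn build_compactor_runtime() -> std::io::Result<Runtime> {
    let total_cores = std::thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    let compactor_threads = (total_cores / 2).max(1);
    Builder::new_multi_thread()
        .worker_threads(compactor_threads)
        .thread_name("compactor")
        .enable_all()
        .build()
}
```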

Li0k (Contributor, Author) commented Nov 5, 2024

> I wonder:
>
>   1. Whether there is a minimum memory requirement for the compactor.
>   2. Whether there is any relationship between the compactor's live CPU usage and its memory usage, e.g. if the compactor is using 2 CPUs at the moment, is there a cap on its memory usage?
>
> I am also wondering whether we should limit the compactor's CPU usage a bit to ensure stability, at the cost of potentially leaving idle resources on the table, e.g. with 8 CPUs in total, the compactor could take at most half = 4 CPUs.

  1. It depends on the configuration of `sstable_size` and `block_size` (see the worked example after this list):

         let min_compactor_memory_limit_bytes = (storage_opts.sstable_size_mb * (1 << 20)
             + storage_opts.block_size_kb * (1 << 10)) as u64;

  2. The number of concurrent compaction tasks is limited by the number of available CPU cores, and the more tasks run, the more memory is used. How much memory each task consumes depends on the content of the task, and there is no general formula for calculating it. (CPU cores are not isolated.)
  3. I have no strong preference on CPU limits; relying on Tokio scheduling, CPU competition is fair and doesn't cause fatal problems.
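A worked example of the minimum-memory formula above, assuming `sstable_size_mb = 256` and `block_size_kb = 64` (defaults assumed here for illustration; the real values come from your `storage_opts`):

```rust
fn main() {
    // Assumed defaults for illustration; the real values come from storage_opts.
    let sstable_size_mb: u64 = 256;
    let block_size_kb: u64 = 64;
    // One SSTable plus one block is the minimum the compactor needs.
    let min_compactor_memory_limit_bytes =
        sstable_size_mb * (1 << 20) + block_size_kb * (1 << 10);
    // 268_435_456 + 65_536 = 268_500_992 bytes, i.e. roughly 256 MiB.
    println!("{min_compactor_memory_limit_bytes}");
}
```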

Also, if we plan to deploy standalone on a high-spec machine (e.g. 64 cores, 256 GB), then, as you said, the lack of CPU isolation may cause stability issues, as it did in the affinity test.
