
[Feature] Support Adaptive Memtable Flush Thread Pool Adjustment #60616

@sollhui


Search before asking

  • I had searched in the issues and found no similar issues.

Description

Background

Currently, the number of memtable flush threads in Doris must be tuned manually by users, SREs, or developers for each business load scenario. This manual tuning imposes heavy operational overhead, and misconfiguration can cause resource contention (CPU/IO bottlenecks), excessive small file generation, or memtable memory overflow, seriously affecting the stability and performance of the Doris cluster.

Implement an adaptive Memtable flush thread pool adjustment mechanism that dynamically calculates and modifies the maximum concurrent flush thread count based on real-time cluster load metrics. The core design is optimized for Doris's storage-compute integrated and storage-compute separated deployment architectures.

Core Design Details

1. Real-time Load Metric Collection

Periodically collect multi-dimensional metrics that reflect the current system state (all metrics are stored atomically, so collection is thread-safe):

  • Memtable total memory usage (monitor soft/hard memory limit thresholds)
  • Memtable flush task queue backlog size
  • Disk IO busy status (implementation differs by deployment architecture)
    • Storage-compute integrated: determined from the disk IO util metric
    • Storage-compute separated (S3/HDFS): determined from the queue length of the S3 write thread pool
  • CPU usage of BE nodes (to avoid context-switch overhead caused by excessive flush threads)
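The metrics above could be kept in a small snapshot struct whose fields are independent atomics, so the collector thread and the adjustment thread never race. A minimal sketch, assuming hypothetical names that are not the actual Doris code:

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical snapshot of the load metrics sampled by the monitor thread.
// Each field is an independent atomic; readers may observe values from
// slightly different sampling rounds, which is acceptable for coarse tuning.
struct FlushLoadMetrics {
    std::atomic<int64_t> memtable_mem_bytes{0};   // total memtable memory usage
    std::atomic<int64_t> flush_queue_size{0};     // backlogged flush tasks
    std::atomic<int>     disk_io_util_pct{0};     // integrated mode: disk IO util%
    std::atomic<int64_t> s3_upload_queue_size{0}; // separated mode: S3 pool queue
    std::atomic<int>     cpu_usage_pct{0};        // BE node CPU usage
};

// Example collector update, called periodically from a monitor thread.
inline void record_sample(FlushLoadMetrics& m, int64_t mem, int64_t queue,
                          int io_util, int64_t s3_queue, int cpu) {
    m.memtable_mem_bytes.store(mem, std::memory_order_relaxed);
    m.flush_queue_size.store(queue, std::memory_order_relaxed);
    m.disk_io_util_pct.store(io_util, std::memory_order_relaxed);
    m.s3_upload_queue_size.store(s3_queue, std::memory_order_relaxed);
    m.cpu_usage_pct.store(cpu, std::memory_order_relaxed);
}
```

Relaxed ordering is enough here because each metric is an independent sample and no cross-field consistency is required.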

2. Adaptive Flush Thread Count Calculation

Add a dedicated calculation routine (executed every minute) that dynamically adjusts the base concurrent flush thread count, with upper/lower limits to avoid extreme values. The core judgment rules are:

int CalculateMaxConcurrentFlush() {
    // Condition 1: memtable memory reaches the soft limit -> +1
    // (accessor name illustrative)
    if (_memory_limiter != nullptr &&
        _memory_limiter->mem_usage() > _memory_limiter->soft_mem_limit()) {
        base_concurrent = std::min(max_threads, base_concurrent + 1);
    }
    // Condition 2: flush queue backlog exceeds threshold (e.g. 10) -> +1
    int queue_size = _flush_pool->get_queue_size();
    if (queue_size > kFlushQueueThreshold) {
        base_concurrent = std::min(max_threads, base_concurrent + 1);
    }
    // Condition 3: IO busy -> -1
    //   Storage-compute integrated: disk IO util > 90%
    //   Storage-compute separated (cloud): S3 upload queue > threshold
    if (_is_io_busy()) {
        base_concurrent = std::max(min_threads, base_concurrent - 1);
    }
    // Condition 4: CPU usage > 90% -> -1
    if (_is_cpu_busy()) {
        base_concurrent = std::max(min_threads, base_concurrent - 1);
    }
    ...
}
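The four rules can be exercised in isolation as a pure function over the sampled metrics, which makes the clamping behavior easy to unit-test. A self-contained sketch under assumed names and thresholds (not the actual Doris implementation):

```cpp
#include <algorithm>

// Hypothetical inputs mirroring the four rules above.
struct AdjustInput {
    bool mem_over_soft_limit;  // rule 1: memtable memory over soft limit
    int  flush_queue_size;     // rule 2: flush queue backlog
    bool io_busy;              // rule 3: disk util > 90% or S3 queue > threshold
    bool cpu_busy;             // rule 4: CPU usage > 90%
};

constexpr int kFlushQueueThreshold = 10;

// Pure version of the calculation: each triggered rule nudges the base
// count by one, clamped to [min_threads, max_threads].
int adjust_concurrency(int base, int min_threads, int max_threads,
                       const AdjustInput& in) {
    if (in.mem_over_soft_limit) base = std::min(max_threads, base + 1);
    if (in.flush_queue_size > kFlushQueueThreshold) {
        base = std::min(max_threads, base + 1);
    }
    if (in.io_busy)  base = std::max(min_threads, base - 1);
    if (in.cpu_busy) base = std::max(min_threads, base - 1);
    return base;
}
```

Note that the rules deliberately adjust by at most ±2 per round; combined with the one-minute cadence, this damps oscillation under bursty load.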

3. Flush Memtable Thread Count

In the storage-compute separated deployment mode there is no direct disk IO interaction, so disk metrics are not a consideration there. We therefore unify the calculation of the thread count limits for both storage-compute integrated and storage-compute separated architectures, deriving them from the number of CPU cores:

  • Minimum thread count: num_cpus * config::min_flush_thread_num_per_cpu (default: 1/2 per CPU)
  • Maximum thread count: num_cpus * config::max_flush_thread_num_per_cpu (default: 4 per CPU)
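Under this unified rule the bounds follow directly from the core count. A small sketch with the defaults stated above (the constant names stand in for the `config::` options and are assumptions for illustration):

```cpp
#include <algorithm>

// Illustrative defaults matching the text: 1/2 thread per CPU minimum,
// 4 threads per CPU maximum.
constexpr double kMinFlushThreadNumPerCpu = 0.5;
constexpr double kMaxFlushThreadNumPerCpu = 4.0;

struct FlushThreadBounds {
    int min_threads;
    int max_threads;
};

// Derive the flush thread pool bounds from the CPU core count,
// guaranteeing at least one thread even on a single-core node.
FlushThreadBounds compute_flush_bounds(int num_cpus) {
    int min_t = std::max(1, static_cast<int>(num_cpus * kMinFlushThreadNumPerCpu));
    int max_t = std::max(min_t, static_cast<int>(num_cpus * kMaxFlushThreadNumPerCpu));
    return {min_t, max_t};
}
```

For example, an 8-core BE would get a range of 4 to 32 flush threads under these defaults.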

4. Bad Case and Mitigation

  • In mixed scenarios (e.g., large queries running alongside data import), sustained high CPU/IO usage could drive the flush thread count down continuously until it bottoms out. Mitigation: set a reasonable minimum thread count to stop the reduction and prevent flush task backlog.
  • Setting the minimum thread count above the smallest possible value may leave a significant portion of the flush thread pool idle when the write load is low.

5. Extensible Class Design

Introduce a new generic adaptive configuration class to manage the flush thread pool parameters; it can later be extended to adaptively tune other write-module parameters in Doris, keeping the code reusable.
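One way such a generic class could look is a small wrapper that pairs a clamped value with a pluggable recalculation callback; the flush thread pool would supply the four-rule logic as its callback, and future write-path parameters could supply their own. A sketch under assumed names, not the actual Doris design:

```cpp
#include <algorithm>
#include <functional>
#include <utility>

// Hypothetical generic adaptive parameter: holds a current value inside
// [min, max] and re-derives it periodically via a user-supplied rule.
template <typename T>
class AdaptiveConfig {
public:
    AdaptiveConfig(T initial, T min_v, T max_v, std::function<T(T)> recalc)
        : _value(std::clamp(initial, min_v, max_v)),
          _min(min_v),
          _max(max_v),
          _recalc(std::move(recalc)) {}

    // Called by the periodic adjustment thread (e.g. once per minute);
    // the callback's result is clamped back into the legal range.
    T update() {
        _value = std::clamp(_recalc(_value), _min, _max);
        return _value;
    }

    T value() const { return _value; }

private:
    T _value;
    T _min;
    T _max;
    std::function<T(T)> _recalc;
};
```

Keeping the min/max clamp inside the class means every adopter of `AdaptiveConfig` gets the "avoid extreme values" guarantee for free, regardless of its recalculation rule.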

Use case

No response

Related issues

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
