Add cut-off mechanism for cell relocation to improve performance #77

@asonnino

Description

Problem

Currently, cell relocation processes every cell regardless of its deletion ratio or size. This wastes work on cells that gain little from relocation.

Proposed Solutions

Option 1: Deletion Ratio Cut-off

Add a cut-off at the cell level: if a cell has very few entries to remove (e.g., a deletion ratio below 10%), skip relocating it. Bloom filter misses can serve as a proxy for deletions:

if let Some(bloom) = bloom_filters.get(&cell_ref.keyspace_desc.id()) {
    // Count entries and Bloom filter misses in a single pass over the index.
    let mut total_entries = 0usize;
    let mut likely_removals = 0usize;
    for (key, position) in index.iter() {
        total_entries += 1;
        let reduced_key = cell_ref.keyspace_desc.reduce_key(key);
        if !bloom.contains(&LargeTable::bloom_key(&reduced_key, *position)) {
            likely_removals += 1;
        }
    }

    // An empty cell would give 0.0 / 0.0 = NaN, so check the count explicitly.
    let removal_ratio = likely_removals as f32 / total_entries as f32;
    if total_entries > 0 && removal_ratio < 0.1 {
        return Ok(context); // Skip cell: too few entries to remove
    }
}
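
To make the cut-off easier to test in isolation, the ratio computation could be pulled out into a small helper. The following is only a sketch under stated assumptions: `LiveKeys`, `removal_ratio`, and `should_skip_cell` are hypothetical names, not part of the codebase, and an exact `HashSet` stands in for the Bloom filter. A real Bloom filter can report false positives, so the ratio it yields would be a lower bound on the true deletion ratio. Treating an empty cell as "do not skip" is also an assumption.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the per-keyspace Bloom filter: an exact set of live keys.
pub struct LiveKeys(HashSet<Vec<u8>>);

impl LiveKeys {
    pub fn new<I: IntoIterator<Item = Vec<u8>>>(keys: I) -> Self {
        Self(keys.into_iter().collect())
    }

    pub fn contains(&self, key: &[u8]) -> bool {
        self.0.contains(key)
    }
}

/// Fraction of `keys` that are absent from `live`, or `None` if the cell is empty.
pub fn removal_ratio<'a, I>(keys: I, live: &LiveKeys) -> Option<f64>
where
    I: IntoIterator<Item = &'a [u8]>,
{
    let mut total = 0usize;
    let mut missing = 0usize;
    for key in keys {
        total += 1;
        if !live.contains(key) {
            missing += 1;
        }
    }
    if total == 0 {
        None
    } else {
        Some(missing as f64 / total as f64)
    }
}

/// Skip relocation when the removal ratio is below `min_ratio`.
/// Assumption: an empty cell (`None`) is not skipped by this cut-off.
pub fn should_skip_cell(ratio: Option<f64>, min_ratio: f64) -> bool {
    matches!(ratio, Some(r) if r < min_ratio)
}
```

With a 10% cut-off, a cell where one of four keys is missing (ratio 0.25) would still be relocated, while a cell with no missing keys would be skipped.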

Option 2: Cell Size Threshold

Skip cells smaller than a certain threshold size:

if index.iter().count() < CELL_SIZE_THRESHOLD {
    // Skip this cell entirely
    return Ok(context);
}
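
If the index only exposes an iterator, `count()` walks every entry just to compare against the threshold. One possible refinement, sketched below, is to stop counting once the threshold is reached. The helper name `is_smaller_than` is hypothetical, and whether the real index type offers a cheaper length lookup is not known from this issue.

```rust
/// Returns true if `iter` yields fewer than `threshold` items.
/// Consumes at most `threshold` items, so large cells are not fully traversed.
pub fn is_smaller_than<I: IntoIterator>(iter: I, threshold: usize) -> bool {
    iter.into_iter().take(threshold).count() < threshold
}
```

Under this sketch, the check in Option 2 would read `if is_smaller_than(index.iter(), CELL_SIZE_THRESHOLD) { return Ok(context); }`, with the same skip behaviour as the original comparison.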
