storage: book keep dirty_ratio in disk_log_impl #24649

Open
wants to merge 5 commits into
base: dev
from

Conversation

WillemKauf
Contributor

@WillemKauf WillemKauf commented Dec 23, 2024

The dirty ratio of a log is defined as the ratio between the number of bytes in "dirty" segments and the total number of bytes in closed segments.

Dirty segments are closed segments that have not yet been cleanly compacted, i.e., duplicates for keys in such a segment could still be found in the prefix of the log up to that segment.

Add book-keeping to disk_log_impl to cache both _dirty_segment_bytes and _closed_segment_bytes, which allows us to calculate the dirty ratio, and add observability for both values in storage::probe.

In the future, this could be used in combination with a compaction configuration a la min.cleanable.dirty.ratio to schedule compaction.
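As a rough illustration of the definition above, the dirty ratio can be derived from the two cached counters. This is a minimal sketch and not the code in this PR: the struct `segment_byte_counters`, its member functions, and the threshold check `exceeds_min_cleanable_ratio()` are hypothetical names, and returning 0.0 for a log with no closed segments is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two counters cached in disk_log_impl.
struct segment_byte_counters {
    std::uint64_t dirty_segment_bytes = 0;
    std::uint64_t closed_segment_bytes = 0;

    // dirty ratio = bytes in dirty segments / bytes in closed segments.
    // Assumption: a log with no closed segments has a dirty ratio of 0.0.
    double dirty_ratio() const {
        // Dirty segments are a subset of closed segments.
        assert(dirty_segment_bytes <= closed_segment_bytes);
        if (closed_segment_bytes == 0) {
            return 0.0;
        }
        return static_cast<double>(dirty_segment_bytes)
               / static_cast<double>(closed_segment_bytes);
    }

    // Hypothetical scheduling check in the spirit of
    // min.cleanable.dirty.ratio; the comparison used here is an assumption.
    bool exceeds_min_cleanable_ratio(double min_cleanable_dirty_ratio) const {
        return dirty_ratio() > min_cleanable_dirty_ratio;
    }
};
```

For example, with 256 dirty bytes out of 1024 closed bytes the ratio is 0.25, so a threshold of 0.5 would not trigger compaction under this sketch, while a threshold of 0.2 would.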

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.3.x
  • v24.2.x
  • v24.1.x

Release Notes

Improvements

  • Adds the observable metrics dirty_segment_bytes and closed_segment_bytes to the storage layer.

@vbotbuildovich
Collaborator

vbotbuildovich commented Dec 24, 2024

CI test results

test results on build#60092

| test_id | test_kind | job_url | test_status | passed |
|---|---|---|---|---|
| gtest_raft_rpunit.gtest_raft_rpunit | unit | https://buildkite.com/redpanda/redpanda/builds/60092#0193f576-772e-4d29-8e01-3035f1e9661c | FLAKY | 1/2 |

test results on build#60241

| test_id | test_kind | job_url | test_status | passed |
|---|---|---|---|---|
| rptest.transactions.tx_atomic_produce_consume_test.TxAtomicProduceConsumeTest.test_basic_tx_consumer_transform_produce.with_failures=True | ducktape | https://buildkite.com/redpanda/redpanda/builds/60241#019429a2-9e8f-413c-87cf-33a7c04d8757 | FAIL | 0/1 |

Adds the book-keeping variables `_dirty/closed_segment_bytes` to
`disk_log_impl`, as well as some getter/setter functions.

These functions will be used throughout `disk_log_impl` where required
(segment rolling, compaction, segment eviction) to track the bytes
contained in dirty and closed segments.

Uses the added functions `update_dirty/closed_segment_bytes()`
in the required locations within `disk_log_impl` in order
to bookkeep the dirty ratio.

Bytes can be either removed or added by rolling new segments,
compaction, and retention enforcement.
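
The three events named above (rolling, compaction, retention enforcement) could adjust the counters roughly as follows. This is a simplified model under stated assumptions, not the `disk_log_impl` implementation: the class `dirty_ratio_tracker` and its `on_*` member functions are hypothetical, and it assumes that a freshly rolled segment counts as dirty until it has been cleanly compacted.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the bookkeeping; names are illustrative only.
class dirty_ratio_tracker {
public:
    // Segment roll: the previously active segment becomes closed.
    // Assumption: it has never been compacted, so it is also dirty.
    void on_segment_rolled(std::uint64_t bytes) {
        _closed_segment_bytes += bytes;
        _dirty_segment_bytes += bytes;
    }

    // Clean compaction of a dirty segment: it stops being dirty, and the
    // closed total shrinks by the bytes that compaction removed.
    void on_segment_cleanly_compacted(
      std::uint64_t bytes_before, std::uint64_t bytes_after) {
        assert(bytes_after <= bytes_before);
        assert(bytes_before <= _dirty_segment_bytes);
        _dirty_segment_bytes -= bytes_before;
        _closed_segment_bytes -= (bytes_before - bytes_after);
    }

    // Retention enforcement: an evicted segment leaves the closed total,
    // and the dirty total as well if it was still dirty.
    void on_segment_evicted(std::uint64_t bytes, bool was_dirty) {
        assert(bytes <= _closed_segment_bytes);
        _closed_segment_bytes -= bytes;
        if (was_dirty) {
            assert(bytes <= _dirty_segment_bytes);
            _dirty_segment_bytes -= bytes;
        }
    }

    std::uint64_t dirty_segment_bytes() const { return _dirty_segment_bytes; }
    std::uint64_t closed_segment_bytes() const { return _closed_segment_bytes; }

private:
    std::uint64_t _dirty_segment_bytes = 0;
    std::uint64_t _closed_segment_bytes = 0;
};
```

In this model, rolling segments of 100 and 300 bytes gives 400 dirty and 400 closed bytes; cleanly compacting the first segment down to 60 bytes leaves 300 dirty and 360 closed bytes.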
@WillemKauf WillemKauf force-pushed the dirty_ratio_compaction branch from 018f7a2 to 7bd15b2 Compare January 2, 2025 23:05
@WillemKauf WillemKauf requested review from dotnwat and andrwng January 2, 2025 23:06
@vbotbuildovich
Collaborator

Retry command for Build#60241

please wait until all jobs are finished before running the slash command

/ci-repeat 1
tests/rptest/transactions/tx_atomic_produce_consume_test.py::TxAtomicProduceConsumeTest.test_basic_tx_consumer_transform_produce@{"with_failures":true}

2 participants