[CORE-8637]: `storage`: fix race between `disk_log_impl::new_segment()` and `disk_log_impl::close()` #24635
base: dev
`disk_log_impl::close()` and `disk_log_impl::remove()` both set `_closed` to `true`. There is an assert in `new_segment()`, evaluated while holding `_segments_rolling_lock`, which checks that the log is not closed. However, neither `close()` nor `remove()` previously respected this lock, which could lead to a race condition and a triggered assert, `"cannot add log segment to closed log"`. Await the `_segments_rolling_lock` in `close()` and `remove()`, and additionally mark it as `broken()` for future waiters before setting `_closed = true` to avoid this race.
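The ordering described above can be illustrated with a minimal sketch in standard C++. This is not the Seastar-based code from the PR: `breakable_lock`, `mark_broken()`, `toy_log`, and `segment_count()` are hypothetical names invented for illustration, and a `std::mutex` with a condition variable stands in for `_segments_rolling_lock`. The point is only the sequence in `close()`: take the lock, mark it broken, then set the closed flag.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <stdexcept>

// Hypothetical stand-in for a lock that can be marked broken so that
// later (and currently blocked) waiters fail instead of acquiring it.
class breakable_lock {
public:
    // Blocks until the lock is free; throws if the lock has been broken.
    void lock() {
        std::unique_lock<std::mutex> guard(_m);
        _cv.wait(guard, [this] { return !_held || _broken; });
        if (_broken) {
            throw std::runtime_error("lock broken");
        }
        _held = true;
    }

    void unlock() {
        {
            std::lock_guard<std::mutex> guard(_m);
            _held = false;
        }
        _cv.notify_all();
    }

    // After this call every blocked or future lock() throws.
    void mark_broken() {
        {
            std::lock_guard<std::mutex> guard(_m);
            _broken = true;
        }
        _cv.notify_all();
    }

private:
    std::mutex _m;
    std::condition_variable _cv;
    bool _held = false;
    bool _broken = false;
};

// Toy model of a log (an assumption, not the real disk_log_impl).
class toy_log {
public:
    // Returns true if a segment was added, false if the log is closing.
    bool new_segment() {
        try {
            _roll_lock.lock();
        } catch (const std::runtime_error&) {
            return false;  // lock was broken by close(): do not roll
        }
        // Checked under the lock; close() cannot flip _closed concurrently.
        assert(!_closed && "cannot add log segment to closed log");
        ++_segments;
        _roll_lock.unlock();
        return true;
    }

    // Must be called at most once in this sketch.
    void close() {
        _roll_lock.lock();         // wait for any in-flight segment roll
        _roll_lock.mark_broken();  // reject waiters from now on
        _closed = true;            // safe: no roll can pass the lock again
        _roll_lock.unlock();
    }

    // Not synchronized; intended for use after all rolls have finished.
    int segment_count() const { return _segments; }

private:
    breakable_lock _roll_lock;
    bool _closed = false;
    int _segments = 0;
};
```

In this sketch, a `new_segment()` call made after `close()` returns `false` instead of reaching the assert, because the broken lock rejects it before the closed check runs.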
@WillemKauf I'm wondering who initiates the segment roll while a log is being closed? Is there missing higher-level locking or bad management of resources? Is it possible that fixing the problem this way will just lead to another class of issues, like asserts when writing to or reading from a closed log?
It could be F1: Enters By waiting to obtain
I don't believe these changes would introduce those issues, but I'm also not going to claim they couldn't already exist as a separate race condition. Will have to re-read some code to form a concrete opinion here.
Backports Required
Release Notes
Bug Fixes
vassert()
.