
[Enhancement] Reduce concurrent conflicts between block write operations and poll operations #1550

Open
CLFutureX opened this issue Jul 11, 2024 · 3 comments


CLFutureX commented Jul 11, 2024

Background:
Currently, there is intense lock contention between the block write and poll operations, which degrades the write performance of the Write-Ahead Log (WAL).
path: com.automq.stream.s3.wal.impl.block.SlidingWindowService

Current Status:
During the current block write process:

  1. A block is first acquired, and any fully written blocks are added to pendingBlocks.
  2. The writer then attempts to poll a ready block and hand it over to the IO thread pool for writing. This operation has a cool-down time, defaulting to 1/3000 of a second.
  3. In addition, a separate single-threaded scheduled executor runs a task that performs the same operation as step 2, at a default interval of 1/1000 of a second.
  4. IO threads complete the block writes and update writeBlocks.
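
The steps above can be modeled with a minimal sketch (class and method names are illustrative, not the actual SlidingWindowService API): all three steps funnel through one blockLock, so they serialize against each other.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.ReentrantLock;

// Simplified model of the current design: steps 1, 2, and 4 all
// acquire the same blockLock, so writers, pollers, and IO-completion
// threads contend on a single lock.
class SingleLockModel {
    private final ReentrantLock blockLock = new ReentrantLock();
    private final Queue<String> pendingBlocks = new ArrayDeque<>();
    private String currentBlock = "block-0";
    private int nextId = 1;

    // Step 1: seal the current block into pendingBlocks and start a new one.
    String sealAndCreate() {
        blockLock.lock();
        try {
            pendingBlocks.add(currentBlock);
            currentBlock = "block-" + nextId++;
            return currentBlock;
        } finally {
            blockLock.unlock();
        }
    }

    // Steps 2/3: a writer or the scheduled thread polls a ready block for IO.
    String pollBlock() {
        blockLock.lock();
        try {
            return pendingBlocks.poll();
        } finally {
            blockLock.unlock();
        }
    }

    // Step 4: an IO thread records completion, again under the same lock.
    void finishWrite(String block) {
        blockLock.lock();
        try {
            // the real service would update writeBlocks / WindowCoreData here
        } finally {
            blockLock.unlock();
        }
    }
}
```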

Conflict Points:

  1. Since Steps 1, 2, and 4 all acquire the same lock, blockLock, conflicts among them are inevitable regardless of the circumstances.
  2. In addition, because every block write proactively attempts Step 2, the writing thread also conflicts with the scheduled thread, further intensifying the contention between write and poll operations.

Solution:
Optimization Approach: The write and poll operations should be separated to minimize concurrent conflicts.

1. Lock Separation Optimization

To decouple writing blocks from polling blocks, lock separation can be applied: a dedicated lock, pollBlockLock, can be introduced for the polling operation.
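
A sketch of the proposed separation, assuming a dedicated pollBlockLock and a thread-safe pendingBlocks (the lock names follow the proposal; everything else is illustrative):

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.locks.ReentrantLock;

// With lock separation, writers take blockLock and the poller takes
// pollBlockLock; the hand-off point is a thread-safe queue, so the two
// sides no longer serialize on a single lock.
class SeparatedLocks {
    private final ReentrantLock blockLock = new ReentrantLock();      // write side (steps 1 and 4)
    private final ReentrantLock pollBlockLock = new ReentrantLock();  // poll side only
    private final LinkedBlockingQueue<String> pendingBlocks = new LinkedBlockingQueue<>();

    // Write side: seal a full block under blockLock only.
    void sealBlock(String block) {
        blockLock.lock();
        try {
            pendingBlocks.offer(block);
        } finally {
            blockLock.unlock();
        }
    }

    // Poll side: drain ready blocks under pollBlockLock only.
    String pollReadyBlock() {
        pollBlockLock.lock();
        try {
            return pendingBlocks.poll();
        } finally {
            pollBlockLock.unlock();
        }
    }
}
```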

2. Shared Resource Handling:

After implementing lock separation, the next challenge is managing shared resources.

  • pendingBlocks: Both writing and polling modify pendingBlocks, so it should be implemented as a thread-safe queue, such as LinkedBlockingQueue.

  • currentBlock: Both writing and polling access the current block, so conflicts between the two are otherwise unavoidable.
    To mitigate this, a batching time can be introduced, set to the current minWriteIntervalNanos. During polling, a time-based decision determines whether to include currentBlock in the poll. Only if it is included is blockLock acquired, potentially causing a conflict; otherwise, no conflict arises.
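
The time-based decision for currentBlock could look like the following sketch (method and parameter names are hypothetical):

```java
// If the current block has been open for less than minWriteIntervalNanos,
// the poller skips it and drains only pendingBlocks, avoiding blockLock
// entirely; otherwise it also takes currentBlock, accepting a possible
// conflict with the writer.
class BatchingTimePoller {
    static boolean shouldPollCurrentBlock(long blockOpenNanos, long nowNanos, long minWriteIntervalNanos) {
        return nowNanos - blockOpenNanos >= minWriteIntervalNanos;
    }
}
```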

  • writeBlocks: Currently, the primary role of writeBlocks is to update the startOffset of WindowCoreData. For
    writeBlocks, it is crucial to preserve the ordering of the blocks it contains.
    The current blockLock + pollBlockLock mechanism ensures the ordering of blocks within writeBlocks, so as a preliminary solution, converting writeBlocks into a blocking queue seems feasible.
    When writeBlocks is not empty, the ordering can indeed be guaranteed. However, when writeBlocks is empty, how can we obtain the minimum startOffset currently being written (or, equivalently, the maximum offset of the already written blocks)?
    Previously, thanks to the global blockLock, when writeBlocks was empty we could simply retrieve this information from currentBlock.
    Without the global lock, the preliminary solution is to acquire both blockLock and pollBlockLock whenever writeBlocks is empty.
    However, this evidently reintroduces concurrency issues between step 4 and steps 1 and 2 whenever writeBlocks is empty.

    How can we optimize this situation?
    There are two approaches: eventual consistency and strict consistency.
    eventual consistency
    When writeBlocks is empty, we can directly calculate the offset from the block that has just been written:
    offset = block.startOffset() + WALUtil.alignLargeByBlockSize(block.blockBatchSize()),
    and update the position accordingly.
    If that block is indeed the one with the largest startOffset, updating the offset this way poses no
    issue.
    However, if it is not, for example if block4 and block5 have already been written and block3 is now
    being written (out of order), then after block3 completes, writingBlocks becomes empty, and at this
    point WindowCoreData's startOffset might be incorrectly updated to offset1, when the correct update would be
    offset2.
    (figure omitted)
    When does the update to the latest offset occur, restoring consistency?
    It happens when the next IO operation writes a new block: the update is performed again at that
    time, ensuring that startOffset eventually reflects the most recent, accurate state of the
    written blocks.
    (figure omitted)
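
Under these assumptions, the eventual-consistency computation can be sketched as follows (alignUp stands in for WALUtil.alignLargeByBlockSize, whose exact implementation is assumed here):

```java
// Offset computed from the just-completed block alone. This is what
// makes the approach only eventually consistent: if an earlier block
// (block3) completes after later ones (block4/block5), the computed
// offset is stale until the next IO completion corrects it.
class EventualOffset {
    // Align size up to a multiple of blockSize (assumed behavior of
    // WALUtil.alignLargeByBlockSize).
    static long alignUp(long size, long blockSize) {
        return (size + blockSize - 1) / blockSize * blockSize;
    }

    static long completedOffset(long startOffset, long blockBatchSize, long blockSize) {
        return startOffset + alignUp(blockBatchSize, blockSize);
    }
}
```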
    strict consistency
    In the poll operation, keep track of the maximum offset that has been written to a block, denoted
    maxWriteOffset. When updating, if writeBlocks is empty, it indicates that the current blocks have been fully written.
    Therefore, the offset can simply be set to maxWriteOffset.
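
A sketch of the strict-consistency bookkeeping (names are illustrative):

```java
import java.util.concurrent.atomic.AtomicLong;

// The poll side records the highest end offset handed to IO; when
// writeBlocks drains empty, startOffset is advanced to that recorded
// value instead of being derived from the last completed block.
class StrictOffsetTracker {
    private final AtomicLong maxWriteOffset = new AtomicLong();

    // Called by the poller when a block is handed to the IO pool.
    void onBlockPolled(long blockEndOffset) {
        maxWriteOffset.accumulateAndGet(blockEndOffset, Math::max);
    }

    // Called on write completion when writeBlocks is empty.
    long startOffsetWhenEmpty() {
        return maxWriteOffset.get();
    }
}
```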

3. Locking Optimization:

Consider changing the current blocking lock acquisition to a try-lock mechanism. If the write thread successfully acquires the lock, it proceeds with its operation. If it fails, that means the poll thread is currently processing, so the write thread can simply return without further action.
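
The try-lock idea might be sketched as follows (illustrative, not the actual implementation):

```java
import java.util.concurrent.locks.ReentrantLock;

// A writer opportunistically tries to trigger a poll. If the poller
// already holds the lock, the writer returns immediately; the in-flight
// poll will pick up the pending block anyway.
class TryLockPoller {
    private final ReentrantLock pollBlockLock = new ReentrantLock();

    boolean tryPoll(Runnable pollAction) {
        if (!pollBlockLock.tryLock()) {
            return false; // another thread is already polling; skip
        }
        try {
            pollAction.run();
            return true;
        } finally {
            pollBlockLock.unlock();
        }
    }
}
```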

@Chillax-0v0 (Contributor) commented:
> After implementing lock separation, the next challenge is managing shared resources.

Currently, writingBlocks is also protected by the blockLock. How should it be handled?

CLFutureX added a commit to CLFutureX/automq that referenced this issue Jul 11, 2024
@CLFutureX (Contributor, Author) commented:

> • writeBlocks
>
> After implementing lock separation, the next challenge is managing shared resources.
>
> Currently, writingBlocks is also protected by the blockLock. How should it be handled?

Hey, I have updated the documentation. Please review the proposed handling scheme for writeBlocks mentioned above.

@Chillax-0v0 (Contributor) commented:

Great job, the explanation is very detailed.

Among the two approaches for writingBlocks (named "eventual consistency" and "strict consistency"), I prefer the second one, for two reasons:

  • WindowCoreData#startOffset will be used for validation before BlockWALService#trim. Specifically, it will check whether the "trim offset" exceeds the "start offset", and if it does, an error will be reported. If the first approach is used, the following scenario might occur:
    Both Block-1{start=10, end=20} and Block-2{start=20, end=30} have been written successfully. The upper layer calls trim(Block-2) (in other words, trim(20)). At this time, WindowCoreData#startOffset might equal 20 instead of 30, which would cause the trim to fail.
  • WindowCoreData#startOffset will also be used in AppendResult#CallbackResult#flushedOffset, which is exposed externally. I think it is not appropriate to change its meaning (from "strict consistency" to "eventual consistency").

CLFutureX added a commit to CLFutureX/automq that referenced this issue Jul 12, 2024
CLFutureX added a commit to CLFutureX/automq that referenced this issue Jul 17, 2024
@github-actions github-actions bot added the Stale label Oct 10, 2024