Tsavorite allocator - tighten the packing of pages #657
+624
−271
This is a change to the Tsavorite allocator base to make its page-filling algorithm (primarily for TsavoriteLog) deterministic.
We change the allocator to enqueue at the tail with the invariant that the first record of page (p+1) is guaranteed not to have fit in the empty space at the end of the previous page (p). This allows replication to independently replay AOF records and guarantee that they fit on the AOF created on the secondary in exactly the same way as they do on the primary, i.e., the two will be in perfect lockstep.
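To make the invariant concrete, here is a minimal single-threaded sketch in Python. It is not the Tsavorite implementation (which is C# and concurrent); `PAGE_SIZE`, `pack`, and `check_invariant` are hypothetical names used only for illustration. It shows that when a record moves to the next page only if it does not fit in the remaining space, the layout is a pure function of the record sizes and their order, so the primary and the secondary compute identical placements.

```python
from typing import List, Tuple

PAGE_SIZE = 64  # hypothetical page size in bytes, kept small for illustration


def pack(record_sizes: List[int], page_size: int = PAGE_SIZE) -> List[Tuple[int, int]]:
    """Assign each record a (page, offset) slot in enqueue order.

    A record starts a new page only when it does not fit in the space
    remaining on the current page.
    """
    placements = []
    page, offset = 0, 0
    for size in record_sizes:
        if size > page_size:
            raise ValueError("record larger than a page")
        if offset + size > page_size:
            page, offset = page + 1, 0
        placements.append((page, offset))
        offset += size
    return placements


def check_invariant(record_sizes: List[int],
                    placements: List[Tuple[int, int]],
                    page_size: int = PAGE_SIZE) -> bool:
    """Return True if the first record of every page p+1 could not have fit on page p."""
    for i in range(1, len(placements)):
        page, _ = placements[i]
        prev_page, prev_offset = placements[i - 1]
        if page != prev_page:
            free = page_size - (prev_offset + record_sizes[i - 1])
            if record_sizes[i] <= free:
                return False
    return True


sizes = [24, 24, 24, 8, 40, 16]
primary = pack(sizes)    # layout computed on the primary
replica = pack(sizes)    # the same records replayed on the secondary
print(primary == replica, check_invariant(sizes, primary))
```

Because the placement depends only on the ordered record sizes, replaying the same records on the secondary reproduces the same offsets without any correction step.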
Previously, two threads could race to append at the end of a page: the larger allocation won first and closed the page, and the smaller allocation then won on the new page, populating its first entry with a record that would have fit on the previous page. On the replica, this record would end up on the previous page, resulting in an offset mismatch.
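The mismatch can be illustrated with a small Python sketch. The numbers and names here (`PAGE_SIZE`, `USED`, `replay`, the `racy_primary` layout) are hypothetical and chosen only to reproduce the scenario described above; the primary's racy layout is written out by hand rather than produced by real threads.

```python
PAGE_SIZE = 64  # hypothetical page size in bytes
USED = 48       # hypothetical bytes already used on page 0, leaving 16 free


def replay(sizes, used=USED, page_size=PAGE_SIZE):
    """Tightly pack records in log order, starting part-way through page 0."""
    page, offset = 0, used
    placements = []
    for size in sizes:
        if offset + size > page_size:
            page, offset = page + 1, 0
        placements.append((page, offset))
        offset += size
    return placements


# Old behavior on the primary: the 24-byte allocation does not fit in the
# 16 free bytes and closes page 0, but the 8-byte allocation wins the first
# slot of page 1 even though it would have fit on page 0.
racy_primary = {"small": (1, 0), "large": (1, 8)}

# The replica replays the records in log order (small, then large) and
# packs them tightly, so the small record lands back on page 0.
replica = dict(zip(["small", "large"], replay([8, 24])))

print(racy_primary)
print(replica)
print("offset mismatch:", racy_primary != replica)
```

With the invariant enforced on the primary, the small record could not be the first entry of page 1, so the two layouts would agree.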
We already have a workaround in the replica replay logic to correct for this, but this PR makes that workaround unnecessary.
This PR also includes an unrelated fix to the Upsert logic from Ted: Fix InternalUpsert srcRecordInfo setting when found below ReadOnlyAddress