Tim-Brooks
Contributor

This change makes a number of minor optimizations to the recycler bytes
stream. The most important change is that it allows cached direct access
to the current page. This helps in most scenarios where a write does
not cross page boundaries.

Additionally, it enables future subclasses to implement custom
serialization directly to the page with minimal bounds checks.

Finally, it always creates the first page in the constructor, so the first
stream write no longer has to make a guaranteed expand call.
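
To illustrate the cached current page and the eager first page described above, here is a minimal sketch. It is not the actual Elasticsearch class: `PagedByteStreamSketch` and its helper methods are hypothetical, and plain `byte[]` pages stand in for recycled pages.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the real recycler bytes stream.
final class PagedByteStreamSketch {
    private final int pageSize;
    private final List<byte[]> pages = new ArrayList<>();
    private byte[] currentPage;     // cached reference to the page being written
    private int pageIndex = -1;     // no page entered yet
    private int currentPageOffset;

    PagedByteStreamSketch(int pageSize) {
        this.pageSize = pageSize;
        // Treat the (nonexistent) previous page as full, then allocate and
        // enter the first page so writes never see an empty stream.
        this.currentPageOffset = pageSize;
        ensureCapacity(1);
        nextPage();
    }

    void writeByte(byte b) {
        if (currentPageOffset == pageSize) {
            // Slow path: the current page is full.
            ensureCapacity(1);
            nextPage();
        }
        // Fast path: write straight into the cached page.
        currentPage[currentPageOffset++] = b;
    }

    // Only allocates pages; it never moves the write location.
    private void ensureCapacity(int bytesNeeded) {
        long required = position() + bytesNeeded;
        while ((long) pages.size() * pageSize < required) {
            pages.add(new byte[pageSize]);
        }
    }

    // Moves the write location to the start of the next allocated page.
    private void nextPage() {
        pageIndex++;
        currentPageOffset = 0;
        currentPage = pages.get(pageIndex);
    }

    long position() {
        return (long) pageIndex * pageSize + currentPageOffset;
    }

    int pageCount() {
        return pages.size();
    }

    byte get(long index) {
        return pages.get((int) (index / pageSize))[(int) (index % pageSize)];
    }
}
```

With a page size of 4, a freshly constructed stream already holds one page, and the fifth `writeByte` call is the first one that allocates.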

@Tim-Brooks Tim-Brooks requested a review from a team as a code owner September 19, 2025 22:47
@Tim-Brooks Tim-Brooks added >non-issue :Distributed Indexing/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. v9.2.0 labels Sep 19, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)

@elasticsearchmachine elasticsearchmachine added the Team:Distributed Indexing Meta label for Distributed Indexing team label Sep 19, 2025
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Thread)
@Fork(value = 1)
public class RecyclerBytesStreamBenchmark {
Contributor Author

Added a microbenchmark to allow testing the methods I was working on.

this.currentPageOffset = pageSize;
// Always start with a page
ensureCapacityFromPosition(1);
nextPage();
Contributor Author

Always start with an initial page so that the write methods no longer have to handle allocating the first page.

if (1 > (pageSize - currentPageOffset)) {
if (1 > pageSize - currentPageOffset) {
ensureCapacity(1);
nextPage();
Contributor Author

Changed because ensureCapacity now only allocates sufficient pages; it no longer adjusts the write location.

ensureCapacity(length);
currentPageOffset = this.currentPageOffset;
writeMultiplePages(b, offset, length);
} else {
Contributor Author

Specifically add a branch for the extremely common scenario where the bytes fit in the current page.
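
As a rough illustration of that branch, the sketch below copies with a single `System.arraycopy` when the bytes fit in the current page and only falls back to a page-by-page loop otherwise. The class `PagedBulkWriteSketch` and its `toArray` helper are hypothetical, and the page handling is simplified compared to the real stream.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch, not the real recycler bytes stream.
final class PagedBulkWriteSketch {
    private final int pageSize;
    private final List<byte[]> pages = new ArrayList<>();
    private byte[] currentPage;
    private int pageIndex;
    private int currentPageOffset;

    PagedBulkWriteSketch(int pageSize) {
        this.pageSize = pageSize;
        pages.add(new byte[pageSize]);
        currentPage = pages.get(0);
    }

    void writeBytes(byte[] b, int offset, int length) {
        Objects.checkFromIndexSize(offset, length, b.length);
        if (length <= pageSize - currentPageOffset) {
            // Common case: everything fits in the current page.
            System.arraycopy(b, offset, currentPage, currentPageOffset, length);
            currentPageOffset += length;
        } else {
            writeMultiplePages(b, offset, length);
        }
    }

    // Rare case: the write crosses at least one page boundary.
    private void writeMultiplePages(byte[] b, int offset, int length) {
        while (length > 0) {
            if (currentPageOffset == pageSize) {
                if (pageIndex + 1 == pages.size()) {
                    pages.add(new byte[pageSize]);
                }
                pageIndex++;
                currentPage = pages.get(pageIndex);
                currentPageOffset = 0;
            }
            int toCopy = Math.min(length, pageSize - currentPageOffset);
            System.arraycopy(b, offset, currentPage, currentPageOffset, toCopy);
            currentPageOffset += toCopy;
            offset += toCopy;
            length -= toCopy;
        }
    }

    // Flattens the written bytes for inspection.
    byte[] toArray() {
        int size = pageIndex * pageSize + currentPageOffset;
        byte[] out = new byte[size];
        for (int i = 0; i < size; i++) {
            out[i] = pages.get(i / pageSize)[i % pageSize];
        }
        return out;
    }
}
```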

}

@Override
public void writeVInt(int i) throws IOException {
Contributor Author

vInt is written very commonly (header for a bytes reference, map item count, etc.). Since it is fairly expensive to write, we create a specific variant that works for the recycler stream.
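
A page-aware variant could look roughly like the sketch below. It assumes the usual variable-length format (7 payload bits per byte, low-order group first, high bit set on every byte except the last, at most 5 bytes per int) and writes directly into the cached page when 5 bytes are guaranteed to fit. `PagedVIntSketch` and its helpers are hypothetical names, not the code in this PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the real recycler bytes stream.
final class PagedVIntSketch {
    private static final int MAX_VINT_BYTES = 5;

    private final int pageSize;
    private final List<byte[]> pages = new ArrayList<>();
    private byte[] currentPage;
    private int pageIndex;
    private int currentPageOffset;

    PagedVIntSketch(int pageSize) {
        this.pageSize = pageSize;
        pages.add(new byte[pageSize]);
        currentPage = pages.get(0);
    }

    void writeVInt(int i) {
        if (pageSize - currentPageOffset >= MAX_VINT_BYTES) {
            // Fast path: the longest possible encoding fits, so write into
            // the page directly with no per-byte bounds checks.
            int off = currentPageOffset;
            while ((i & ~0x7F) != 0) {
                currentPage[off++] = (byte) ((i & 0x7F) | 0x80);
                i >>>= 7;
            }
            currentPage[off++] = (byte) i;
            currentPageOffset = off;
        } else {
            // Slow path: the encoding may cross a page boundary.
            while ((i & ~0x7F) != 0) {
                writeByte((byte) ((i & 0x7F) | 0x80));
                i >>>= 7;
            }
            writeByte((byte) i);
        }
    }

    void writeByte(byte b) {
        if (currentPageOffset == pageSize) {
            pages.add(new byte[pageSize]);
            currentPage = pages.get(++pageIndex);
            currentPageOffset = 0;
        }
        currentPage[currentPageOffset++] = b;
    }

    // Flattens the written bytes for inspection.
    byte[] toArray() {
        int size = pageIndex * pageSize + currentPageOffset;
        byte[] out = new byte[size];
        for (int i = 0; i < size; i++) {
            out[i] = pages.get(i / pageSize)[i % pageSize];
        }
        return out;
    }
}
```

Checking for the worst-case length up front trades a slightly pessimistic test near the end of a page for a loop that needs no page checks in the common case.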

if (offsetInPage == 0) {
this.pageIndex = pageIndex - 1;
this.currentPageOffset = pageSize;
if (position > 0) {
Contributor Author

This method is the same except we need special handling for position 0 since the stream always has at least one page.
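
One way to read that special case, as a sketch under assumptions: a page-aligned position normally parks the write location at the end of the previous page, but position 0 has no previous page, so it must point at the start of the first page instead. `PagedSeekSketch` is a hypothetical class, and the real `seek` in this PR may differ in details such as bounds checks and capacity handling.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the real recycler bytes stream.
final class PagedSeekSketch {
    private final int pageSize;
    private final List<byte[]> pages = new ArrayList<>();
    private int pageIndex;
    private int currentPageOffset;

    PagedSeekSketch(int pageSize) {
        this.pageSize = pageSize;
        pages.add(new byte[pageSize]); // the stream always has at least one page
    }

    void seek(long position) {
        if (position < 0) {
            throw new IllegalArgumentException("position must be non-negative");
        }
        // Allocate enough pages to cover the target position.
        while ((long) pages.size() * pageSize < position) {
            pages.add(new byte[pageSize]);
        }
        int index = (int) (position / pageSize);
        int offsetInPage = (int) (position % pageSize);
        if (offsetInPage == 0 && position > 0) {
            // Page-aligned: sit at the end of the previous page so the next
            // write is the one that advances to a new page.
            pageIndex = index - 1;
            currentPageOffset = pageSize;
        } else {
            // Includes position 0, which maps to the start of the first page.
            pageIndex = index;
            currentPageOffset = offsetInPage;
        }
    }

    long position() {
        return (long) pageIndex * pageSize + currentPageOffset;
    }

    int pageIndex() {
        return pageIndex;
    }

    int currentPageOffset() {
        return currentPageOffset;
    }

    int pageCount() {
        return pages.size();
    }
}
```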

out.writeByte(b);
assertEquals(b, out.bytes().get(PageCacheRecycler.BYTE_PAGE_SIZE));
}

Contributor Author

More test coverage.
