Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ToReadOnlySequence() to ByteString #7491

Open
wants to merge 6 commits into
base: dev
Choose a base branch
from

Conversation

to11mtm
Copy link
Member

@to11mtm to11mtm commented Jan 30, 2025

Changes

Adds public ReadOnlySequence<byte> ToReadOnlySequence() to ByteString.

Checklist

For significant changes, please ensure that the following have been completed (delete if not relevant):

Latest dev Benchmarks

N/A

This PR's Benchmarks

N/A

Notes:

ReadOnlySequence is a bit of a tradeoff; On one hand, it is a larger struct (two objects and two ints) and does some fun typechecks against said objects which makes it not as fast as ReadOnlySpan.

On the other hand, for non-compacted ByteString, rather than requiring a full copy+alloc of new array, it's essentially a linked-list of the segments (each segment being a long of the total length of previous segments, a Memory<T> and the next ReadOnlySequenceSegment).

(For already 'compacted' ByteStrings, no allocation is required since ReadOnlySequence has logic to to avoid allocation for a single ReadOnlyMemory being passed in.)

The other advantage however, is that ReadOnlySequence is *notaref struct`.

This makes it easier to do certain user code (i.e. not having to worry about ref struct rules as much, e.x. including in a class being passed as a message.

As far as ecosystem support, I know MessagePack-CSharp supports ReadOnlySequence<byte> for input, System.Text.Json should eventually include it based on dotnet/runtime#95187 / dotnet/runtime#67337, and Protobuf-Net appears to have support for it as well. Probably others too but those are the first that came to mind for serialization.

Comment on lines 50 to 55
if (bArr.Length == 2)
{
item = bArr[1];
last = new ByteStringReadOnlySequenceSegment(item, bs.Count-item.Count);
first.Next = last;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH this might be premature optimization....

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is, please remove

Comment on lines 58 to 68
var prior = first;
for (int i = 1; i < bArr.Length; i++)
{
item = bArr[i];
var curr = new ByteStringReadOnlySequenceSegment(item, prior.RunningIndex+prior.Memory.Length);
prior.Next = curr;
prior = curr;
}
last = prior;
}
return new ReadOnlySequence<byte>(first,0,last,last.Memory.Length);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDK if I like this less or more than just 'going backwards on the array' and just building the Segments from last to first.

Happy for feedback (I suppose main thing being the pattern here vs setting next explicitly in ctor could impact JIT optimizations... but probably overthinking things a bit.)

@@ -28,6 +29,46 @@ namespace Akka.IO
[DebuggerDisplay("(Count = {_count}, Buffers = {_buffers})")]
public sealed class ByteString : IEquatable<ByteString>, IEnumerable<byte>
{
public class ByteStringReadOnlySequenceSegment : ReadOnlySequenceSegment<byte>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure where to put in class (or maybe even move to partial/etc?)

It's a nested class for the sake of being able to directly access _buffers in the create. OTOH if we 'flipped' the construction (see other comment) we wouldn't need to nest it and could just build in .ToReadOnlySequence().

... Either way this should be sealed though

@to11mtm to11mtm marked this pull request as ready for review February 1, 2025 18:25
@to11mtm
Copy link
Member Author

to11mtm commented Feb 1, 2025

Needs API review

@to11mtm
Copy link
Member Author

to11mtm commented Feb 1, 2025

All observed failures in tests appear to be races since they wouldn't be using these new APIs.

Copy link
Member

@Aaronontheweb Aaronontheweb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are probably going to restructure ByteString to implement System.Memory directly, so anything that's more complicated than

public ReadOnlySequence<byte> ToReadOnlySequence()
        {
            if (_count == 0)
            {
                return ReadOnlySequence<byte>.Empty;
            }
            else if (_buffers.Length == 1)
            {
                //Happy path, we can just pass ArraySegment here and avoid alloc.
                return new ReadOnlySequence<byte>(_buffers[0]);
            }
            else
            {
                return ByteStringReadOnlySequenceSegment.CreateSequence(this);
            }
        }

needs to be deleted.

@to11mtm
Copy link
Member Author

to11mtm commented Feb 5, 2025

We are probably going to restructure ByteString to implement System.Memory directly, so anything that's more complicated than

needs to be deleted.

Did my best to simplify the creation path; If there's specific things to handle please LMK.

@Aaronontheweb
Copy link
Member

Thanks @to11mtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants