zero-copy ByteArray to Buffer for reading purpose #1485

Open
Chuckame opened this issue May 29, 2024 · 7 comments

Comments

@Chuckame

Chuckame commented May 29, 2024

Context

I'm currently the maintainer of avro4k, and I'm planning to use okio to get rid of Java streams, hoping to go multiplatform one day.

A lot of apps, libs, and frameworks deal only with ByteArray (I'm not saying it's a good idea, though). On our side, in the Avro world, and especially in the messaging world (Kafka, RabbitMQ, ...), everything uses ByteArray, and we have no room to switch to ByteBuffer or even okio's Buffer.

We can easily encode data to a Buffer and then read the content into a ByteArray.

But for decoding from a ByteArray with okio, the only choice is to first copy the content into a Buffer and then decode, which is really bad for performance.
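For illustration, here is a minimal sketch of the round trip described above, using okio's existing `Buffer` API. The payload layout (a little-endian long followed by UTF-8 text) and the `encode`/`decode` names are assumptions made up for this example, not avro4k code.

```kotlin
import okio.Buffer

// Hypothetical payload layout for illustration: a little-endian long, then UTF-8 text.
fun encode(id: Long, name: String): ByteArray {
    val buffer = Buffer()
    buffer.writeLongLe(id)
    buffer.writeUtf8(name)
    // Copies the buffered bytes out into a freshly allocated array.
    return buffer.readByteArray()
}

fun decode(bytes: ByteArray): Pair<Long, String> {
    // This is the copy the issue is about: every byte of `bytes`
    // is copied into the Buffer's segments before anything is decoded.
    val buffer = Buffer().write(bytes)
    val id = buffer.readLongLe()
    val name = buffer.readUtf8() // reads all remaining bytes as UTF-8
    return id to name
}
```

The `Buffer().write(bytes)` call in `decode` is the step a zero-copy wrapper would avoid.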

By the way, we are not using Buffer directly but BufferedSink and BufferedSource, for their really great encoding/decoding API. Sadly, those interfaces are sealed.

Proposal

A constructor of BufferedSource that takes a ByteArray, to allow reading "complex" values (readLongLe, readUtf8, ...) directly over a ByteArray.

Non goal

Backing a Buffer with a ByteArray: #1360

@swankjesse
Collaborator

@Chuckame
Author

How do I wrap the array using the unsafe cursor? I only see a transfer method, which still writes the bytes to the buffer.

Maybe by setting the bytes to data?

@swankjesse
Collaborator

You’d use this API to get a ByteArray that you can write bytes into. (That is only useful if the APIs you’re interacting with let you provide the target byte arrays.)
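As a rough sketch of that write-side usage on the JVM, assuming the producing API is a `java.io.InputStream` (which lets the caller supply the target array), the cursor can hand the stream a segment's backing array directly. The `readChunkInto` name is hypothetical.

```kotlin
import okio.Buffer
import java.io.InputStream

// Hypothetical helper: lets `input` write straight into the Buffer's own storage.
// Returns the number of bytes read, or -1 at end of stream.
fun readChunkInto(buffer: Buffer, input: InputStream): Int {
    buffer.readAndWriteUnsafe().use { cursor ->
        val oldSize = buffer.size
        // Grow the buffer by at least one byte; the cursor then points at the new space.
        cursor.expandBuffer(1)
        val read = input.read(cursor.data!!, cursor.start, cursor.end - cursor.start)
        // Shrink back so the buffer only contains bytes that were actually written.
        cursor.resizeBuffer(oldSize + maxOf(read, 0))
        return read
    }
}
```

This avoids an intermediate array when filling a Buffer, but it does not help when the bytes already live in a ByteArray owned by someone else.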

@Chuckame
Author

Ah OK, that already covers the encoding part.

I'm mainly talking about decoding data from a ByteArray using BufferedSource, without copying the ByteArray.

@Chuckame changed the title from "zero-copy ByteArray to Buffer" to "zero-copy ByteArray to Buffer for reading purpose" on May 30, 2024

@Chuckame
Author

I would like something like BufferedSource.wrap(bytes)

@Chuckame
Author

Chuckame commented Jul 2, 2024

@swankjesse do you have a solution for reading from a ByteArray without copying?

@JakeWharton
Collaborator

The problem is that the "Buffered" in BufferedSink comes from its use of Buffer to implement intermediate storage for higher-level APIs than a raw Sink. And that Buffer is exposed in the API, so you can't "just" wrap a ByteArray with an index pointer or something.

Now looking at Buffer, its backing ByteArrays are held in Segments. We could probably implement something like Buffer.unsafeCreateFromByteArray (or maybe on UnsafeCursor so it's not on Buffer) which took a ByteArray and created a single Segment with it whose shared was set to true to prevent it from going into the pool and owner set to false to prevent writing to it.

So it seems technically possible, and has the same relative guarantees as UnsafeCursor usage. On the other hand, there's not a very high demand for this.
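Until something like that exists, the "index pointer" idea can only live outside okio's types. A minimal sketch, assuming a standalone helper is acceptable: `ByteArrayReader` below is hypothetical, is not part of okio, and is not a BufferedSource, so it cannot be passed to code that expects one.

```kotlin
// Hypothetical helper, not part of okio: reads values straight from the caller's
// array by advancing an index, so the input is never copied into a Buffer.
class ByteArrayReader(private val bytes: ByteArray, private var pos: Int = 0) {
    val remaining: Int get() = bytes.size - pos

    fun readLongLe(): Long {
        require(remaining >= 8) { "need 8 bytes, have $remaining" }
        var result = 0L
        for (i in 0 until 8) {
            // Little-endian: the byte at the lowest index is the least significant.
            result = result or ((bytes[pos + i].toLong() and 0xffL) shl (8 * i))
        }
        pos += 8
        return result
    }

    fun readUtf8(byteCount: Int): String {
        require(byteCount in 0..remaining) { "need $byteCount bytes, have $remaining" }
        val text = bytes.decodeToString(pos, pos + byteCount)
        pos += byteCount
        return text
    }
}
```

The cost of this approach is reimplementing each read method by hand, which is exactly what reusing BufferedSource would avoid.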
