Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use *mut u8 to write FromHex bytes #123

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

tankyleo
Copy link
Contributor

Please see https://github.com/tankyleo/hex-benchmarks for benchmark code.

Partially addresses #115

@tankyleo
Copy link
Contributor Author

tankyleo commented Oct 19, 2024

@RCasatta I try to address partially the issue you raised here, let me know what you think.

I am aware this is a lot of unsafe code that you might not be comfortable with. Happy to take a less aggressive approach.

@tcharding
Copy link
Member

Any chance you could explain why this is faster, IIUC you are replacing a call to collect with a manual implementation - that is a pretty low level thing to need to do, I would have thought the Rust language could handle that.

@tankyleo
Copy link
Contributor Author

Thank you for your question.

As far as I understand:

For a T: ExactSizeIterator, unsafe code "must not rely on the correctness of Iterator::size_hint": https://doc.rust-lang.org/stable/std/iter/trait.ExactSizeIterator.html

So while the collect call does allocate the total memory used in a single call, the hot loop that writes each element keeps checking that the vector has enough capacity to fit one more element:

https://github.com/rust-lang/rust/blob/fb32dd41ed278e2d691900248d94aa9bc4409f99/library/alloc/src/vec/mod.rs#L3510

Further below, you can see the implementation for T: TrustedLen:

https://github.com/rust-lang/rust/blob/fb32dd41ed278e2d691900248d94aa9bc4409f99/library/alloc/src/vec/mod.rs#L3538

The hot loop also bumps the length on each element write. See below:

@tankyleo
Copy link
Contributor Author

Does a u8 need to be "dropped" ?

At the moment, if we return an InvalidCharError, we drop a vector that has zero length, but we might have written a few u8's.

My guess is u8 does not need to be "dropped", so the current behavior there is ok.

@RCasatta
Copy link

Thanks @tankyleo, the performance achievement looks great! In particular in #115 most of the time is spent in decoding hex txids, thus it would gain the 2x speedup of fixed size array, Also unsafe code looks ok to me, I really would like this.

That said, I am no expert in unsafe code, and given the name conservative in the crate we should be really confident on this.

I think reasons explained in @tankyleo commente here should be added in code comments.

I would like benches in https://github.com/tankyleo/hex-benchmarks to be added in this crate one day (as a sub-crate)

@tcharding
Copy link
Member

Thanks, your function call chain looks correct. Its a bit sad that collect on a Vec does not seem to take ExactSizeIterator into consideration. Iterators are supposed to be pure magic, if they don't collect fast then they are not that magic.

@tcharding
Copy link
Member

Could maybe add a comment to save the next guy wondering why we implement collect manually.

Use your own words, but if you want you could use

        // Calling `collect` on `self` is slower than this manual implementation because Rust does not
        // seem to utilize the fact that `HexToBytesIter` implements `ExactSizeIterator`.

@tankyleo
Copy link
Contributor Author

Rebase: add a comment explaining why we implement collect manually.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants