Add inline to some small functions used in sse impl
This speeds up the criterion benchmarks by almost 2x

I believe this is needed because e.g. `Bytes::find` is inlined and calls `find`
generically, which in turn calls the PackedCompareControl methods. The code
calling those methods is therefore inlined into the calling crate, but the
implementations of PackedCompareControl are not accessible to the calling
crate, so they end up as actual function calls. These functions are _super_
simple, and inlining them helps a LOT, so adding `#[inline]` to them, which
makes their implementations available to calling crates, has a huge effect.
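
The situation described above can be sketched in a reduced form. This is only an illustration, not the crate's actual code: `u128` stands in for `__m128i` so the example builds on any target, and `Bytes::new` and the scalar `find` body are hypothetical simplifications. The point is the shape: a generic `find` gets instantiated in the calling crate, while the non-generic trait methods need `#[inline]` for their bodies to be available there (assuming no LTO).

```rust
// Hypothetical stand-in for the trait in src/simd.rs; the real one returns
// __m128i needles. u128 is used here so the sketch builds on any target.
pub trait PackedCompareControl {
    fn needle(&self) -> u128;
    fn needle_len(&self) -> i32;
}

pub struct Bytes {
    needle: u128,
    needle_len: i32,
}

impl Bytes {
    // Hypothetical constructor: packs up to 16 needle bytes into one value.
    pub fn new(bytes: &[u8]) -> Self {
        assert!(bytes.len() <= 16, "at most 16 needle bytes fit");
        let mut packed = [0u8; 16];
        packed[..bytes.len()].copy_from_slice(bytes);
        Bytes {
            needle: u128::from_le_bytes(packed),
            needle_len: bytes.len() as i32,
        }
    }
}

impl<'b> PackedCompareControl for &'b Bytes {
    // Not generic over types, so without #[inline] a downstream crate
    // would typically only see a symbol to call, not a body to inline.
    #[inline]
    fn needle(&self) -> u128 {
        self.needle
    }

    #[inline]
    fn needle_len(&self) -> i32 {
        self.needle_len
    }
}

// Generic over C, so this is instantiated in whichever crate calls it.
// The scalar loop is a placeholder for the real SSE comparison.
#[inline]
pub fn find<C: PackedCompareControl>(control: C, haystack: &[u8]) -> Option<usize> {
    let needle = control.needle().to_le_bytes();
    let len = control.needle_len() as usize;
    haystack.iter().position(|b| needle[..len].contains(b))
}
```

Calling `find(&Bytes::new(b"<>&"), haystack)` from another crate would instantiate `find` there, and the two accessor calls are the ones this commit makes inlinable.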

This only showed up after moving to criterion because the nightly benchmarks
were previously implemented in the library crate itself, so these functions
were already eligible for inlining. The criterion results were actually more
accurate to what callers of the crate would see!
Dr-Emann committed Oct 14, 2023
1 parent 7800819 commit c8339a2
Showing 1 changed file with 6 additions and 0 deletions.
6 changes: 6 additions & 0 deletions src/simd.rs
@@ -252,9 +252,12 @@ impl Bytes {
 }

 impl<'b> PackedCompareControl for &'b Bytes {
+    #[inline]
     fn needle(&self) -> __m128i {
         self.needle
     }
+
+    #[inline]
     fn needle_len(&self) -> i32 {
         self.needle_len
     }
@@ -312,9 +315,12 @@ impl<'a> ByteSubstring<'a> {
 }

 impl<'a, 'b> PackedCompareControl for &'b ByteSubstring<'a> {
+    #[inline]
     fn needle(&self) -> __m128i {
         self.needle
     }
+
+    #[inline]
     fn needle_len(&self) -> i32 {
         self.needle_len
     }