ByteSubstring: add an option to used an owned needle (not borrowed) #50

tsnoam · 2021-09-29T09:06:41Z

This PR is a suggestion. I have done a minimal implementation, just so we have something to discuss.

The problem I was facing was that I couldn't use a borrowed ByteSubstring. So, I've modified jetscii's code to accept a Cow object.

Let me know what you think and if there're are any more modifications you would like to see as part of this PR.

tsnoam · 2022-06-28T12:21:08Z

@shepmaster
kind ping on that.

i would be happy to rebase on current main but I would like to know if you'd be interested in this change

shepmaster · 2022-07-06T17:22:09Z

Oh, wow, I'm sorry that this totally dropped off my radar!

I think the idea has merit, but I think I would suggest a different implementation.

Right now, we have crate::ByteSubstring which delegates to crate::fallback::ByteSubstring and crate::simd::ByteSubstring. In order to support your goal, I'd suggest:

refactoring commit
- Stop storing the needle in {fallback,simd}::ByteSubstring, instead passing it into find.
- Start storing the needle in crate::ByteSubstring instead.
feature commit
- Add a new crate::OwnedByteSubstring that holds a Vec<u8>
- This calls into {fallback,simd}::ByteSubstring::find.

This way, you can choose if you need ownership or just a reference.

Does that make sense and seem plausible?

shepmaster · 2022-07-07T00:28:30Z

After musing on it for a while I wonder if we could change ByteSubstring to be generic with T: AsRef<[u8]>.

tsnoam · 2022-07-14T05:15:55Z

Hi @shepmaster
Sorry for the late response. Yes, I like your suggestion. I can contribute these changes and obviously base it on the latest "main" branch.

shepmaster · 2022-07-14T13:10:44Z

Sorry for the late response

i think you’ve got a few months longer before you need to start apologizing for anything like that 😅

tsnoam · 2022-08-25T14:35:10Z

@shepmaster I've updated the PR in accordance with our discussion:

ByteSubstring takes T: AsRef<[u8]
I also added what I thought was an appropriate way to do the same thing for Substring, without breaking ergonomics or backward compat

let me know what you think.

tsnoam · 2022-08-28T06:22:10Z

I just realized that I only compiled it on my macos m1. So only the fallback code was tested 🤦‍♂️

I will fix and update here...

tsnoam · 2022-08-28T09:18:41Z

Ok. Fixed.

@shepmaster would really love to see your comments.

tsnoam · 2022-09-05T09:59:23Z

@shepmaster kind ping on that

tsnoam · 2022-09-12T06:45:45Z

@shepmaster kind ping :)

shepmaster · 2022-09-13T17:59:56Z

I did one force-push to tidy up some formatting things, and a second force-push to rebase on the current main branch (which has some soundness fixes).

Overall, things seem reasonable, but I'm a bit hesitant about the dispatch macro vs enum solution. The new code has to check the enum discriminant each call (e.g. to find). The old code would use the feature test, which I believe would do some trickery to perform the test just once and then effectively set a global value that could be checked in the future. I'll need to do some performance tests.

Have you tried this code locally with a String? Does it work for your original case?

This allows to either own (consume) the needle or keep a reference to it.

This gains two purposes: 1. Fix ownership issue of trying to own needle `T` by both simd & fallback modules. 2. Simplifies the runtime checks done by the code when running in `maybe` mode.

tsnoam · 2022-09-14T12:34:08Z

Hi @shepmaster

Thanks for the review.

ACK for the formatting and rebasing over master.
Basically, I wanted to hear your initial feedback about the code before working harder to make things perfect ;)
However, I did not realize that unitests should be executed with all features turned on (I did that and fixed the Substring code).
I did notice that Substring might have backward-incompatible type definition. I'm not 100% sure that this is the way to go, but I think it is quite clean and obvious, will be glad to know what you think of it.
Regarding the dispatch macro: there is no indication in rust documentation that this is cached in any way ( https://doc.rust-lang.org/std/macro.is_x86_feature_detected.html ), so I decided to check it out - first by benchmarks (I couldn't find any significant difference in either direction) but then I disassembled the compiled executable (using objdump -d). The new approach was shorter in about 50% for the ByteSubstring::find code.

Bottom line, I fixed the code a little more (now for all features), benchmarks are running properly - and obviously the new code was forced pushed into the branch.

shepmaster · 2022-09-14T14:41:54Z

I don't understand exactly how, but the benchmarks all show a marked decrease in speed with this branch. That's even including the benchmarks for the standard library versions, which obviously shouldn't change!

 name                                         before ns/iter         after ns/iter          diff ns/iter  diff %  speedup 
 bench::space_ascii_chars                     356,001 (14727 MB/s)   362,121 (14478 MB/s)          6,120   1.72%   x 0.98 
 bench::space_stdlib_find_char                279,225 (18776 MB/s)   278,723 (18810 MB/s)           -502  -0.18%   x 1.00 
 bench::space_stdlib_find_char_set            2,608,215 (2010 MB/s)  2,626,160 (1996 MB/s)        17,945   0.69%   x 0.99 
 bench::space_stdlib_find_closure             2,318,368 (2261 MB/s)  2,390,488 (2193 MB/s)        72,120   3.11%   x 0.97 
 bench::space_stdlib_find_string              2,153,673 (2434 MB/s)  2,268,436 (2311 MB/s)       114,763   5.33%   x 0.95 
 bench::space_stdlib_iterator_position        1,160,877 (4516 MB/s)  1,198,881 (4373 MB/s)        38,004   3.27%   x 0.97 
 bench::substring_stdlib_find                 463,832 (11303 MB/s)   469,920 (11156 MB/s)          6,088   1.31%   x 0.99 
 bench::substring_with_created_searcher       358,074 (14641 MB/s)   361,645 (14497 MB/s)          3,571   1.00%   x 0.99 
 bench::xml_delim_3_ascii_chars               361,004 (14523 MB/s)   362,989 (14443 MB/s)          1,985   0.55%   x 0.99 
 bench::xml_delim_3_stdlib_find_char_closure  3,516,929 (1490 MB/s)  3,459,551 (1515 MB/s)       -57,378  -1.63%   x 1.02 
 bench::xml_delim_3_stdlib_find_char_set      2,612,870 (2006 MB/s)  3,456,608 (1516 MB/s)       843,738  32.29%   x 0.76 
 bench::xml_delim_3_stdlib_iterator_position  1,131,135 (4635 MB/s)  1,155,312 (4538 MB/s)        24,177   2.14%   x 0.98 
 bench::xml_delim_5_ascii_chars               360,298 (14551 MB/s)   360,735 (14533 MB/s)            437   0.12%   x 1.00 
 bench::xml_delim_5_stdlib_find_char_closure  2,715,305 (1930 MB/s)  3,458,322 (1516 MB/s)       743,017  27.36%   x 0.79 
 bench::xml_delim_5_stdlib_find_char_set      3,449,795 (1519 MB/s)  3,473,528 (1509 MB/s)        23,733   0.69%   x 0.99 
 bench::xml_delim_5_stdlib_iterator_position  1,138,696 (4604 MB/s)  1,158,930 (4523 MB/s)        20,234   1.78%   x 0.98

shepmaster · 2022-09-14T14:44:27Z

might have backward-incompatible type definition

Yeah, I think that this change will warrant a new semver-incompatible version to be released; that's not a big deal IMO.

there is no indication in rust documentation that this is cached in any way

Yeah, I read through that before posting my comment (that's what led to the "I believe"). It must have been made up in my head.

tsnoam · 2022-09-14T15:08:22Z

I don't understand exactly how, but the benchmarks all show a marked decrease in speed with this branch. That's even including the benchmarks for the standard library versions, which obviously shouldn't change!

That is to be expected. Benchmarks will usually show differences of +-3% depending on their nature and whatever the benchmarking server is doing at the moment.

What I usually did when I was running benchmarks using nightly-rust was to externally run them again and again and look at the avg/max/min of each individual benchmark.

If you would like, it is possible to migrate the benchmarks to use criterion, which saves you the hassle of what described above. There is also a nice tool called critcmp which helps in comparing the results of two different criterion executions.

tsnoam · 2022-09-14T15:09:00Z

oh, and criterion also runs with stable rust, which is a bonus...

tsnoam · 2022-09-14T15:12:59Z

sorry, i'm looking again at the stdlib difference, and it is weird. too high of a diff.

let me look into it a little more.

tsnoam force-pushed the bytesubstring_owned branch from e9cb819 to 19b5a06 Compare September 29, 2021 09:07

tsnoam force-pushed the bytesubstring_owned branch from 19b5a06 to eb07c4b Compare August 25, 2022 14:31

shepmaster force-pushed the bytesubstring_owned branch 2 times, most recently from 4c42031 to 1248666 Compare September 13, 2022 17:55

shepmaster and others added 6 commits September 14, 2022 13:08

Avoid unused #[macro_use] import warning

9fa2a15

ByteSubstring now takes the needle as generic type T instead of &[u8]

cc0040a

This allows to either own (consume) the needle or keep a reference to it.

Substring now takes the needle as generic type T instead of &str

b416d63

This allows to either own (consume) the needle or keep a reference to it.

Public APIs ByteSubstring & Bytes are now enums

18d0e09

This gains two purposes: 1. Fix ownership issue of trying to own needle `T` by both simd & fallback modules. 2. Simplifies the runtime checks done by the code when running in `maybe` mode.

Correct typos in the docs

c4ab3a2

Add convenience types for Substring

a420c3f

tsnoam force-pushed the bytesubstring_owned branch from 1248666 to a420c3f Compare September 14, 2022 10:08

build.rs: rerun if build.rs, target-arch or target-feature changed

aca26bf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ByteSubstring: add an option to used an owned needle (not borrowed) #50

ByteSubstring: add an option to used an owned needle (not borrowed) #50

tsnoam commented Sep 29, 2021

tsnoam commented Jun 28, 2022

shepmaster commented Jul 6, 2022 •

edited

Loading

shepmaster commented Jul 7, 2022

tsnoam commented Jul 14, 2022

shepmaster commented Jul 14, 2022

tsnoam commented Aug 25, 2022

tsnoam commented Aug 28, 2022 •

edited

Loading

tsnoam commented Aug 28, 2022

tsnoam commented Sep 5, 2022

tsnoam commented Sep 12, 2022

shepmaster commented Sep 13, 2022

tsnoam commented Sep 14, 2022

shepmaster commented Sep 14, 2022

shepmaster commented Sep 14, 2022

tsnoam commented Sep 14, 2022

tsnoam commented Sep 14, 2022

tsnoam commented Sep 14, 2022

ByteSubstring: add an option to used an owned needle (not borrowed) #50

Are you sure you want to change the base?

ByteSubstring: add an option to used an owned needle (not borrowed) #50

Conversation

tsnoam commented Sep 29, 2021

tsnoam commented Jun 28, 2022

shepmaster commented Jul 6, 2022 • edited Loading

shepmaster commented Jul 7, 2022

tsnoam commented Jul 14, 2022

shepmaster commented Jul 14, 2022

tsnoam commented Aug 25, 2022

tsnoam commented Aug 28, 2022 • edited Loading

tsnoam commented Aug 28, 2022

tsnoam commented Sep 5, 2022

tsnoam commented Sep 12, 2022

shepmaster commented Sep 13, 2022

tsnoam commented Sep 14, 2022

shepmaster commented Sep 14, 2022

shepmaster commented Sep 14, 2022

tsnoam commented Sep 14, 2022

tsnoam commented Sep 14, 2022

tsnoam commented Sep 14, 2022

shepmaster commented Jul 6, 2022 •

edited

Loading

tsnoam commented Aug 28, 2022 •

edited

Loading