reduce `single_char_pattern` to only lint on ascii chars #11852
r? @Alexendoo (rustbot has picked a reviewer for you, use r? to override)
Maybe it's time to retire this lint? Which form is faster seems unpredictable and subject to change across versions. A better avenue would be opening issues in rustc for any case where one is slower than the other.
clippy_lints/src/methods/mod.rs (outdated)
/// Performing these methods using a `char` can be faster than
/// using a `str` because it needs one less indirection.
This is not true in the current implementation. If you want to be implementation agnostic, you might as well write the opposite:
- /// Performing these methods using a `char` can be faster than
- /// using a `str` because it needs one less indirection.
+ /// Performing these methods using a `str` can be faster than
+ /// using a `char` because it needs one less conversion.
It apparently depends on a lot of factors. The `char`-based version stores the UTF-8 data inline, whereas the `&str` one stores a ref to the caller-supplied string slice instead, so if the Searcher instantiation is const, it gets by with one less pointer.
Of course, e.g. the fastest `starts_with` for an ASCII character is `haystack.as_bytes().get(0) == Some(&(c as u8))`, dropping below a nanosecond in my benchmarks.
At least I think we should move it to
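To make the comparison above concrete, here is a minimal sketch of the byte-level check next to the `char` and `&str` pattern forms. The helper name `starts_with_ascii` is made up for illustration and is not part of clippy or std. The sketch only checks that the three forms agree for an ASCII needle; it makes no claim about which one is faster.

```rust
/// Hypothetical helper: byte-level prefix check, only valid for an ASCII `c`.
/// An ASCII byte never occurs inside a multibyte UTF-8 sequence, so comparing
/// the first byte is equivalent to comparing the first char.
fn starts_with_ascii(haystack: &str, c: char) -> bool {
    debug_assert!(c.is_ascii());
    haystack.as_bytes().first() == Some(&(c as u8))
}

fn main() {
    // A char is at most 4 bytes in UTF-8, which is why its encoded form
    // can be kept inline instead of behind a reference.
    let mut buf = [0u8; 4];
    assert_eq!('x'.encode_utf8(&mut buf).len(), 1);
    assert_eq!('ß'.encode_utf8(&mut buf).len(), 2);

    // All three forms agree for an ASCII needle.
    for haystack in ["xyz", "", "ßx", "yx"] {
        assert_eq!(starts_with_ascii(haystack, 'x'), haystack.starts_with('x'));
        assert_eq!(haystack.starts_with('x'), haystack.starts_with("x"));
    }
}
```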
Force-pushed from d69a594 to 3d98b67.
☔ The latest upstream changes (presumably #12030) made this pull request unmergeable. Please resolve the merge conflicts.
@Alexendoo should I rebase or close?
Sorry for the delay. Given the decision to move it to pedantic, do we still want to make the multibyte change? If we're dropping the perf aspect of the lint to make it more of a stylistic one, I think it would make sense to still lint multibyte characters and modify the lint description to be less perf-focused.
The problem I see with that is that for multibyte inputs, using a char may actually hurt performance. So yes, I believe we should reduce the lint scope even as we move it to pedantic.
I'll take another look during the next week.
Ignoring multibyte chars as a known issue sounds reasonable to avoid a perf regression if it's still there; ideally we'd have a rustc issue to link to. I still think the description should be changed though, the perf angle doesn't hold up to me.
Force-pushed from 3d98b67 to 1a69f84.
Force-pushed from 1a69f84 to 54de78a.
@Alexendoo / @xFrednet I rebased the implementation and updated the docs. r?
Looks good to me. Thank you for the update :) @bors r+
☀️ Test successful - checks-action_dev_test, checks-action_remark_test, checks-action_test
This should mostly fix the `single_char_pattern` lint, because with a single byte, the optimizer will usually see through the char-to-string expansion and single loop iteration. This fixes #11675 and #8111.

Update: As per the meeting on November 28th, 2023, we voted to also downgrade the lint to pedantic.

changelog: downgrade [`single_char_pattern`] to `pedantic`
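
As a rough illustration of the narrowed scope described above, the sketch below uses a hypothetical predicate, `is_single_ascii_char`, to stand in for the condition under which a string-literal pattern would still be flagged. It is an assumption about the intent of the change, not the actual clippy implementation. It also checks that the `char` and `&str` forms return the same results, so the rewrite the lint suggests does not change behavior.

```rust
/// Hypothetical predicate (not clippy's code): true only when the pattern
/// consists of exactly one character and that character is ASCII.
fn is_single_ascii_char(pattern: &str) -> bool {
    let mut chars = pattern.chars();
    matches!((chars.next(), chars.next()), (Some(c), None) if c.is_ascii())
}

fn main() {
    let text = "a,b,ß,c";

    // In scope: a single ASCII character. Both pattern forms split identically.
    assert!(is_single_ascii_char(","));
    let by_str: Vec<&str> = text.split(",").collect();
    let by_char: Vec<&str> = text.split(',').collect();
    assert_eq!(by_str, by_char);

    // Out of scope: a single multibyte character (2 bytes in UTF-8).
    assert!(!is_single_ascii_char("ß"));
    assert_eq!('ß'.len_utf8(), 2);
    assert_eq!(text.find("ß"), text.find('ß'));

    // Out of scope: more than one character, or none at all.
    assert!(!is_single_ascii_char("ab"));
    assert!(!is_single_ascii_char(""));
}
```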