Run cargo update #6964
CodSpeed Performance Report: merging #6964 will degrade performance by 2.97%. |
Huh, could that possibly be correct? I'll have to run these manually. |
PR Check Results: Ecosystem: ✅ ecosystem check detected no changes. |
The reports have been very accurate thus far. Maybe force push again to trigger a new run. Otherwise, I recommend upgrading in smaller batches (e.g. only memchr). |
I ran the benchmarks locally. All
|
It seems like LibCST hits some bad case for aho-corasick. For comparison, the where the Regex construction "only" accounts for 3% of the overall time. The easiest fix is to change LibCST to either cache the Regex in CC: @BurntSushi you might be interested in the regression. It seems that building a |
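A minimal sketch of the "build once, reuse" idea behind caching the Regex. To stay dependency-free it does not use the `regex` crate: `Matcher` is a hypothetical stand-in for any value that is expensive to construct, and `BUILDS`, `is_match_uncached`, `is_match_cached` and `build_count` are names invented for this illustration, not LibCST or Ruff code.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how many times the stand-in matcher has been constructed.
static BUILDS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a compiled regex (assumption: construction is
/// the expensive part, matching is cheap).
struct Matcher {
    needle: &'static str,
}

impl Matcher {
    fn new(needle: &'static str) -> Matcher {
        // Record every construction so the cost is observable.
        BUILDS.fetch_add(1, Ordering::SeqCst);
        Matcher { needle }
    }

    fn is_match(&self, haystack: &str) -> bool {
        haystack.contains(self.needle)
    }
}

/// Rebuilds the matcher on every call, so construction cost is paid each time.
fn is_match_uncached(haystack: &str) -> bool {
    Matcher::new("\n").is_match(haystack)
}

/// Builds the matcher at most once and reuses it on later calls.
fn is_match_cached(haystack: &str) -> bool {
    static CACHED: OnceLock<Matcher> = OnceLock::new();
    CACHED.get_or_init(|| Matcher::new("\n")).is_match(haystack)
}

/// Number of constructions observed so far.
fn build_count() -> usize {
    BUILDS.load(Ordering::SeqCst)
}
```

Calling `is_match_cached` in a loop increases `build_count()` by at most one, while `is_match_uncached` increases it once per call. With a real regex the same pattern applies: the compiled value would live in the `OnceLock` instead of `Matcher`.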
It seems this is "fixed" in the latest version of LibCST. |
I don't think this is related to the
I also tried to check whether it might be a change in the
It looks like @MichaReiser found that it might be an issue in LibCST? Basically, one possible cause for this is that something changed not in regex compilation but in the frequency of regex compilation. If that happened, you'd expect to see the most expensive parts of regex compilation pop up in a profile. |
Hmm, interesting find. We didn't update LibCST as part of this PR, so it must be a shared dependency, or we now run the LibCST code paths more often. In the latter case, though, I would expect the ecosystem check to fail (because we would have more or fewer diagnostics). Anyway, thanks for chiming in. The best way to narrow it down is probably to upgrade the dependencies incrementally. |
@MichaReiser Aye. Note that there were big changes to |
Woah nice memory improvements. Maybe we should do some memory profiling too ;) @BurntSushi I took the same steps you mentioned above but with the
But then, the main issue is the way LibCST uses/used a Regex (which ideally wouldn't be a regex in the first place) and updating / patching our branch is the way to go. |
It turns out that #121 introduced a relatively sizeable performance regression when building very small automatons. Namely, several of the steps in the construction process took worst case `O(n^2)` time, where `n` corresponds to the alphabet size (255 in this case). This ends up not being too awful when the automaton is big (a lot of patterns), but it adds fairly sizeable overhead in the case of small automatons. We fix this by making these methods take linear time instead. This makes things a little more complicated, and perhaps there is a better abstraction to make this simpler. This was found by Ruff's benchmark suite: astral-sh/ruff#6964
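To make the quadratic-versus-linear difference concrete, here is a simplified sketch. It is not the aho-corasick implementation: `State`, `add_transition`, `init_full`, `quadratic_fill` and `linear_fill` are hypothetical names, and the sorted `Vec` of transitions is an assumed representation chosen only to make the cost visible. The sketch fills all 256 possible byte values.

```rust
/// Hypothetical sparse automaton state: transitions kept sorted by byte.
#[derive(Debug, PartialEq)]
struct State {
    trans: Vec<(u8, u32)>,
}

impl State {
    /// Insert one transition by scanning for its position.
    /// Each call is O(n) in the number of existing transitions.
    /// Returns the number of scan steps taken.
    fn add_transition(&mut self, byte: u8, next: u32) -> usize {
        let mut steps = 0;
        let mut i = 0;
        while i < self.trans.len() && self.trans[i].0 < byte {
            i += 1;
            steps += 1;
        }
        if i < self.trans.len() && self.trans[i].0 == byte {
            self.trans[i].1 = next; // overwrite an existing transition
        } else {
            self.trans.insert(i, (byte, next));
        }
        steps
    }

    /// Set a transition for every byte in a single pass: O(n) overall.
    /// Returns the number of transitions written.
    fn init_full(&mut self, next: u32) -> usize {
        self.trans.clear();
        self.trans.extend((0..=255u8).map(|b| (b, next)));
        self.trans.len()
    }
}

/// Fill a state one byte at a time; the scans add up to O(n^2) steps.
fn quadratic_fill(next: u32) -> (State, usize) {
    let mut state = State { trans: Vec::new() };
    let mut steps = 0;
    for b in 0..=255u8 {
        steps += state.add_transition(b, next);
    }
    (state, steps)
}

/// Fill a state with the single-pass method.
fn linear_fill(next: u32) -> (State, usize) {
    let mut state = State { trans: Vec::new() };
    let steps = state.init_full(next);
    (state, steps)
}
```

Both functions produce the same state, but `quadratic_fill` takes 32640 scan steps (0 + 1 + ... + 255) while `linear_fill` writes 256 entries. For an automaton with only a few patterns, that fixed per-state overhead can dominate construction time, which matches the "small automatons" observation in the commit message.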
Higher level: yeah totally agree that code should just be using To confirm it's fixed, starting from master:
Bump to
And now bump to recently released
Thanks for catching this! |
Oh wow, thank you, and I'm glad we were able to help catch the regression. But I'm sorry that I pushed some extra work onto your shoulders. I should have created a minimized repro benchmark and opened an issue on the aho-corasick repository instead. Thanks again. Edit: Upstream patch for LibCST to remove the Regex use (there are more, but this replaces at least one) |
Force-pushed from 1ffbad8 to 11e1d17.
That |
Ideally we shouldn't have to run `cargo update` manually: it requires us to remember to do so, and it groups all updates into a single pull request, making it challenging to determine which upgrade introduces a regression (e.g. #6964). Here we add daily checks for cargo dependency updates. This pull request also simplifies dependabot configuration for GitHub Actions versions.