Create an arena for package names #242

Eh2406 · 2024-07-22T15:39:59Z

We have long known that resolution time is proportional to P::Clone. Most real-world resolution problems contain at least one String in there P. If performance is anywhere on the priority list then P::Clone should not allocate. Thanks to our library being generic this is easy for users to achieve and control. P can be a wrapper around Rc<P> and it does not allocate, or using various interning strategies use &str for even faster clone, or by using hashconsing (or any de-duplicating interning strategy) P can be a wrapper around usize. So for our benchmarks we used P=u32 because it was the simplest type that met our trait pounds OR P=&str because it was similar to what was easily available to a user. Now that we have production users, we see that there P tends to have a more expensive clone. (Rc for the cargo benchmarks, Arc for the uv use case, String for gleam and elm.)

It also turns out that resolution time is proportional to P::Hash. As P ends up being the key in a large number of hash tables. The hottest of these tables tends to be PartialSolution::package_assignments. Only the hashconsing (or its relatives) strategy allows a user to provide very fast P::Hash where P has a string in it. None of our production users are using this approach. So the performance of our internal benchmarks that use &str are far more realistic than the ones that use u32.

Luckily we can provide a hashconsing like wrapper around P for our users. We already have an arena such that if two IDs are equal then the data they point at must be equal, we just need to make the inverse true that if the data is equal then any IDs for that data will be equal. This arena instead of allocating by just pushing new items to a Vec (and returning the new index) would return a previous ID if the value had already been added and do the normal thing otherwise. This does not involve a lot of code by using the indexmap crate, which is already one of our dependencies.

This is a pessimization four benchmarks that use u32, or for any (theoretical) users who are already using hashconsing. Unfortunately rust does not have specialization, and even if it did there is no trait bound for "Hash is cheap". Based on the (hilariously out of date named) large_case_u16_NumberVersion this is a ~9.5% regression, similarly ~10.5% for the synthetic slow_135_0_u16_NumberVersion. A sudoku problem, which is not synthetic but also not the intended use case, sees a similar regression.

Real-world benchmarks see significant improvements. elm_str_SemanticVersion 10.2%, zuse_str_SemanticVersion 19.9%, all of crates.io 14% without lock files and 11% with.

I'd love to hear the impact on uv benchmarks, but I expect them to be in a similar range. Perhaps infinitesimally bigger because they are already collecting this data for their implementation of prioritize which can now just be Id<P>::as_raw().

There is potential follow-up work because Map<Id<P (especially when densely filled) can be replaced with a Vec decreasing the size and removing the calculation of Hash for Id. But it's a lot of code for a much smaller when so I left it out for now.

Unfortunately bunch of log and panic messages now refer to Id(1) instead of the name of that package. Similarly they use Debug instead of display. This is hard to fix because the relevant impls do not have access to the arena in which the actual package name is stored.

charliermarsh · 2024-07-22T15:50:02Z

Awesome idea!

charliermarsh · 2024-07-22T15:50:13Z

We'll try it and report back.

charliermarsh · 2024-07-23T02:34:40Z

My impression is that it gives us a small but consistent speedup, e.g., on a large resolution with a filled cache (so minimal IO):

❯ hyperfine "../target/release/main lock" "../target/release/uv lock" --warmup 100 --runs 100
Benchmark 1: ../target/release/main lock
  Time (mean ± σ):      27.3 ms ±   0.5 ms    [User: 21.3 ms, System: 5.2 ms]
  Range (min … max):    26.5 ms …  30.6 ms    100 runs

Benchmark 2: ../target/release/uv lock
  Time (mean ± σ):      26.7 ms ±   0.3 ms    [User: 20.8 ms, System: 5.1 ms]
  Range (min … max):    26.1 ms …  27.3 ms    100 runs

Summary
  '../target/release/uv lock' ran
    1.02 ± 0.02 times faster than '../target/release/main lock'

Or, with a pre-filled cache but no existing lockfile:

❯ hyperfine "../target/release/main lock" "../target/release/uv lock" --warmup 100 --runs 100 --prepare "rm uv.lock"
Benchmark 1: ../target/release/main lock
  Time (mean ± σ):     157.9 ms ±   4.7 ms    [User: 175.9 ms, System: 176.6 ms]
  Range (min … max):   147.0 ms … 174.1 ms    100 runs

Benchmark 2: ../target/release/uv lock
  Time (mean ± σ):     155.1 ms ±   4.3 ms    [User: 169.5 ms, System: 169.9 ms]
  Range (min … max):   146.6 ms … 169.9 ms    100 runs

Summary
  '../target/release/uv lock' ran
    1.02 ± 0.04 times faster than '../target/release/main lock'

charliermarsh · 2024-07-23T02:38:34Z

I can probably do a bit better with more work on our side to use the IDs everywhere.

Eh2406 · 2024-07-23T14:16:51Z

10% on only resolution micro benchmarks leading to 1% on a end-to-end test makes sense if we are spending about 10% inside resolution code.

charliermarsh · 2024-07-23T17:45:28Z

Yeah, seems like a clear improvement.

codspeed-hq · 2024-11-19T19:02:36Z

CodSpeed Performance Report

Merging #242 will degrade performances by 3.5%

_{Comparing Eh2406:pid_old (824ee77) with dev (4bde6d6)}

Summary

⚡ 3 improvements
❌ 1 regressions
✅ 2 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

	Benchmark	`dev`	`Eh2406:pid_old`	Change
⚡	`backtracking_singletons`	4.3 s	4.2 s	+4.21%
❌	`large_case_u16_NumberVersion.ron`	20.2 ms	21 ms	-3.5%
⚡	`sudoku-easy`	15.9 ms	15.3 ms	+4.32%
⚡	`sudoku-hard`	16.3 ms	15.5 ms	+5.34%

src/internal/arena.rs

konstin

(the perf numbers in uv shouldn't have changed)

x-hgg-x · 2024-11-20T14:42:29Z

Here are the main differences between #274 and this PR:

I used a PackageId type instead of Id<P> to remove the generic type.
I added the package store inside the error type to avoid cloning packages in the derivation tree (breaking change).
I added a display() method for Incompatibility, External and PartialSolution using the package store to get better logs and reports.
I added the package store in the DP methods to avoid hashing packages several times (breaking change).

src/internal/core.rs

Eh2406 · 2024-11-20T20:39:12Z

The display() method is a very good idea. I copied that for Incompatibility, and PartialSolution. Thank you. So far External still holds P so does not need special handling. I audited every place where this PR switch from display to debug, where possible I included both. With the mapping regularly enough reported in the logs, I'm not worried about the remaining cases.

I'm going to merge, we can discuss breaking changes in follow-up PR's.

charliermarsh · 2024-11-26T17:52:50Z

I'll upgrade uv to use this behavior.

Eh2406 force-pushed the pid_old branch from e8c2991 to 110c17d Compare July 22, 2024 15:55

charliermarsh mentioned this pull request Jul 23, 2024

Create an arena for package names astral-sh/pubgrub#29

Draft

Eh2406 mentioned this pull request Oct 22, 2024

Investigate why Solana is so slow. Eh2406/pubgrub-crates-benchmark#6

Open

konstin mentioned this pull request Nov 8, 2024

Improve performance #274

Open

Eh2406 force-pushed the pid_old branch from 110c17d to 59fb38b Compare November 19, 2024 18:55

x-hgg-x reviewed Nov 20, 2024

View reviewed changes

src/internal/arena.rs Outdated Show resolved Hide resolved

konstin approved these changes Nov 20, 2024

View reviewed changes

x-hgg-x reviewed Nov 20, 2024

View reviewed changes

src/internal/core.rs Outdated Show resolved Hide resolved

Eh2406 force-pushed the pid_old branch 2 times, most recently from 1991777 to e44080d Compare November 20, 2024 19:24

x-hgg-x mentioned this pull request Nov 20, 2024

Add experimental u64 version bitset #284

Open

Eh2406 force-pushed the pid_old branch 2 times, most recently from 277c0da to f68ec2d Compare November 20, 2024 20:20

Create an arena for package names

824ee77

Eh2406 force-pushed the pid_old branch from f68ec2d to 824ee77 Compare November 20, 2024 20:30

Eh2406 added this pull request to the merge queue Nov 20, 2024

Merged via the queue into pubgrub-rs:dev with commit 4ac6c42 Nov 20, 2024
5 of 6 checks passed

Eh2406 deleted the pid_old branch November 20, 2024 20:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Create an arena for package names #242

Create an arena for package names #242

Eh2406 commented Jul 22, 2024

charliermarsh commented Jul 22, 2024

charliermarsh commented Jul 22, 2024

charliermarsh commented Jul 23, 2024

charliermarsh commented Jul 23, 2024

Eh2406 commented Jul 23, 2024

charliermarsh commented Jul 23, 2024

codspeed-hq bot commented Nov 19, 2024 •

edited

Loading

konstin left a comment

x-hgg-x commented Nov 20, 2024 •

edited

Loading

Eh2406 commented Nov 20, 2024

charliermarsh commented Nov 26, 2024

Create an arena for package names #242

Create an arena for package names #242

Conversation

Eh2406 commented Jul 22, 2024

charliermarsh commented Jul 22, 2024

charliermarsh commented Jul 22, 2024

charliermarsh commented Jul 23, 2024

charliermarsh commented Jul 23, 2024

Eh2406 commented Jul 23, 2024

charliermarsh commented Jul 23, 2024

codspeed-hq bot commented Nov 19, 2024 • edited Loading

CodSpeed Performance Report

Merging #242 will degrade performances by 3.5%

Summary

Benchmarks breakdown

konstin left a comment

Choose a reason for hiding this comment

x-hgg-x commented Nov 20, 2024 • edited Loading

Eh2406 commented Nov 20, 2024

charliermarsh commented Nov 26, 2024

codspeed-hq bot commented Nov 19, 2024 •

edited

Loading

x-hgg-x commented Nov 20, 2024 •

edited

Loading