[perf] WASM hashing function #3241
base: main
Conversation
800 bytes is still a sizeable % of the overall bundle size IIRC. I also don't think we can assume that WebAssembly is available on the host platform, so we'd have to include both versions to avoid breaking changes. cc @emmatown
For more precise numbers: XXH is 519 bytes of binary and murmur2 is 386 bytes of binary, so totals of 800 and 650 after wrapping & base64 are a good figure.
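(For reference, the arithmetic behind those totals, assuming plain base64 with no compression: base64 encodes 3 bytes into 4 characters, so 519 bytes of binary becomes roughly 519 × 4/3 ≈ 692 characters and 386 bytes becomes roughly 515 characters; adding a small JS wrapper around each lands close to the 800 and 650 figures quoted above.)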
I have improvements for the JS implementation, but I have split them into #3242.
How much faster is this WASM version than the improved JS function you prepared? Also, what would be the numbers for a JS version of xxHash?
The performance numbers are all in the top comment of that PR, but roughly 2x faster than the improved JavaScript implementation, which is itself roughly 2x faster than the current one. I don't know for a JS xxHash: the only existing port I found, https://github.com/pierrec/js-xxhash, is written in an OOP style, so that won't fly for a performance improvement. I took a look at the source code, but it's much more dense than murmur2, so I haven't tried porting it yet.

On the same topic, would you be open to switching the algorithm if it increases the collision probability (reasonably)? I've been wondering if I could find a simpler algo that maintains a good enough collision rate. I'm not familiar with your codebase; how do you handle collisions?
Before considering the WASM version, I'd love to see what could be squeezed out of it using a performant JS port of xxHash.
Potentially yes.
We pray they won't happen.
I've found a performant port of xxh but it can't beat murmur2 in javascript:
Having looked at the implementation of a few hashing functions, I think it's going to be hard to find anything faster than murmur2 in JS. The reason why xxh can be faster than murmur2 in WASM is that its algorithm is easy to optimize & auto-vectorize, so it can use better instructions than murmur2 can, in particular SIMD instructions that can only be accessed via WASM in the browser.
Follow-up of @emmatown's comment in the other PR: the algo needs to be stable in the case of server-rendering hydration, which means either murmur2 or xxh in both versions. You have all the numbers here; please let me know if/how you want to proceed. Both algos are neck and neck in JS, but xxh is substantially faster in WASM, so I would recommend switching to that one.
Goober seems to use a really simple hash algorithm: https://github.com/cristianbote/goober/blob/master/src/core/to-hash.js, and it seems to be really fast on the server: https://twind.dev/handbook/introduction.html#opportunities. @cristianbote has been iterating on it for some time. But maybe it has too high a collision probability.
I have been spending way too much time researching hashing functions, so here is some data.

Performance

First, here are the performance results across engines. So what I said above ("I've found a performant port of xxh but it can't beat murmur2 in javascript") is false: xxh can beat murmur2 in javascript depending on the dataset. Anyway, this still supports my previous recommendation to switch to xxh. The results on SpiderMonkey are however puzzling to me, but tbh the engine seems to have weird performance cliffs in many situations; I've given up on trying to understand why.

Collision rate

I have plotted the distribution of hashes for the "Numbers" dataset as described in this stackoverflow answer, for the different functions. Nothing to add here, the plots speak for themselves. So the summary is: goober is not good for us.

asm.js annotations

@cristianbote Quick win for goober though: you can add asm.js annotations to speed it up, JSC seems to need those:

```js
// original
while (i < str.length) out = (101 * out + str.charCodeAt(i++)) >>> 0;

// asm.js
while (i < str.length) out = (Math.imul(101, out) + str.charCodeAt(i++) | 0) >>> 0;
```

Links

All benchmarking code here: https://github.com/romgrk/js-benchmark

```sh
# benchmarks
node ./index.js ./benchmarks/hashing.js

# plots
node ./hash-testing.js
```
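To make the asm.js-style annotation above easy to try, here is a minimal, self-contained sketch that wraps both loop variants into functions and times them; the seed value, function names, and input string are illustrative, not goober's actual code. Both variants produce identical 32-bit results, since `Math.imul` plus `| 0` only changes how the intermediate value is represented, not the value modulo 2^32.

```js
// Goober-style string hash in two variants: the original float multiply and
// the asm.js-annotated version (Math.imul + | 0). The annotations help the
// JIT keep the accumulator in int32 range instead of falling back to doubles.

function hashOriginal(str) {
  let out = 11, i = 0; // 11 is an arbitrary seed for this sketch
  while (i < str.length) out = (101 * out + str.charCodeAt(i++)) >>> 0;
  return out;
}

function hashImul(str) {
  let out = 11, i = 0;
  while (i < str.length) out = (Math.imul(101, out) + str.charCodeAt(i++) | 0) >>> 0;
  return out;
}

// Rough timing only; use a real benchmark harness for trustworthy numbers.
const input = 'color:hotpink;padding:4px 8px;margin:0;'.repeat(20);
console.log(hashOriginal(input) === hashImul(input)); // true

for (const [name, fn] of [['original', hashOriginal], ['imul', hashImul]]) {
  const start = performance.now();
  for (let n = 0; n < 1e5; n++) fn(input);
  console.log(name, (performance.now() - start).toFixed(1), 'ms');
}
```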
Funny thought I had, but I took the fast hex formatting function from mui/material-ui#43289 and replaced the final
wow, nice! Love these insights @romgrk. Took them for a spin locally on my end and I can confirm your results. Don't wanna side-track the conversation here about
FYI, I'm on PTO this week. I can continue the conversation when I get back.
I work at MUI and I've been doing some performance optimization. I've noticed that `murmur2` is at the top of the stack traces for the test case benchmark I'm using, so I wanted to open a PR here to discuss options to improve the performance.

I've extracted some actual style strings that are hashed at runtime by emotion (from the test case above) and created a benchmark to compare emotion's JS `murmur2` implementation to alternatives. I used Rust + WASM to implement those alternatives, because the use case is a good match for the performance benefits of WASM. Here is a summary of the different versions compared to the current one.

Passing strings from JS to WASM implies encoding them into an array buffer, so the performance scales differently with the input string length. The original JS version performs about 25% faster than the rest for small strings (up to a length of around 64 characters), but after that the WASM versions take the lead. The table below shows the results for the strings extracted at runtime described above, which is imho close enough to what happens in production for real-world users.
- `murmur2_original`
- `murmur2_rust()`
- `xxh_rust()`
- `xxh3_rust()`
- `city_hash_rust()`
- `adaptive()`
The fastest option is xxHash, which is newer and faster than murmur2, with a similar collision rate. `adaptive()` is a combination of using `murmur2_original` for small strings and `xxh_rust` for large strings, but the additional branching instruction (`if (string.length > 64) ...`) makes it slower.

Using a WASM module also doesn't need to add any build/bundle-time complexity for users, as we can ship the module inline, something like this. The compiled WASM module is around 500 bytes, so a realistic cost with the wrapper and base64 would be around 800 bytes.
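As a rough illustration of the "ship the module inline" idea (not the actual code in this PR; the base64 payload, export names, and memory layout below are placeholders), instantiating a small embedded module could look something like this:

```js
// Hypothetical inline WASM hasher: the compiled binary is embedded as base64
// and instantiated synchronously at module load time.
const wasmBase64 = 'AGFzbQEAAAA...'; // placeholder, not a real module

function base64ToBytes(b64) {
  // Buffer in Node, atob in browsers.
  if (typeof Buffer !== 'undefined') return new Uint8Array(Buffer.from(b64, 'base64'));
  const bin = atob(b64);
  const bytes = new Uint8Array(bin.length);
  for (let i = 0; i < bin.length; i++) bytes[i] = bin.charCodeAt(i);
  return bytes;
}

// A ~500 byte module is small enough for synchronous compilation; this assumes
// the module needs no imports.
const instance = new WebAssembly.Instance(new WebAssembly.Module(base64ToBytes(wasmBase64)));
// Assumed exports: a linear `memory` and a `hash(ptr, len) -> u32` function.
const { memory, hash } = instance.exports;
const encoder = new TextEncoder();

function hashString(input) {
  // Passing a JS string to WASM means encoding it into the module's memory
  // first; this copy is why short strings stay faster in pure JS.
  const bytes = encoder.encode(input);
  new Uint8Array(memory.buffer, 0, bytes.length).set(bytes); // assumes offset 0 is scratch space
  return hash(0, bytes.length) >>> 0;
}
```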
I have also benchmarked the MUI test case again after replacing the hashing function. The results aren't as impressive as I expected (I was hoping for a 70% reduction of hashing time based on the table above), but this change improves MUI's rendering performance by about 2-4%.
The code in this PR is a draft but I'd be happy to have an early review/opinion before doing more work.