Skip to content

Conversation

@gfx
Copy link

@gfx gfx commented Jan 31, 2026

This PR adds support for the WebAssembly branch hinting proposal by parsing the metadata.code.branch_hint custom section and using the hints to mark cold blocks during Cranelift code generation.

Implementation details:

  • Parse branch hints in ModuleTranslation from the custom section using wasmparser's KnownCustom::BranchHints reader
  • Store hints as a map from function index to (offset, taken) pairs
  • Add get_branch_hint() helper in FuncEnvironment to look up hints by converting absolute byte offsets to function-relative offsets
  • Apply hints in translate_br_if(): when a branch is marked unlikely (taken=false), mark the branch target as cold; when marked likely (taken=true), mark the fallthrough block as cold

Branch hints are always parsed and applied when present in a module; no configuration flag is required.

A disassembly test is included to verify cold block annotations, but official WebAssembly spec tests for this proposal are not yet available.

Closes #9463

Question: should I add a knob to enable/disable this feature? Currently, this is automatically turned on. I'm not sure I should add an option for it.

This commit adds support for the WebAssembly branch hinting proposal by
parsing the `metadata.code.branch_hint` custom section and using the
hints to mark cold blocks during Cranelift code generation.

Implementation details:

- Parse branch hints in `ModuleTranslation` from the custom section using
  wasmparser's `KnownCustom::BranchHints` reader
- Store hints as a map from function index to (offset, taken) pairs
- Add `get_branch_hint()` helper in `FuncEnvironment` to look up hints
  by converting absolute byte offsets to function-relative offsets
- Apply hints in `translate_br_if()`: when a branch is marked unlikely
  (`taken=false`), mark the branch target as cold; when marked likely
  (`taken=true`), mark the fallthrough block as cold

Branch hints are always parsed and applied when present in a module;
no configuration flag is required.

A disassembly test is included to verify cold block annotations, but
official WebAssembly spec tests for this proposal are not yet available.

Closes bytecodealliance#9463

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@gfx gfx requested review from a team as code owners January 31, 2026 09:44
@gfx gfx requested review from fitzgen and removed request for a team January 31, 2026 09:44
@github-actions github-actions bot added the wasmtime:docs Issues related to Wasmtime's documentation label Jan 31, 2026
Comment on lines 1229 to 1237
self.translation
.branch_hints
.get(&func_index.as_u32())
.and_then(|hints| {
hints
.iter()
.find(|(o, _)| *o as usize == relative_offset)
.map(|(_, taken)| *taken)
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop I think might be good to optimize in terms of translation will only lookup branch hints (I think?) in increasing order of offsets. That means that currently this code is an $O(n^2)$ loop as-implemented. That could be optimized by using a binary search here, but I think this could go one step further and, ideally, store a reference to the raw buffer of input wasm data (in theory) here. For example this could be a .peekable() iterator over the raw wasm itself. Looking up via get_branch_hint would advance the iterator if it's at the matching position.

Regardless I think it'll be needed to remove the quadratic behavior here, and I think ideally it'd be via an iterator-like approach to avoid the logarithmic nature of a binary search.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, you are right.

How about bc62088 for O(n) complexity?

Copy link
Member

@cfallin cfallin Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What Alex is getting at, I think, is that the spec states that the hints are given in PC order. So why not traverse hints in order, taking them when they match as we visit increasing PC offsets? We should be able to that in O(n) time with O(1) space overhead, i.e., not building a HashMap at all.

Copy link
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: should I add a knob to enable/disable this feature? Currently, this is automatically turned on. I'm not sure I should add an option for it.

Yes, I think we should have a config knob for this.

And to follow our rules by the letter, we we shouldn't enable it by default until it is fuzzed (and has been fuzzed for a couple weeks). This would most likely involve adding support to wasm-smith for emitting branch hints.

Change branch hints storage from Vec<(u32, bool)> to HashMap<u32, bool>
for O(1) lookup instead of O(n) linear search per lookup.

The previous implementation had O(n²) complexity when processing many
branch hints since each lookup scanned the entire vector. The HashMap
approach also handles untrusted input robustly without assumptions
about hint ordering in the custom section.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

wasmtime:docs Issues related to Wasmtime's documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement the Wasm branch-hinting proposal

4 participants