Header name to lower case conversion and validation performance optimization #804

fereidani · 2025-11-27T14:49:23Z

This PR includes two commits:

Improves integer conversion performance in HeaderValue: it eliminates an extra heap allocation by using itoa's stack-allocated buffer instead.
Adds WordRegister for efficient word-sized byte operations: this introduces chunked lowercase conversion and validation of header names. It uses several tricks to reduce the instruction count and enable batch processing, validating an entire chunk with just 3 assembly instructions instead of branching on every byte in every loop iteration.
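
To illustrate the idea behind the first commit, here is a minimal std-only sketch of formatting an integer into a fixed-size stack buffer instead of a heap-allocated String. It is not the itoa crate's API and not the code in this PR; `format_u64` and `MAX_U64_DIGITS` are hypothetical names used only for this example.

```rust
/// Maximum number of decimal digits in a u64 (u64::MAX has 20 digits).
const MAX_U64_DIGITS: usize = 20;

/// Writes `n` in decimal into the tail of `buf` and returns the used part.
/// Hypothetical helper for illustration; no heap allocation is involved.
fn format_u64(mut n: u64, buf: &mut [u8; MAX_U64_DIGITS]) -> &[u8] {
    let mut pos = buf.len();
    loop {
        // Fill the buffer from the end, least significant digit first.
        pos -= 1;
        buf[pos] = b'0' + (n % 10) as u8;
        n /= 10;
        if n == 0 {
            break;
        }
    }
    &buf[pos..]
}
```

A caller would declare `let mut buf = [0u8; MAX_U64_DIGITS];` on the stack and pass `&mut buf`; the returned slice can then be copied directly into the destination bytes.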

Please review the unsafe parts again, and it would be great to test this on a big-endian CPU as well if one is available.
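
For reviewers who want a reference point, the following is a simplified, safe sketch of word-at-a-time lowercasing. It is not the WordRegister implementation from this PR: the names `lower_word` and `lower_ascii_in_place` are hypothetical, and the validity check here only rejects non-ASCII bytes rather than applying the full header-name character table. Every operation stays inside its own byte lane and the word is loaded and stored with native-endian conversions, so this particular sketch should behave the same on little-endian and big-endian targets.

```rust
const WORD: usize = core::mem::size_of::<usize>();
/// 0x01 repeated in every byte lane.
const ONES: usize = usize::MAX / 0xFF;
/// 0x80 repeated in every byte lane.
const HIGH: usize = ONES * 0x80;

/// Lowercases every ASCII uppercase byte in `w` at once.
/// Returns None if any byte is non-ASCII (high bit set).
fn lower_word(w: usize) -> Option<usize> {
    if (w & HIGH) != 0 {
        return None;
    }
    // With all bytes <= 0x7F these additions cannot carry between lanes.
    // High bit of a lane is set iff that byte >= b'A'.
    let ge_a = w + ONES * (0x80 - b'A' as usize);
    // High bit of a lane is set iff that byte > b'Z'.
    let gt_z = w + ONES * (0x80 - b'Z' as usize - 1);
    // 0x80 in exactly the lanes holding b'A'..=b'Z'.
    let is_upper = ge_a & !gt_z & HIGH;
    // 0x80 >> 2 == 0x20, the ASCII case bit.
    Some(w | (is_upper >> 2))
}

/// Lowercases `buf` in place; returns false on the first non-ASCII byte.
fn lower_ascii_in_place(buf: &mut [u8]) -> bool {
    let mut chunks = buf.chunks_exact_mut(WORD);
    for chunk in chunks.by_ref() {
        let mut bytes = [0u8; WORD];
        bytes.copy_from_slice(chunk);
        match lower_word(usize::from_ne_bytes(bytes)) {
            Some(lowered) => chunk.copy_from_slice(&lowered.to_ne_bytes()),
            None => return false,
        }
    }
    // Tail shorter than one word: fall back to byte-by-byte handling.
    for b in chunks.into_remainder() {
        if !b.is_ascii() {
            return false;
        }
        *b = b.to_ascii_lowercase();
    }
    true
}
```

A safe version like this can double as an oracle in tests: running it and the unsafe path on the same inputs, including under an emulated big-endian target, should produce identical output.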

I wrote a benchmark for these changes: https://github.com/fereidani/headernamebench

It's debatable whether this change actually benefits 32-bit systems. If needed, we can gate it on the pointer-size constant so that the optimization is not compiled at all for those targets.
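
One possible way to do that gating is with `cfg(target_pointer_width)`. This is a sketch under assumptions, not what the PR currently does; `use_word_path` is a hypothetical name for whatever predicate selects the chunked path.

```rust
/// On 64-bit targets, take the word-at-a-time path once the input
/// holds at least one full machine word.
#[cfg(target_pointer_width = "64")]
fn use_word_path(len: usize) -> bool {
    len >= core::mem::size_of::<usize>()
}

/// On every other target the chunked path is compiled out and the
/// byte-by-byte fallback is always used.
#[cfg(not(target_pointer_width = "64"))]
fn use_word_path(_len: usize) -> bool {
    false
}
```

Because the selection happens at compile time, targets that do not match pay no runtime cost for the check.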

Here are my results for this benchmark. On typical workloads the optimized version takes roughly 50% less time (about 47% less on the valid case and 56% less on the invalid case). For very small headers (like Host), where the optimization does not apply, there is only a negligible slowdown of about 3%:

header_to_lower_vs_optimized/header_to_lower_valid
                        time:   [1.3416 µs 1.3446 µs 1.3483 µs]
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild
header_to_lower_vs_optimized/header_to_lower_optimized_valid
                        time:   [717.29 ns 718.05 ns 718.81 ns]
Found 14 outliers among 100 measurements (14.00%)
  3 (3.00%) low severe
  5 (5.00%) low mild
  1 (1.00%) high mild
  5 (5.00%) high severe
header_to_lower_vs_optimized/header_to_lower_invalid
                        time:   [575.18 ns 579.05 ns 584.14 ns]
header_to_lower_vs_optimized/header_to_lower_optimized_invalid
                        time:   [254.98 ns 255.65 ns 256.38 ns]
Found 9 outliers among 100 measurements (9.00%)
  4 (4.00%) high mild
  5 (5.00%) high severe
header_to_lower_vs_optimized/header_to_lower_host
                        time:   [28.722 ns 28.789 ns 28.856 ns]
Found 12 outliers among 100 measurements (12.00%)
  1 (1.00%) low severe
  6 (6.00%) low mild
  4 (4.00%) high mild
  1 (1.00%) high severe
header_to_lower_vs_optimized/header_to_lower_optimized_host
                        time:   [29.522 ns 29.600 ns 29.672 ns]

fereidani added 2 commits November 26, 2025 18:35

refactor(header): improve integer conversion performance in HeaderValue

ec11180

perf: add WordRegister for efficient word-sized byte operations

70a0157
