opt_merge: hashing performance and correctness #4677

Draft
wants to merge 11 commits into
base: main

Conversation

widlarizer
Collaborator

This is a direct remake of #4175, minus the 64-bit hash value. It uses the interface from #4524, so it depends on that PR and currently contains its commits. Instead of using xorshifts, values are sorted, though a final xorshift is still included as part of the fudge (--hash-seed=N) mechanism.
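To illustrate the general idea of sorting values before hashing, here is a minimal sketch. It is not the code in this PR: `xorshift32`, `hash_unordered`, the multiply-by-33 mixing step, and the way `seed` is folded in are all assumptions made for the example. It only shows that sorting makes the result independent of input order, with one final xorshift applied together with a seed.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Classic 32-bit xorshift step (13, 17, 5). Used here only as a final mixer.
inline std::uint32_t xorshift32(std::uint32_t x) {
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return x;
}

// Hypothetical helper: hash a collection whose order should not matter.
// The values are taken by copy, sorted, then combined sequentially.
// `seed` stands in for a user-supplied hash seed (an assumption of this sketch).
inline std::uint32_t hash_unordered(std::vector<std::uint32_t> values,
                                    std::uint32_t seed) {
    std::sort(values.begin(), values.end());
    std::uint32_t h = 5381;
    for (std::uint32_t v : values)
        h = (h * 33u) ^ v;  // order-dependent mix, made canonical by the sort
    return xorshift32(h ^ seed);  // single final xorshift, perturbed by the seed
}
```

With this sketch, `hash_unordered({3, 1, 2}, 0)` and `hash_unordered({1, 2, 3}, 0)` return the same value, while changing the seed changes the result.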

Additionally, I discovered that opt_merge behaves incorrectly when hashes collide: prior to this PR, a hash collision would inhibit merging. In rare cases this PR might therefore improve quality of results for flows that use opt_merge. I changed the sharemap from a dict<hash_t, Cell*> to an equivalent std::unordered_multimap so that multiple cells can be associated with the same hash. This can't be a separate change, since merely changing how the hashes are constructed exposed the bug and broke the build.
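The effect of moving to a multimap can be sketched as follows. This is a simplified illustration, not the opt_merge code: `Cell` is a hypothetical stand-in for the real cell type, `weak_hash` is deliberately poor so that distinct cells collide, and `equivalent` and `count_merges` are names invented for the example.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a netlist cell; only what the sketch needs.
struct Cell {
    std::string type;
    std::vector<int> inputs;
};

// Deliberately weak hash (type name length) so that distinct cells collide.
inline std::uint32_t weak_hash(const Cell &c) {
    return static_cast<std::uint32_t>(c.type.size());
}

// Full comparison, needed because equal hashes do not imply equal cells.
inline bool equivalent(const Cell &a, const Cell &b) {
    return a.type == b.type && a.inputs == b.inputs;
}

// Counts how many cells could be merged into an earlier equivalent cell.
inline int count_merges(const std::vector<Cell> &cells) {
    std::unordered_multimap<std::uint32_t, const Cell *> sharemap;
    int merged = 0;
    for (const Cell &c : cells) {
        const std::uint32_t h = weak_hash(c);
        bool found = false;
        // Check every cell stored under this hash, not just one of them.
        auto range = sharemap.equal_range(h);
        for (auto it = range.first; it != range.second; ++it) {
            if (equivalent(*it->second, c)) {
                found = true;
                break;
            }
        }
        if (found)
            ++merged;
        else
            sharemap.emplace(h, &c);  // keep colliding, non-equivalent cells too
    }
    return merged;
}
```

For the input `{"$and", {1, 2}}, {"$xor", {1, 2}}, {"$xor", {1, 2}}`, all three cells share a hash under `weak_hash`, yet the second `$xor` is still found and merged. A map that holds only one cell per hash would compare both `$xor` cells against the `$and` cell alone and miss the merge.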

Sorry for the spam to code owners. Because this PR is based on the wide-reaching #4524 mentioned above, I have no way of removing you from the reviewer list. The diff for this PR will not be very readable on GitHub either until that PR is merged.
