Safeguard against link overflows in ConcurrentHashMap #2107
Open · instw wants to merge 2 commits into facebook:main from instw:export-D51647789
Conversation
Summary:

PROBLEM

During reclamation of tagged objects, we hold a lock on the hazard pointer domain's tagged list. The list always supports pushing objects, but popping is only allowed while the lock (for `tagged_`) is held. There is an opportunity to improve reclamation performance by holding the `tagged_` list lock for a shorter time.

SOLUTION

The list is locked to prevent multiple threads from reclaiming the same objects. During reclamation, we take all objects that are candidates for cleanup and divide them into 2 groups: those that must be reclaimed and those that must not. The latter are pushed back onto the `tagged_` list, and the list is unlocked. The objects to be reclaimed are then freed by pushing them to the cohort. Since the lock only guards the `tagged_` list itself, once we push the not-to-be-reclaimed elements back we can release the lock without waiting for the reclaimed objects to finish being processed. Also validated that the `nomatch` handling is thread-safe: the cohort's `push_safe_obj` uses a CAS loop to safely add to the top of the list.

Differential Revision: D51638674
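A minimal sketch of this reduced lock scope, using hypothetical names (`Obj`, `reclaimTagged`) and a plain mutex and vector in place of folly's lock-free structures; the real implementation differs in detail:

```cpp
#include <mutex>
#include <vector>

// Hypothetical stand-in for a retired hazptr object; not folly's real type.
struct Obj {
  bool protectedByReader = false;
  bool hasActiveHazptrs() const { return protectedByReader; }
};

std::mutex taggedLock;      // guards pops from tagged_ (pushes are lock-free in folly)
std::vector<Obj*> tagged_;  // retired-but-not-yet-reclaimed tagged objects

void reclaimTagged() {
  // Take every candidate while holding the lock, then release it early.
  std::vector<Obj*> candidates;
  {
    std::lock_guard<std::mutex> g(taggedLock);
    candidates.swap(tagged_);
  }

  // Partition: objects still protected go back; the rest are reclaimed.
  std::vector<Obj*> keep, reclaim;
  for (Obj* o : candidates) {
    (o->hasActiveHazptrs() ? keep : reclaim).push_back(o);
  }

  // Push survivors back and unlock before freeing anything, so the lock
  // is held only while the tagged_ list itself is being mutated.
  {
    std::lock_guard<std::mutex> g(taggedLock);
    tagged_.insert(tagged_.end(), keep.begin(), keep.end());
  }

  for (Obj* o : reclaim) {
    delete o;  // in folly this instead pushes to the cohort's safe list
  }
}
```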
Summary:

PROBLEM

Folly's ConcurrentHashMap uses hazard pointers to ensure that map entries which were recently removed (via `erase`, `insert_or_assign`, etc.) aren't cleaned up while there are still readers of those objects. Instead, they are removed by a reclamation process that typically runs asynchronously. Moreover, within ConcurrentHashMap, entries are linked to one another, and this linkage must be visible to the hazard pointer logic so that we don't clean up an object that has no direct hazard pointers itself but is referenced by another object that might. That logic lives in `HazptrObjLinked`.

Under high contention (see facebook#2097), the link counting logic can overflow because a single object accumulates too many dangling links. For example, consider a map with 2 entries that share a hash code: `(A,0)` and `(B,0)`. Assume `A` is stored before `B` internally within the `ConcurrentHashMap`'s `BucketTable`, so `B` initially records 1 link (to `A`). Now suppose we replace `(A,0)` with `(A,1)`. While `(A,0)` is erased from the `ConcurrentHashMap`, it is not immediately reclaimed/deleted, so during this interim `B` has a link count of 2, one for each copy of `A`. The link count is stored as a 16-bit unsigned integer, and every further replacement that outpaces reclamation adds another link, so if these operations happen quickly enough, `B`'s link count overflows past 65535 and wraps around.

This situation is caught in debug builds (by a `DCHECK`), but in opt builds it results in bad retirements. For example, if `B`'s link count goes past 65535 to 65537 (i.e. a stored value of `1`), then when one copy of `A` is reclaimed, `B`'s link count decrements from `1` to `0`, causing `B` to be incorrectly retired. If we then remove all remaining copies of `A`, the link count underflows from `0` back to `65535` and eventually back to `0`, causing a double retirement - a sign of corruption. (A standalone sketch of this wraparound follows below.)

SOLUTION

While the situation is rare, it can arise for skewed data under heavy contention. There are 3 options to "solve" this:

1. Increase the link count field from 16 bits to something larger. Simple, but a work-around: with enough contention the same bug would eventually show up there as well.
2. Crash the process under very high contention. This maintains the current performance guarantees, and makes it a fatal error when ConcurrentHashMap cannot meet them.
3. Slow ConcurrentHashMap erasures under high contention (this diff). Very high contention causes ConcurrentHashMap to slow down, giving reclamation time to act. Functionally `ConcurrentHashMap` remains the same, but it exhibits different performance characteristics.

In this change, the `HazptrObjLinked` code is changed to disallow overflows, since they lead to corruption, and callers become responsible for handling the case where a link cannot be created. `ConcurrentHashMap` keeps retrying until it can acquire a link, which means erasures under high contention are lock-free but not wait-free. For reclamation, there are buffers within the cohort to store both retired objects (aka the `list`) and reclaimed objects (aka the `safe list`).
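To make the wraparound concrete, here is a tiny standalone illustration of the 16-bit arithmetic described above; it is plain C++, not folly code:

```cpp
#include <cstdint>
#include <cstdio>

int main() {
  uint16_t linkCount = 65535;  // B already holds the maximum 16-bit link count
  linkCount += 2;              // two more outstanding copies of A: wraps to 1
  std::printf("after overflow:  %u\n", static_cast<unsigned>(linkCount));  // 1
  linkCount -= 1;              // one copy of A is reclaimed
  std::printf("after decrement: %u\n", static_cast<unsigned>(linkCount));  // 0: B looks retirable
  linkCount -= 1;              // reclaiming another copy wraps backwards
  std::printf("after underflow: %u\n", static_cast<unsigned>(linkCount));  // 65535
}
```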
In cases where `ConcurrentHashMap` is unable to acquire a link, it is imperative that it try to initiate a reclamation cycle in order to make progress. I therefore added a `cleanup()` method to the cohort that flushes any existing retired objects to the hazard pointer domain for retirement evaluation, kicks off a reclamation cycle, and also retires any objects pending within the cohort. (A sketch of this retry loop follows below.)

Differential Revision: D51647789
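A sketch of the overflow-refusing link acquisition and the retry loop described above, with hypothetical names (`tryAcquireLink`, `Cohort::cleanup()`); this is an illustration of the pattern under stated assumptions, not folly's exact API:

```cpp
#include <atomic>
#include <cstdint>

constexpr uint16_t kMaxLinks = 0xFFFF;

// Try to add one link without ever wrapping. Returns false at saturation;
// the caller must trigger reclamation and retry.
bool tryAcquireLink(std::atomic<uint16_t>& linkCount) {
  uint16_t cur = linkCount.load(std::memory_order_relaxed);
  while (cur < kMaxLinks) {
    if (linkCount.compare_exchange_weak(
            cur,
            static_cast<uint16_t>(cur + 1),
            std::memory_order_acq_rel,
            std::memory_order_relaxed)) {
      return true;  // link acquired
    }
    // On failure, compare_exchange_weak reloaded cur; re-check the bound.
  }
  return false;  // creating another link would overflow
}

// Erase-path caller, mirroring the behavior described above: lock-free
// but not wait-free under extreme contention. Cohort::cleanup() stands in
// for the method this diff adds.
template <typename Cohort>
void acquireLinkOrWait(std::atomic<uint16_t>& linkCount, Cohort& cohort) {
  while (!tryAcquireLink(linkCount)) {
    cohort.cleanup();  // flush retirees, kick reclamation, free safe objects
  }
}
```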
This pull request was exported from Phabricator. Differential Revision: D51647789
CC @magedm - Does this look like a reasonable fix?
I ran this against our benchmark suite in which we initially observed the bug (ParlayHash) and it does indeed appear to be fixed! Thanks for working on it. |
facebook-github-bot pushed a commit that referenced this pull request on Dec 7, 2023:
Summary:

PROBLEM

For linked objects, the ref count and link count are stored in 16-bit integers. It is easily possible to overflow either count.

SOLUTION

Store the counts in 32 bits instead. While this theoretically increases the footprint of the `hazptr_obj_base_linked` struct from 28 bytes to 32 bytes, the increase is small; in many cases struct padding would cause `hazptr_obj_base_linked` to be 32 bytes anyway. (A sketch of the size arithmetic follows below.)

Note: This is a mitigation for #2097. Theoretically it is possible to increase contention enough to make the issue recur, but in practice that is hard to do. See #2107 for a potential fix to the underlying issue.

Reviewed By: ot

Differential Revision: D51829892

fbshipit-source-id: f3ef8f7cf245dd7ff0e1ba6f6ee7bb15ead532ef
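A small illustration of the padding argument, assuming a layout roughly like a polymorphic object with a vtable pointer and two intrusive pointers ahead of the counts; these stand-in structs and field names are assumptions, not folly's actual classes:

```cpp
#include <cstdint>
#include <cstdio>

struct LinkedObj16 {
  void* vptr;          // the real class is polymorphic
  void* next;          // intrusive list pointers
  void* cohortTag;
  uint16_t refCount;   // 24 bytes of pointers + 4 bytes of counts = 28 bytes,
  uint16_t linkCount;  // padded up to 32 for 8-byte alignment
};

struct LinkedObj32 {
  void* vptr;
  void* next;
  void* cohortTag;
  uint32_t refCount;   // widened counts: 24 + 8 = 32 bytes, no padding needed
  uint32_t linkCount;
};

int main() {
  // On a typical LP64 ABI both lines print 32: widening is effectively free.
  std::printf("16-bit counts: %zu bytes\n", sizeof(LinkedObj16));
  std::printf("32-bit counts: %zu bytes\n", sizeof(LinkedObj32));
}
```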
facebook-github-bot pushed a commit to facebook/hhvm that referenced this pull request on Dec 7, 2023 (the same D51829892 change, mirrored into facebook/hhvm).