@VSadov VSadov commented Dec 9, 2025

Fixes: #121608
Backport of #121887

In rare cases the ThreadPool can leave an enqueued work item with no guarantee that it will execute unless more work items are enqueued. If that work item must run before any further work can be enqueued, the result is a deadlock.
This is a subtle regression introduced by a change to the enqueuer/worker handshake algorithm.

The same pattern is used in 2 other ThreadPool-like internal features in addition to the ThreadPool.
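
To make the dependency concrete, here is a minimal sketch of one shape it could take. It is an illustration only, not the repro from the linked issue: the class name `DependentWorkItemsSketch` and its `Run` method are hypothetical. The first work item enqueues a second one and then blocks until the second has run, so nothing further is enqueued in the meantime. On a pool where an enqueued item is always picked up, `Run` returns `true`; if the second item were stranded, the wait would time out.

```csharp
using System;
using System.Threading;

// Hypothetical illustration of a work item that depends on another
// enqueued work item; not taken from the PR or the linked issue.
public static class DependentWorkItemsSketch
{
    public static bool Run(TimeSpan timeout)
    {
        // Events are intentionally not disposed: on a timeout the
        // callbacks below could still be using them.
        var secondRan = new ManualResetEventSlim(false);
        var firstDone = new ManualResetEventSlim(false);

        ThreadPool.QueueUserWorkItem(_ =>
        {
            // Enqueue the second item, then block until it has executed.
            // While blocked here, this code enqueues nothing else.
            ThreadPool.QueueUserWorkItem(__ => secondRan.Set());
            secondRan.Wait();
            firstDone.Set();
        });

        // Returns false if the first item never completed within the timeout.
        return firstDone.Wait(timeout);
    }
}
```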

Customer Impact

  • Customer reported
  • Found internally

Regression

  • Yes
  • No

Testing

Standard test pass for deterministic regressions.
A targeted stress application that demonstrates the issue.
(Without the fix it hangs within 1-2 minutes; with the fix it runs for 20+ minutes on the same system.)

Risk

Low. This is a revert to the preexisting algorithm for the enqueuer/worker handshake in all 3 places.

@dotnet-policy-service

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

@VSadov VSadov requested a review from stephentoub December 10, 2025 00:09
@VSadov VSadov marked this pull request as ready for review December 10, 2025 00:09
Copilot AI review requested due to automatic review settings December 10, 2025 00:09
@VSadov VSadov added the Servicing-consider Issue for next servicing release review label Dec 10, 2025

Copilot AI left a comment

Pull request overview

This PR backports reliability fixes for the ThreadPool that address a subtle regression in the enqueuer/worker handshake algorithm. The regression could cause deadlocks when an enqueued work item is not guaranteed to execute unless more work items are enqueued. The fix reverts from a complex three-state machine (NotScheduled/Determining/Scheduled) to a simpler two-state flag (0/1) for worker thread coordination.

Key Changes:

  • Simplified the worker thread request mechanism from a three-state enum to a binary flag in three ThreadPool-like components
  • Updated the handshake algorithm to ensure workers clear the outstanding request flag before checking queues, preventing race conditions
  • Ensured that when a worker processes an item, it always requests another worker if more items exist, preventing deadlocks
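
The binary-flag handshake described in these bullets can be sketched roughly as follows. This is a simplified, hypothetical model, not the CoreLib code: the class `HandshakeSketch`, the `WorkerRequests` counter (a stand-in for actually requesting a worker thread), and the use of `ConcurrentQueue<Action>` are assumptions made for illustration. Only the `_hasOutstandingThreadRequest` name comes from the change summary.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical model of the two-state enqueuer/worker handshake.
public sealed class HandshakeSketch
{
    private readonly ConcurrentQueue<Action> _workItems = new ConcurrentQueue<Action>();

    // 0 = no worker request outstanding, 1 = a request is outstanding.
    private int _hasOutstandingThreadRequest;

    // Stand-in for requesting a real worker thread; counts the requests.
    public int WorkerRequests;

    public void Enqueue(Action item)
    {
        _workItems.Enqueue(item);
        EnsureThreadRequested();
    }

    private void EnsureThreadRequested()
    {
        // Only the caller that flips the flag from 0 to 1 requests a worker.
        if (Interlocked.Exchange(ref _hasOutstandingThreadRequest, 1) == 0)
        {
            Interlocked.Increment(ref WorkerRequests);
        }
    }

    // Body of one worker activation. Returns true if an item was executed.
    public bool Dispatch()
    {
        // Clear the flag before checking the queue, so an enqueuer that
        // adds an item after this point will request a worker itself.
        Interlocked.Exchange(ref _hasOutstandingThreadRequest, 0);

        if (!_workItems.TryDequeue(out var item))
        {
            return false;
        }

        // An item was found, so more may exist: request another worker
        // before running this one.
        EnsureThreadRequested();
        item();
        return true;
    }
}
```

In this model, two back-to-back `Enqueue` calls produce a single worker request, and each `Dispatch` that finds an item produces one more, so a queued item always has a request outstanding on its behalf. The cost is an occasional activation that finds the queue empty.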

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/libraries/System.Private.CoreLib/src/System/Threading/ThreadPoolWorkQueue.cs | Reverted ThreadPoolWorkQueue and ThreadPoolTypedWorkItemQueue from three-state QueueProcessingStage enum to simple _hasOutstandingThreadRequest flag; simplified EnsureThreadRequested() and Dispatch() methods; updated worker execution logic to prevent deadlocks |
| src/libraries/System.Net.Sockets/src/System/Net/Sockets/SocketAsyncEngine.Unix.cs | Applied same algorithm revert to SocketAsyncEngine; replaced EventQueueProcessingStage enum with _hasOutstandingThreadRequest flag; updated EnsureWorkerScheduled() and Execute() methods for consistency with ThreadPool changes |

Comments suppressed due to low confidence (3)

src/libraries/System.Net.Sockets/src/System/Net/Sockets/SocketAsyncEngine.Unix.cs:124

  • Incorrect indentation: The closing brace is indented too far. It should align with the if statement on line 119.
            }

src/libraries/System.Net.Sockets/src/System/Net/Sockets/SocketAsyncEngine.Unix.cs:138

  • Incorrect indentation: The opening brace should align with the public keyword on line 137.
        {

src/libraries/System.Private.CoreLib/src/System/Threading/ThreadPoolWorkQueue.cs:1114

  • Incorrect indentation: These lines should be aligned with the else block at the same level. The closing brace on line 1114 should also align with the else keyword on line 1110.
                Unsafe.As<IThreadPoolWorkItem>(workItem).Execute();
            }

if (!_workItems.TryDequeue(out var workItem))
{
// Discount a work item here to avoid counting this queue processing work item
ThreadInt64PersistentCounter.Decrement(

Copilot AI Dec 10, 2025

Incorrect indentation: This line should align with the comment above it, not be indented an extra level.

Suggested change
ThreadInt64PersistentCounter.Decrement(
ThreadInt64PersistentCounter.Decrement(


Labels

area-System.Threading Servicing-consider Issue for next servicing release review
