Skip to content

Conversation

@Yicheng-Lu-llll
Copy link
Member

@Yicheng-Lu-llll Yicheng-Lu-llll commented Dec 9, 2025

Description

Problem Statement

Currently, the scheduler’s node feasibility and availability checks are inconsistent with the actual resource allocation logic. The scheduler reasons only about aggregated GPU capacity per node, while the allocator(local) enforces constraints based on the per-GPU topology.

For example, consider a node with two GPUs, each with 0.2 GPU remaining. The scheduler observes 0.4 GPU available in total and concludes that a actor requesting 0.4 GPU can be placed on this node. However, the allocator(local) rejects the request because no single GPU has 0.4 GPU available.

what this PR does

The high-level goal of this PR is to make node feasibility and availability checks consistent between the scheduler and the resource allocator.

Although the detailed design is still a work in progress and need big refactor, the first step is to make the scheduler’s node feasibility and availability checks itself consistent and centralized.

Right now, Ray has three scheduling paths:

  • Normal task scheduling
  • Normal actor scheduling
  • Placement group
    • Placement Group reservation(scheduling bundle)
    • Task/Actor with Placement Group

Tasks and actors essentially share the same scheduling path and use the same node feasibility and availability check function. Placement group scheduling, however, implements its own logic in certain path, even though it is conceptually the same.

Since we may override or extend the node feasibility and availability checks in later PRs, it is better to first ensure that all scheduling paths use a single, shared implementation of this logic.

This PR addresses that problem.

Related issues

Related to #52133 #54729

Additional information

Here I list all the cases that make sure we are relying on the same node feasibility and availability checking func. Later we can just focusing on changing the func and underlying data structure:

Normal task/actor scheduling:

Placement Group reservation(scheduling bundle):

Task/Actor with Placement Group:

@Yicheng-Lu-llll Yicheng-Lu-llll requested a review from a team as a code owner December 9, 2025 00:10
// Check if the node has required labels before scoring on the resources.
const auto &label_selector = required_resources.GetLabelSelector();
if (!node_resources.HasRequiredLabels(label_selector)) {
if (!node_resources.IsAvailable(required_resources)) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IsAvailable already checks the required labels.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request aims to unify node feasibility and availability checking in the scheduler. The changes in scorer.cc centralize the availability check by using node_resources.IsAvailable() and removing redundant checks within the LeastResourceScorer. This is a good refactoring that simplifies the code. My feedback includes a suggestion to add an assertion to enforce a critical precondition, which will make the code more robust.

double LeastResourceScorer::Calculate(const FixedPoint &requested,
const FixedPoint &available) {
RAY_CHECK(available >= 0) << "Available resource " << available.Double()
<< " should be nonnegative.";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The removal of the requested > available check relies on node_resources.IsAvailable() being correct. To make the code more robust against potential bugs in the availability check, it's safer to add a RAY_CHECK(requested <= available). This enforces the assumption that Calculate is only called for feasible requests and will help catch issues early.

Suggested change
<< " should be nonnegative.";
<< " should be nonnegative.";
RAY_CHECK(requested <= available)
<< "Requested resource " << requested.Double()
<< " should not be greater than available resource " << available.Double()
<< ". This indicates a bug in the feasibility check.";

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need. From a design perspective, the logic for determining whether a node is schedulable and the logic for choosing the best node should be completely separate and follow a strict order.

@MengjinYan MengjinYan self-assigned this Dec 9, 2025
@ray-gardener ray-gardener bot added the core Issues that should be addressed in Ray Core label Dec 9, 2025
@Yicheng-Lu-llll Yicheng-Lu-llll added the go add ONLY when ready to merge, run all tests label Dec 9, 2025
@edoakes
Copy link
Collaborator

edoakes commented Dec 9, 2025

@ZacAttack @Sparks0219 PTAL too

return node_score;
}

// This function assumes the resource request has already passed the availability check
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can you add the this also to the comment of the function in the .h file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added, Thank you!

return -1.;
}
node_score += score;
node_score += Calculate(request_resource, node_available_resource);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my understanding of the code:

  • Previously: we first checks the labels, then potentially update the available resources, the checks whether there are available resources in the Calculate function
  • Now: we first checks both the labels and the the available resources, then potentially update the available resources, the calculate the score assuming the node has enough resources for the request

I might miss something but looks like with the new implementation, there could be case where the node will have available resources before the resource update but not after which will make the new implementation not consistent with the previous implementation, is that right?

Copy link
Member Author

@Yicheng-Lu-llll Yicheng-Lu-llll Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about that — I should have mentioned this!

IsAvailable already handles the normal_task_resources subtraction internally:

if (!this->normal_task_resources.IsEmpty()) {
auto available_resources = this->available;
available_resources -= this->normal_task_resources;
return available_resources >= resource_request.GetResourceSet();
}
return this->available >= resource_request.GetResourceSet();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. I missed that. Thanks for the context!

The current implementation is a bit confusion, as part of the project of improving the resource tracking effort, we should make it cleaner.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I will look deep into it and draft an follow up pr if needed!

Signed-off-by: yicheng <[email protected]>
Copy link
Contributor

@MengjinYan MengjinYan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Issues that should be addressed in Ray Core go add ONLY when ready to merge, run all tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants