Log detail information when hit insufficient resource #3920

Open
7sunarni opened this issue Dec 24, 2024 · 2 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@7sunarni
Contributor

What is the problem you're trying to solve

Sometimes the cluster doesn't have enough resources to schedule a Pod, and the Pod condition shows a message like this:

...
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: null
    message: '0/3 nodes are unavailable: 1 plugin NodeAffinity predicates failed node(s)
      didn''t match Pod''s node affinity/selector, 2 Insufficient huawei.com/Ascend910.'
    reason: Unschedulable
    status: "False"
    type: PodScheduled
...

The scheduler also logs something like this:

...
Predicates failed for task <default/POD_NAME> on node <NODE_NAME>: task default/POD_NAME on node NODE_NAME fit failed: Insufficient huawei.com/Ascend910
...

To find out how much the Pod asked for and how much the node has left, we then have to look up the Pod's resource requests and limits and the node's allocatable resources ourselves.

I think we could add some of this information to the log or the Pod status so the cause can be found quickly, e.g. change the log like this:

...
Predicates failed for task <default/POD_NAME> on node <NODE_NAME>: task default/POD_NAME on node NODE_NAME fit failed: Insufficient huawei.com/Ascend910, `which request 2 and node has 1`.
...
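
To make the proposal concrete, here is a minimal sketch of how such a message could be built. It is not Volcano's actual code: `ResourceList`, `insufficientDetails`, and `fitFailedMessage` are hypothetical names, quantities are simplified to whole integers, and the wording of the detail is only illustrative.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// ResourceList is a hypothetical stand-in for the scheduler's resource type:
// resource name -> quantity in whole units.
type ResourceList map[string]int64

// insufficientDetails returns one entry per resource whose request exceeds
// what the node has idle. A resource missing from idle counts as 0.
func insufficientDetails(request, idle ResourceList) []string {
	names := make([]string, 0, len(request))
	for name := range request {
		names = append(names, name)
	}
	sort.Strings(names) // keep the log output deterministic

	var details []string
	for _, name := range names {
		want, have := request[name], idle[name]
		if want > have {
			details = append(details,
				fmt.Sprintf("Insufficient %s (requested %d, node has %d)", name, want, have))
		}
	}
	return details
}

// fitFailedMessage formats a message in the style of the scheduler log above,
// or returns "" when every requested resource fits on the node.
func fitFailedMessage(namespace, pod, node string, request, idle ResourceList) string {
	details := insufficientDetails(request, idle)
	if len(details) == 0 {
		return ""
	}
	return fmt.Sprintf("task %s/%s on node %s fit failed: %s",
		namespace, pod, node, strings.Join(details, ", "))
}
```

For example, calling `fitFailedMessage("default", "POD_NAME", "NODE_NAME", ...)` with a request of 2 `huawei.com/Ascend910` against a node with 1 idle would report both numbers in the same line as the failure.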

Describe the solution you'd like

Add more information to the log.

Additional context

No response

@7sunarni 7sunarni added the kind/feature Categorizes issue or PR as related to a new feature. label Dec 24, 2024
@lowang-bh
Member

You can check it in the log.
That detailed information changes dynamically and cannot be put in the status.

@JesseStutler
Member

You can check this log:

klog.V(4).Infof("Considering Task <%v/%v> on node <%v>: <%v> vs. <%v>",
	task.Namespace, task.Name, node.Name, task.Resreq, node.Idle)
