Skip to content

Conversation

@rich7420
Copy link
Contributor

Description

In watcher.go, We read from ResultChan() without checking if it’s closed. If the watch ends, we may get a zero-value event and crash later (flaky tests). esource_reservation.go, We were reading from ResultChan() without an ok check and then blindly casting event.Object to *v1.Pod. If the watch closes or sends an unexpected object, the binder can panic.

Related Issues

Fixes #

Checklist

Note: Ensure your PR title follows the Conventional Commits format (e.g., feat(scheduler): add new feature)

  • Self-reviewed
  • Added/updated tests (if needed)
  • Updated documentation (if needed)

Breaking Changes

Additional Notes

@rich7420
Copy link
Contributor Author

cc @itsomri

Copy link
Collaborator

@enoodle enoodle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I have one question.

Comment on lines 478 to 483
if !ok || pod == nil {
logger.Error(fmt.Errorf("unexpected watch event object type"),
"Unexpected watch event object while waiting for GPU reservation pod allocation",
"nodeName", nodeName, "name", gpuReservationPodName, "eventType", event.Type, "object", fmt.Sprintf("%T", event.Object))
continue
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this actually happen? I think 100% not because of the watcher initialization, so we can remove this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh you're right. thanks for the review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants