-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support reservation #62
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work! LGTM overall just a few comments
cmd/scheduler/app/options/options.go
Outdated
@@ -168,6 +168,7 @@ func (o *Options) Flags() (nfs cliflag.NamedFlagSets) { | |||
fs.Float32Var(&o.ComponentConfig.ClientConnection.QPS, "kube-api-qps", o.ComponentConfig.ClientConnection.QPS, "QPS to use while talking with kubernetes apiserver. This parameter overrides the value defined in config file, which is specified in --config.") | |||
fs.Int32Var(&o.ComponentConfig.ClientConnection.Burst, "kube-api-burst", o.ComponentConfig.ClientConnection.Burst, "burst to use while talking with kubernetes apiserver. This parameter overrides the value defined in config file, which is specified in --config.") | |||
|
|||
fs.Int64Var(&o.ComponentConfig.ReservationTimeOutSeconds, "reservation-ttl", o.ComponentConfig.ReservationTimeOutSeconds, "how long resources will be reserved (for resource reservation).") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: move this to #L185-L192 for more fine-grained configuration in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
@@ -4,5 +4,9 @@ name: godel-demo-default | |||
nodes: | |||
- role: control-plane | |||
image: kindest/node:v1.21.1 | |||
- role: worker |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need more workers for demo cluster?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, reservation needs two worker nodes at least, and one single worker node may cause unexpected error.
pkg/binder/utils/reservation.go
Outdated
newPrr := prr.DeepCopy() | ||
// skip pod reservation request is not in reserved status. | ||
if shouldSkipUpdateReservationStatus(prr) { | ||
klog.V(4).InfoS("skip updating pod reservation request, pod reservation request is not in reserved status", "prr", podutil.GetReservationKey(prr)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: skip -> Skip
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
pkg/binder/utils/reservation.go
Outdated
} | ||
newPrr.Status.Phase = schedulingv1a1.ReservationMatched | ||
} else { | ||
// TODO: add to count and set in unavaliable status. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: no need to go on in this situation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
@@ -35,12 +30,17 @@ import ( | |||
"k8s.io/client-go/tools/cache" | |||
"k8s.io/klog/v2" | |||
|
|||
nodev1alpha1 "github.com/kubewharf/godel-scheduler-api/pkg/apis/node/v1alpha1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should packages from other repos (godel-scheduler-api, katalyst-api) be located in above?
@@ -84,4 +85,8 @@ type UnitFrameworkHandle interface { | |||
FindStore(commonstore.StoreName) commonstore.Store | |||
|
|||
GetMaxWaitingDeletionDuration() time.Duration | |||
|
|||
// TODO: cleanup the following function methods. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move theis line to #L87 considering that the GetMaxWaitingDeletionDuration should also be cleand up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
pkg/util/resource_constants.go
Outdated
import v1 "k8s.io/api/core/v1" | ||
|
||
const ( | ||
ResourceBytedanceSocket v1.ResourceName = "bytedance.com/socket" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please double check that whether bytedance.com may be claimed in open source projects
890575a
to
60bfc95
Compare
af5a097
to
da3cc83
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
da3cc83
to
958f98a
Compare
Please make sure that the Godel Controller Manager component has been deployed in the kind cluster along with other scheduler components as control plane components. | ||
|
||
```shell | ||
$ k get po -n godel-system |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kubectl or alias k=kubectl
We should ensure that everyone can run these examples easily.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, replace "k" to "kubectl".
godel.bytedance.com/reservation-ttl=7200 | ||
``` | ||
|
||
For example, if you configure the above annotation on a Pod or Deployment, the scheduler will automatically release the reserved resources after 24 hours if there is no consumer for the reserved resources. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if both of the Deployment and Pod configured this field?
Priority needs to be clearly described.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added some notes about the priority
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
return | ||
} | ||
|
||
fs.StringVar(&o.BindAddress, "address", o.BindAddress, "DEPRECATED: the IP address on which to listen for the --port port (set to 0.0.0.0 for all IPv4 interfaces and :: for all IPv6 interfaces). See --bind-address instead.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to remove those deprecated flags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is best to keep the behavior consistent with other scheduler components, retain the insecure server configuration, and clean it up later.
return nil | ||
} | ||
|
||
func updateDeprecatedInsecureServingOptionsFromAddress(is *apiserveroptions.DeprecatedInsecureServingOptionsWithLoopback, addr string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is unnecessary if deprecated flags are removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is best to keep the behavior consistent with other scheduler components, retain the insecure server configuration, and clean it up later.
As the opinion shows, we need keep these flags.
} | ||
|
||
// DefaultLeaderMigrationOptions returns a LeaderMigrationOptions with default values. | ||
func DefaultLeaderMigrationOptions() *LeaderMigrationOptions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need LeaderMigration. According to design doc https://github.com/kubernetes/enhancements/tree/master/keps/sig-cloud-provider/2436-controller-manager-leader-migration,
Support a migration process for large scale and highly available Kubernetes clusters using the in-tree cloud providers (via kube-controller-manager and kubelet) to their out-of-tree equivalents (via cloud-controller-manager).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, removed it.
} | ||
|
||
// TODO | ||
func (opt *ReservationControllerOptions) Validate() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better to add some validations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No validation requirement of ReservationControllerOptions, so just delete these method.
This document uses a kind cluster as an example to introduce how to enable and use the resource reservation capability of the Godel Scheduler. | ||
|
||
## Prerequisites | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But how to deploy godel controller manager, is there any documentation, or links maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow the docs/features/kind-cluster-setup.md
, users can deploy godel controller manager in kind cluster.
@@ -1,3 +1,19 @@ | |||
/* | |||
Copyright 2023 The Godel Scheduler Authors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LICENSE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. fix it.
No description provided.