
Tracking issue for changing the cluster (reconf) #970

Closed
neolit123 opened this issue Jul 4, 2018 · 18 comments
Labels
area/UX, kind/feature, kind/tracking-issue, lifecycle/frozen, priority/important-longterm

Comments

@neolit123
Member

neolit123 commented Jul 4, 2018

This is the tracking issue for "change the cluster":
The feature request is to provide an easy-to-use UX for users who want to change properties of a running cluster.

existing proposal docs:
TODO

kubeadm operator:
#1698

User story:
#1581

@neolit123 neolit123 added priority/important-soon and area/upgrades labels Jul 4, 2018
@neolit123 neolit123 added this to the v1.12 milestone Jul 4, 2018
@fabriziopandini
Member

In order to address the issue, IMO kubeadm should clearly split updates (changes to the cluster configuration) from upgrades (change of release), by removing any option to change the cluster config during upgrades and creating a separate new kubeadm update / apply action (illustrated after the list below).

Main rationale behind this opinion:

  • the complexity of the upgrade workflow
  • the size of the test matrix for all the supported permutations of type of clusters/change of release/possible changes to the cluster configuration
  • the current test infrastructure
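A purely illustrative sketch of the split being proposed; the kubeadm update subcommand below is hypothetical and does not exist, it is shown only to make the proposed separation concrete:

```bash
# today: release change and config change can be mixed in one step
kubeadm upgrade apply v1.13.0 --config cluster-config.yaml

# proposed split (the "update" subcommand below is hypothetical):
kubeadm upgrade apply v1.13.0                        # changes only the Kubernetes release
kubeadm update apply --config cluster-config.yaml    # applies only configuration changes
```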

@neolit123
Member Author

/kind feature
too late for 1.12, can be addressed in 1.13.

@k8s-ci-robot k8s-ci-robot added the kind/feature label Sep 17, 2018
@neolit123 neolit123 modified the milestones: v1.12, v1.13 Sep 17, 2018
@fabriziopandini
Member

@neolit123 I have a KEP in flight for this

@timothysc timothysc removed this from the v1.13 milestone Oct 30, 2018
@neolit123 neolit123 added priority/important-longterm and removed priority/important-soon labels Jan 3, 2019
@timothysc timothysc added this to the Next milestone Jan 7, 2019
@timothysc timothysc removed their assignment Jan 7, 2019
@timothysc timothysc added the help wanted label Jan 7, 2019
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale label Apr 7, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten and removed lifecycle/stale labels May 7, 2019
@ezzoueidi

Do we still need this?

@neolit123 neolit123 removed help wanted and lifecycle/rotten labels May 25, 2019
@neolit123
Member Author

I've removed the help-wanted label here.
This comment by @fabriziopandini still applies:
#970 (comment)

The way --config for apply works might have to be changed.

This overlaps with the Kustomize ideas:
#1379

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added the lifecycle/frozen label May 25, 2019
@neolit123
Member Author

@fabriziopandini can this be renamed as the ticket for "change the cluster"?
Also, the link to the KEP has changed.

@brightzheng100

What's the current status of this?

I think it's common to update some configs after kubeadm init, but I couldn't find any doc that addresses this properly yet.

I personally expect something like:

kubeadm config update [flags]

or

kubeadm update [flags]

So that we can update any component's config, especially ApiServer and ControllerManager, in a streamlined way. Thank you!

@fabriziopandini
Member

@brightzheng100
see #1698

@brightzheng100

That solution looks complicated, but please do proceed to make it a complete one!

Anyway, I tried it out by simply using the kubeadm upgrade apply command and it worked after some failures and experiments.

A detailed case can be found here -- hope it helps.

@neolit123
Member Author

Anyway, I tried it out by simply using the kubeadm upgrade apply command and it worked after some failures and experiments.

It's really not recommended and I'm trying to deprecate it because of the "failures" part that you mention. It's also really not suited to reconfiguring multi-control-plane setups.

The existing workaround for modifying the cluster is:

  • modify the kubeadm-config ConfigMap with your new values.
  • modify the coredns and kube-proxy ConfigMaps to match the kubeadm-config changes if needed.
  • go to each node and modify your /etc/kubernetes/manifests files.

With a proper SSH setup this is not that complicated as a bash script (roughly sketched below), but it's still not the best UX for new users.
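A rough sketch of that bash-script workaround; it assumes passwordless SSH to the control-plane nodes, and the hostnames are placeholders:

```bash
#!/usr/bin/env bash
# sketch only -- adapt the edits to the actual configuration change

# 1) update the ClusterConfiguration stored in the kubeadm-config ConfigMap
kubectl -n kube-system edit configmap kubeadm-config

# 2) if needed, align the coredns and kube-proxy ConfigMaps with those changes
kubectl -n kube-system edit configmap coredns
kubectl -n kube-system edit configmap kube-proxy

# 3) on each control-plane node, edit the static pod manifests;
#    the kubelet restarts the control-plane pods automatically
for node in cp-1 cp-2 cp-3; do          # placeholder hostnames
  ssh "${node}" 'sudo vi /etc/kubernetes/manifests/kube-apiserver.yaml'
done
```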

@fabriziopandini
Member

fabriziopandini commented Oct 9, 2019

@brightzheng100 thanks for your feedback. UX is a major concern and this is why we are prototyping around this proposal.

@brightzheng100

The existing workaround for modifying the cluster is:

  • modify the kubeadm-config ConfigMap with your new values.

This is not required based on my experiments:
once we drive things using kubeadm-config.yaml, the kubeadm-config ConfigMap is updated accordingly.

  • modify the coredns and kube-proxy ConfigMaps to match the kubeadm-config changes if needed.

I haven't found any reason yet to update these ConfigMaps manually if we just want to enable/disable some features in kube-apiserver.
But I did find that sometimes the coredns pods would go into CrashLoopBackOff.

  • go to each node and modify your /etc/kubernetes/manifests files.

Currently I have a single-master env so haven't tried it out yet, but yup I think we have to sync up these static pods' manifests.

Frankly, building a kubeadm operator sounds a bit like overkill from my perspective (and of course I may be wrong).
Again, I'm expecting a simple command, like this:

kubeadm config update [flags]

@neolit123
Member Author

neolit123 commented Oct 9, 2019

This is not required based on my experiments:
once we drive things using kubeadm-config.yaml, the kubeadm-config ConfigMap is updated accordingly.

Joining new control-plane nodes to the cluster would still need an updated version of the ClusterConfiguration.
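For reference, the ClusterConfiguration used by kubeadm join --control-plane comes from the kubeadm-config ConfigMap, so keeping it in sync looks roughly like this (standard kubectl commands, shown as a sketch):

```bash
# inspect the ClusterConfiguration that new control-plane nodes will pick up
kubectl -n kube-system get configmap kubeadm-config -o yaml

# update it so that future "kubeadm join --control-plane" calls see the change
kubectl -n kube-system edit configmap kubeadm-config
```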

I haven't found any reason yet to update these ConfigMaps manually if we just want to enable/disable some features in kube-apiserver.
But I did find that sometimes the coredns pods would go into CrashLoopBackOff.

Sadly, there are many reasons for the coredns pods to enter a crashloop; the best way is to look at the logs. If nothing works, removing the deployment and re-applying a CNI plugin should fix it.
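A sketch of that debugging path; the k8s-app=kube-dns label selector is what kubeadm's coredns deployment uses, and deleting the pods (rather than the whole deployment) is shown here as a lighter first step:

```bash
# check why coredns is crash-looping
kubectl -n kube-system logs -l k8s-app=kube-dns --tail=100

# lighter first step: delete the pods so the Deployment recreates them
kubectl -n kube-system delete pods -l k8s-app=kube-dns
```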

Currently I have a single-master env so haven't tried it out yet, but yup I think we have to sync up these static pods' manifests.

That is why kubeadm upgrade apply --config is not a good workaround for multi-control-plane scenarios.

Frankly, building a kubeadm operator sounds a bit like overkill from my perspective (and of course I may be wrong).

I agree; for patching control-plane manifests on a single control-plane node you are better off just applying the manual steps instead of using the operator.

kubeadm config update [flags]

A similar approach was discussed, where we execute "a command" on all nodes to apply an upgrade / re-config, but we went for the operator instead because that's a common pattern in k8s.

@haslersn

haslersn commented Sep 15, 2020

For me (on Kubernetes 1.17.11) the following worked.

  1. Edit the ConfigMap with my changes:
     kubectl edit configmap -n kube-system kubeadm-config
  2. On all nodes run:
     sudo kubeadm upgrade apply <version>

where <version> is the current kubeadm version (in my case 1.17.11); a consolidated sketch of the procedure follows below.

Additional information

  • All of my 3 nodes are control plane nodes.
  • I only changed data.ClusterConfiguration.apiServer.extraFlags.
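A consolidated sketch of that procedure, assuming three control-plane nodes reachable over SSH (placeholder hostnames) and kubeadm 1.17.11:

```bash
# 1) edit the ClusterConfiguration stored in the cluster
kubectl -n kube-system edit configmap kubeadm-config

# 2) re-apply the current version on every control-plane node so the
#    static pod manifests are regenerated from the updated configuration
for node in cp-1 cp-2 cp-3; do          # placeholder hostnames
  ssh "${node}" 'sudo kubeadm upgrade apply v1.17.11 -y'   # -y skips the confirmation prompt
done
```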

@neolit123
Member Author

Archiving this ticket. There are no new feature requests to modify a cluster with or without an operator.
kubernetes/website#32764 is serving its purpose, but if there are additional requests I think it is appropriate to open a new discussion topic for individual ones and potentially externalize a kubeadm operator into a separate repository in k-sigs. All of these topics need owners and currently the kubeadm team does not have the bandwidth.
