1.15 - kubeadm join --control-plane configures kubelet to connect to wrong apiserver #1955
Comments
/assign
although, i was testing something unrelated and doing the above, and the 2 other masters did not become NotReady. this could be an artifact of your older cluster upgrade. in any case you might have to apply your manual fix, as we cannot backport this to < 1.18 releases because it does not match the k8s backport criteria. |
Thanks! I've done some tests with a completely new 1.15 cluster, following the instructions from https://v1-15.docs.kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/ to create the initial master with
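The exact init invocation isn't preserved above; per the linked v1.15 HA guide, creating the initial master with a control-plane endpoint would look roughly like this sketch (the load-balancer address is a placeholder):

```bash
# Illustrative only -- the endpoint and port are placeholders, following the v1.15 HA guide.
kubeadm init --control-plane-endpoint "my-lb.example.com:6443" --upload-certs
```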
All references point to the load balancer on all masters and also on the workers, including the bootstrap-config. I joined one master against the load balancer, and the other against the initial master. All the workers were joined against the initial master. So there must be some state that kubeadm is reading from the initial master to get the address of the load balancer, but it can't be the |
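As discussed further down the thread, the likely source is the cluster-info ConfigMap. A quick, illustrative way to inspect the server address it hands out to joining nodes (not a command from the original report):

```bash
# The cluster-info ConfigMap in kube-public embeds a kubeconfig whose "server:"
# field is what joining nodes pick up during discovery.
kubectl -n kube-public get configmap cluster-info -o jsonpath='{.data.kubeconfig}'
```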
That's OK. What are the criteria for backports? |
I've done a couple more experiments with the first cluster.
|
I managed to figure out where it got the kubelet address from.
I changed the configmap to point to the load balancer address before joining a new master, and this is the result:
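The comment doesn't show the exact edit; assuming the ConfigMap in question is cluster-info (named later in the thread), the change could be made like this sketch (the load-balancer address is a placeholder):

```bash
# Sketch: point the embedded kubeconfig's "server:" field at the load balancer,
# e.g. https://my-lb.example.com:6443 (placeholder), before joining new masters.
kubectl -n kube-public edit configmap cluster-info
```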
It would be nice if kubeadm handled all the little details when migrating from a single-master to a multi-master setup. But maybe that's a new feature instead of a bug. |
yes, this was my assumption.
critical, blocking bugs without known workarounds - go panics, security bugs etc.
i would like more eyes on this problem and we might be able to backport a fix, but no promises. @ereslibre PTAL too. |
Nice! |
Oh wow, thanks for this report @blurpy. I'm going to check if I can reproduce this issue. Thanks for the heads up @neolit123. /assign |
studied the code for a bit, I think it's because:
that being said, the current workaround is to:
after controlPlaneEndpoint has been changed. |
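The concrete steps were trimmed from this comment; judging from the reporter's own steps later in this issue, the workaround presumably amounts to something like the following sketch, run on the initial master after controlPlaneEndpoint has been set in the config file:

```bash
# Reconstructed from the steps in the issue description below -- verify before use.
# Assumes /etc/kubernetes/kubeadm-config.yaml already contains the new
# controlPlaneEndpoint (the load balancer).

# Re-upload the ClusterConfiguration so subsequent joins see the new endpoint:
kubeadm config upload from-file --config /etc/kubernetes/kubeadm-config.yaml

# Regenerate the apiserver serving certificate to include the new endpoint:
rm -rf /etc/kubernetes/pki/apiserver.*
kubeadm init phase certs apiserver --config=/etc/kubernetes/kubeadm-config.yaml
```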
I'm not familiar with the code, but I think we can obtain a somewhat more precise view of the current cluster configuration. I'm willing to draft a PR on this if the above suggestion makes any sense. @ereslibre @neolit123 |
changing the controlPlaneEndpoint is not really something that kubeadm supports. it also means that users need to re-sign certificates. so cluster-info seems like a valid source of truth for the server address, as long as the user knows how to set up their cluster properly for the long term. using an FQDN for the controlPlaneEndpoint, however, is something that is encouraged even for single-control-plane scenarios. https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#considerations-about-apiserver-advertise-address-and-controlplaneendpoint
we should have a better discussion before such a PR. |
and yet this seems very needed. if one fails to notice this and uses the initial master IP address as the controlPlaneEndpoint, there'll be no way to migrate to an HA setup afterwards, at least not in the workflows described in the official documents. the community has already hacked its way around this, although with problems here and there. but I think the tutorial mentioned above, together with the workarounds addressed in this issue, already forms a viable solution for migrating the controlPlaneEndpoint of a cluster. |
with Kubernetes there are at least 100 ways for the cluster operator to shoot themselves in the foot... long term, even.
we might provide a way to "change the cluster" like that using the kubeadm operator, but the work is still experimental: |
consider adding a check somewhere in the |
i'm personally not in favor of adding such a warning, given we have this in the single-cp guide, but others should comment too. |
As far as I can see, mention of the CPE (controlPlaneEndpoint) was added to the docs for 1.16, so I'm guessing there are a lot more clusters out there missing this. There's also the case where you have to change the endpoint address for some reason.
Would it be a lot of work to add a phase for kubeadm to change the CPE (for the shorter term), based on the value in the kubeadm-config-file? It seems like a useful function to have in kubeadm. And then there's no need for all the warnings, because moving to HA is a supported strategy even if you never thought you needed it when you started off. |
that is why one should use a domain name.
moving to HA is only supported if one was using the control-plane endpoint. if they have not used a control-plane endpoint, moving to HA is difficult and i don't think we should add a phase for it. a "phase" in kubeadm terms is something that is part of the standard workflow. |
That's what I was thinking about. Domain names are not always forever.
It is definitely difficult, which is why having help from kubeadm would be very helpful. Do you think the steps described in this issue are not safe for production use?
Could moving between non-HA and HA be a standard workflow? |
not for the init or join phases. but it might make more sense to have this as a guide in the docs without introducing new commands. |
getting back to this.
it's the responsibility of the operator to prepare the right infrastructure and guarantee connectivity in the cluster. as can be seen in the discussion in #338, changing the master IP is not so simple and can cause a number of issues, including in Services and Pod network plugins. it can disrupt workloads, controllers and the cluster operation as a whole. this is not a kubeadm-only problem, but a k8s problem where too many modules can depend on a hardcoded IP.

new users should be really careful when they pick the cluster endpoint! we document this here: picking a domain name is highly advised. running a local network DNS server with a CNAME is one option; another option is to use a mapping.

existing users of single-control-plane clusters with IP endpoints that wish to move to a new IP, or to using a control-plane-endpoint FQDN, can follow the manual guides the k8s community created, but be wary that such a guide may not cover all the details of their clusters.

if someone is willing to work on a documentation PR for the k8s website on this topic, please log a new k/kubeadm issue with your proposal and let's discuss it first. thanks!

/close |
@neolit123: Closing this issue. |
Is this a BUG REPORT or FEATURE REQUEST?
/kind bug
/area HA
Versions
kubeadm version: v1.15.6
Environment: Dev
What happened?
kubelet.conf on additional control-plane nodes created with kubeadm is configured to connect to the apiserver of the initial master, instead of the one on localhost or the one behind the load balancer. As a consequence, all the kubelets become NotReady if the first master is unavailable.
Nodes used in the examples:
This example joins against the load balancer:
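The original command isn't reproduced here; an illustrative control-plane join against the load balancer would look roughly like this (address, token and hash are placeholders):

```bash
# Placeholder values throughout -- not the reporter's actual command.
kubeadm join my-lb.example.com:6443 \
  --token abcdef.0123456789abcdef \
  --discovery-token-ca-cert-hash sha256:<hash> \
  --control-plane
```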
Checking the results:
And this example joins directly against the initial master:
Checking the results:
So in both cases kubelet.conf is configured against the initial master, while admin.conf + controller-manager.conf + scheduler.conf are all configured against the load balancer.
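A simple way to see this on a joined master (illustrative, not the reporter's exact check):

```bash
# Compare the apiserver endpoint each kubeconfig on the node points at.
grep 'server:' /etc/kubernetes/kubelet.conf \
               /etc/kubernetes/admin.conf \
               /etc/kubernetes/controller-manager.conf \
               /etc/kubernetes/scheduler.conf
```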
What you expected to happen?
kubelet.conf should have been configured to use the load balancer or local apiserver:
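For illustration only (placeholder addresses; the expected snippet was not preserved above):

```bash
# Expected kubelet.conf endpoint -- the load balancer:
grep 'server:' /etc/kubernetes/kubelet.conf
#    server: https://my-lb.example.com:6443
# ...or, if pointing kubelets at the local apiserver instead:
#    server: https://<this-node-ip>:6443
```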
I'm not sure what is best practice here. Would it make sense for the kubelet on a master to be ready if the apiserver on localhost is unavailable (if configured to use load balancer)?
How to reproduce it (as minimally and precisely as possible)?
Have not tested using the guide at https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/ step by step, since we need to convert existing single-master clusters to multi-master. So it might work correctly if done according to the instructions on completely new clusters. These steps are how we add more masters to our existing cluster:

1. kubeadm config upload from-file --config /etc/kubernetes/kubeadm-config.yaml
2. rm -rf /etc/kubernetes/pki/apiserver.*
3. kubeadm init phase certs apiserver --config=/etc/kubernetes/kubeadm-config.yaml
4. Join new --control-plane nodes like in the examples further up.

The joined nodes end up with kubelet.conf pointing at the initial master, and become NotReady if that master goes down.

Anything else we need to know?
It's an easy manual fix: just edit the IP address in kubelet.conf. This needs to be done on the workers as well. But since kubeadm already configures the other .conf files correctly on the new masters, it seems reasonable to expect kubelet.conf to be configured correctly too. Or maybe there is some parameter I'm missing somewhere to get it right.
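A hedged sketch of that manual fix (addresses are placeholders; check the file first, and note the discussion above cautions that changing endpoints can have wider side effects):

```bash
# On each affected node (extra masters and workers), point kubelet.conf at the
# load balancer instead of the initial master, then restart the kubelet.
sed -i 's#https://10.0.0.1:6443#https://my-lb.example.com:6443#' /etc/kubernetes/kubelet.conf
systemctl restart kubelet
```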