Node Archive and Restore #2250
hi @briantopping, kubeadm's primary goal is to be a k8s cluster bootstrapper that creates a minimal viable cluster. with respect to this FR, kubeadm is already overstepping the Unix philosophy in a couple of places:
big changes in kubeadm require a KEP: my initial comments here are the following:
@kubernetes/sig-cluster-lifecycle
Hi @neolit123, thanks for the comprehensive feedback. I agree with your positions here. Taking on inappropriate scope is a good way to break a project. Nobody wants that.

In keeping with your ideas, I believe this FR is in scope, but only because any implementation of the feature outside of kubeadm is still version-locked to changes that occur in kubeadm over time. There is no doubt that an in-tree feature would add to the weight of a release, but users would be sure that for any given release and deployment, disaster recovery operations were dependable. This is also very UNIXy: the external tool that we depend on to know how to back up a kubeadm-generated cluster is actually the kubeadm present on the node. I think this perspective can be validated by your comment above.

In light of your feedback, would you agree the determination of whether "dependable archive functionality is valuable" should take place before a consideration of the packaging (operator, krew, in-tree kubeadm, etc)? Captured as a KEP, this would be a good exercise in filtering broken assumptions, refining requirements, and gauging value. Any one of those three could fail the effort. If the effort gains momentum, we should have a better idea by that point what the packaging should look like and why.
an important aspect here is that directories like /var/lib/kubelet or /var/lib/etcd are not really maintained by kubeadm; they are maintained by the kubelet and etcd. while kubeadm would be closer to knowing what is in there than any external tool, it would face the same issue of requiring maintainers to look at the kubelet/etcd source code to determine "what changed". on the other hand, just always archiving their contents might not be desired.
it feels like supporting detailed backup/archive would require user level configuration - e.g. making it possible to enumerate paths and skip sub-paths. the default paths to be archived become debatable and could trigger a number of change requests by users...
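as a concrete (purely hypothetical) illustration of that kind of user-level configuration, the archival mechanism could reduce to an explicit include list plus excluded sub-paths; the particular paths below are examples only, not defaults kubeadm prescribes:

```bash
# Sketch only: archive an explicit list of paths while skipping selected sub-paths.
# The include/exclude choices are hypothetical examples, not kubeadm defaults.
sudo tar -czf /tmp/node-archive.tar.gz \
  --exclude='/var/lib/kubelet/pods' \
  --exclude='/var/lib/kubelet/plugins' \
  /etc/kubernetes \
  /var/lib/kubelet \
  /var/lib/etcd
```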
i don't think many would argue against the value of backup/restore. i'm more interested in enumerating the locations that would be backed up and restored, the level of customization users will have, and where this tooling would live.
there are multiple levels to KEPs. for example, there is a place for generic sig-cluster-lifecycle KEPs and a separate place for kubeadm-specific KEPs. but like i said, ideally this FR should get some +1 comments from the maintainers before going into KEP form.
I'm really torn about this feature. On one side I understand the user's need for something that takes charge of the whole problem. Other things that make me lean to a -1 for getting this in kubeadm are:
IMO this feature - or more generically an HA/DR plan - should be part of higher-level tools in the stack like Cluster API, kubespray, Kops, because those tools have control of the full stack, thus allowing them to clearly define the scope to account for.
I'm okay with this not being a kubeadm feature and I don't want to waste valuable team time if this is the wrong team for it. Super grateful for the input so far and pleased it's not a solid -1 off the bat. That said, I can't imagine how many people have said "OH FSCK!" as they acknowledged the `kubeadm reset` confirmation prompt.

Anyway, as I thought through the problem, I started to realize that a minimal (non-transitive) archive of the content deleted by `kubeadm reset` was the real need. And that's where the thought process strayed to how the "knobs and dependencies" in kubeadm are very different from other deployers. Kubeadm knows that it installed stacked vs external etcd, a consideration that is important to getting a transactionally stable archive. Kops is going to have some AWS config. Is it good scope for the installers to create that stable archive of minimum viable cluster as normative functionality?

I do appreciate that fully restoring a node is a transitive closure problem, and I didn't mean to imply this archive should do that, at all. It would be impossible to track CRs that create local resources, for instance. It started as above, no more, no less.

Last thing I will add since Kubespray and Kops are mentioned: there's a user base that runs complex clusters on bare metal, and kubeadm is the most reliable tool to do that with. If it feels to the team that it's an MVP reference bootstrapper, community perspective might surprise you. It's really the best thing going and is an essential peer of Kubespray and Kops. So something that they should be doing, kubeadm should arguably also be doing. And I get that might mean the Cluster API should be facilitating that, removing that responsibility from installers instead of having each of them rebuild the same functionality. One way to do that would be to leave ConfigMap objects in scope so the Cluster API could provide this archive-like functionality across all installs.
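One concrete example of that installer-specific knowledge: a kubeadm cluster records its ClusterConfiguration in the kubeadm-config ConfigMap, so a backup tool could at least detect whether it is dealing with stacked or external etcd. A minimal sketch, assuming the default ConfigMap name/namespace and an admin kubeconfig on the node:

```bash
# Sketch: check whether the cluster uses stacked (local) or external etcd.
kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-system \
  get configmap kubeadm-config -o jsonpath='{.data.ClusterConfiguration}' \
  | grep -A2 '^etcd:'
# "local:" in the output implies stacked etcd on the control plane nodes;
# "external:" means etcd lives elsewhere and needs its own backup plan.
```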
@briantopping those are valuable comments, thanks. The first action item is to consider if we can add an additional sanity check on top of kubeadm reset, asking for a second confirmation in case the action is potentially destructive for the cluster (reset of a control plane node).
one might as well call
for the minimal / recommended number of CP nodes in an HA cluster - 3 - removing 1 node falls under the accepted failure tolerance of etcd. thus, for an HA CP it would be easier to recover from a `kubeadm reset` of a single node.

`kubeadm reset` already asks for confirmation:

```
[reset] Are you sure you want to proceed? [y/N]:
```

i don't think we should be adding yet another one.
i wouldn't call executing the following simple:

```
sudo tar -cf ~/k8s-archives/somearchive.tar /etc/kubernetes /var/lib/kubelet /var/lib/etcd
```

it would be the equivalent of a best-effort backup. there are other caveats around this feature.

another problem here is archiving at a time when the kubelet is rotating its client/serving certificates. if the archive captures an expired certificate, then at kubelet restart the kubelet will fail to authn with the api-server, resulting in a requirement for the node to re-bootstrap using new credentials (token/certs), i.e. admin intervention.

these are reasons adding to the argument that this should be a responsibility higher up the stack, while the customizable low-level archival mechanism itself could be anything (e.g. `tar`).
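To illustrate (not dismiss) those caveats, here is a rough sketch of what even a "best effort" node archive would have to do beyond a single tar invocation: check that the kubelet client certificate is not about to expire, and take a consistent etcd snapshot instead of copying the live data directory. This is a sketch only, assuming stacked etcd, default kubeadm paths, and `etcdctl` available on the node; a real tool would also have to handle external etcd, custom paths, and rotation races.

```bash
#!/usr/bin/env bash
# Sketch of a "best effort" node archive; assumes stacked etcd and default kubeadm paths.
set -euo pipefail

ARCHIVE_DIR="${1:-$HOME/k8s-archives}"
mkdir -p "$ARCHIVE_DIR"

# Refuse to archive if the kubelet client certificate expires within 24h,
# to avoid capturing credentials that would already be stale at restore time.
if ! openssl x509 -checkend 86400 -noout \
    -in /var/lib/kubelet/pki/kubelet-client-current.pem; then
  echo "kubelet client certificate is close to expiry; archive would not be restorable" >&2
  exit 1
fi

# Take a consistent etcd snapshot instead of copying the live /var/lib/etcd directory.
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
  snapshot save "$ARCHIVE_DIR/etcd-snapshot.db"

# Best-effort copy of the node-local state kubeadm knows about, plus the snapshot.
tar -czf "$ARCHIVE_DIR/node-$(hostname)-$(date +%Y%m%d%H%M%S).tar.gz" \
  /etc/kubernetes /var/lib/kubelet "$ARCHIVE_DIR/etcd-snapshot.db"
```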
Oh, is that the problem? 😂🤣👍🎉

Sometimes I have trouble parsing whether you are serious or having fun being snarky. I'm trying to help out here. I'm not some dumbshit, and we're both too busy to waste time on why something is impossible. Others who aren't quite as comfortable admitting their failures are probably also running into the problem. I think I know you like to have fun and are way smarter than me on most of this stuff from our last interaction, but I want to be careful. Hopefully that's 'nuf said. 🙏

Your excellent list of issues just bolsters the need for this FR. It's an improvable cluster management experience to have to know all of this to own a kubeadm-based cluster, and it likely changes over time. A lot of k8s users are generalists and have a much wider scope than just k8s knowledge. How can we empower them to work confidently and efficiently on clusters? Crashing clusters is "not efficient" and will push people away when they might have actually been just fine without that one issue...
that is certainly not my intention.
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/remove-lifecycle stale |
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
@fejta-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
FEATURE REQUEST
Preface: There is no substitute for proper backup and restore hygiene.
This feature request is for cluster backup and restore functionality as a part of `kubeadm`. Cluster deployment tools have unique knowledge of their behaviors and files that are both common and unique to a cluster. In disaster recovery situations, time is of the essence, and a faster automated recovery can be very valuable. While file-by-file backups of an OS root are feasible, efficiencies can be gained with cloud-init based images if the archiver can cherry-pick only the application-level files necessary to restore. It is dangerous to expect users will track the changes of deployment layouts over time, and quite simple if they know they can script a tool that will know how to do so.

Implementation is envisioned in three parts:

- `kubeadm archive` generates an archive that could be used by the `restore` functionality. It would generate to a file specified on the command line, a generated filename in `/tmp`, or with storage providers such as S3.
- `kubeadm restore` would restore a node to the condition it was in at the time of an archive generated above.
- `kubeadm reset` would be modified to create an archive by default.

In all cases, the behavior of `archive` and `restore` should match the current behavior of `kubeadm init`/`kubeadm join`:

- An `etcd` snapshot would be included in the archive generated indirectly by `kubeadm reset`. So long as resources set up during the installation did not change (IP addresses, CRI, etc), `kubeadm restore` could return the cluster to its previous state.
- Restoring a control plane node of an HA cluster assumes the peers referenced in `/etc/kubernetes` would allow it. Optional / future logic might recognize those peers are missing or damaged and allow a cluster archive to be hydrated without the peers and to a single-node `etcd` (utilizing the `etcd` snapshot). It is a non-requirement that a modified HA cluster (i.e. `kubectl remove node foo` after `archive` of `foo`) would allow the archived node to re-join.
- `restore` where destination directories already exist should fail without making any changes.

It is important to recognize that node-specific resources must be intact for a restore to be successful. A Local Persistent Volume is an excellent example of this, but it holds true for devices that might be attached by Rook, interface names or addresses, local hostname configuration, etc.
Use cases
- `kubeadm reset` is reversible if archive generation is not explicitly disabled
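For clarity, the envisioned user experience would look roughly like the following. These are hypothetical invocations of the proposed subcommands; neither the subcommands nor the flags exist in kubeadm today:

```bash
# Hypothetical usage of the proposed subcommands (not implemented in kubeadm).
kubeadm archive                                    # archive to a generated filename in /tmp
kubeadm archive --output /backups/node-a.tar.gz    # hypothetical flag: explicit destination
kubeadm restore /backups/node-a.tar.gz             # restore the node to its archived state
kubeadm reset                                      # would now create an archive before wiping state
```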